spacekit.skopes.hst.cal.train
Spacekit HST “Calibration in the Cloud” (calcloud) Job Resource Allocation Model Training
This script imports and preprocesses job metadata from the Hubble Space Telescope data calibration pipeline. The preprocessed data is then used as input to build, train, and evaluate three neural networks that estimate AWS Batch compute job resource requirements.
The networks include one multi-class classifier and two linear regression estimators. The classifier predicts which of 4 possible memory bin sizes (and therefore compute instance type) is most appropriate for reprocessing a given ipppssoot (i.e. “job”). The wallclock regressor estimates the maximum execution time (“wallclock” or “kill” time) in seconds needed to complete the job.
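To make the classifier's role concrete, the following is a minimal sketch of how a multi-class prediction could be reduced to a single memory bin. It assumes the classifier emits one probability per bin, as is typical for a multi-class network. The function name pick_memory_bin and the example probabilities are hypothetical and are not part of spacekit.

```python
# Illustrative sketch only: the probabilities below are made up and do not
# come from the trained spacekit model.

def pick_memory_bin(probabilities):
    """Return the index (0-3) of the most probable memory bin."""
    if len(probabilities) != 4:
        raise ValueError("expected one probability per memory bin (4 total)")
    # The predicted class is the position of the largest probability (argmax).
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


# Hypothetical classifier output for a single job (one value per bin).
example_probs = [0.05, 0.80, 0.10, 0.05]
predicted_bin = pick_memory_bin(example_probs)
print(predicted_bin)
```

The bin index would then determine the compute instance type, while the regressors each return a single continuous value rather than a class.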
Ex: python -m spacekit.skopes.hst.cal.train data/2021-11-04-1636048291
To load results from disk in a separate session (for plotting, analysis etc):
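    from spacekit.analyzer.compute import ComputeMulti

    # res_path is assumed to be the results directory written by the training run
    bcom2 = ComputeMulti(res_path=f"{res_path}/mem_bin")
    bin_out = bcom2.upload()
    bcom2.load_results(bin_out)
    test_idx = bin_out["test_idx"]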