Notes on AWS SageMaker hyperparameter tuning jobs

A hyperparameter tuning job in AWS SageMaker is essentially a composition of n training jobs. For each of them, the supervisor passes parameters from a predefined range (for integer and continuous values) or set (for categorical values) and controls execution. For custom algorithms, the parameters are saved into /opt/ml/input/config/hyperparameters.json. All values in it are strings, even if they look like integers or floats, so it is necessary to parse and validate hyperparameters.json to avoid failing a training job with obscure errors right after it starts.
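
A minimal sketch of such defensive parsing, assuming the training container reads the standard SageMaker path. The parameter names (learning_rate, epochs, optimizer) and their valid ranges are hypothetical; substitute the ones your algorithm actually expects:

```python
import json

HYPERPARAMS_PATH = "/opt/ml/input/config/hyperparameters.json"

def load_hyperparameters(path=HYPERPARAMS_PATH):
    """Load SageMaker hyperparameters, casting the string values to real types."""
    with open(path) as f:
        raw = json.load(f)  # e.g. {"learning_rate": "0.01", "epochs": "20"}

    # Every value arrives as a string; cast and validate explicitly so a bad
    # value fails fast with a readable message instead of an obscure one later.
    try:
        params = {
            "learning_rate": float(raw["learning_rate"]),  # hypothetical names
            "epochs": int(raw["epochs"]),
            "optimizer": str(raw.get("optimizer", "adam")),
        }
    except (KeyError, ValueError) as exc:
        raise ValueError(f"Invalid hyperparameters in {path}: {exc}") from exc

    if not 0.0 < params["learning_rate"] < 1.0:
        raise ValueError(f"learning_rate out of range: {params['learning_rate']}")
    if params["epochs"] < 1:
        raise ValueError(f"epochs must be positive: {params['epochs']}")
    return params
```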

Two important parameters for hyperparameter tuning jobs are the number of concurrent training jobs per tuning job and the total number of training jobs per tuning job. For each run, the supervisor chooses parameters based on the information it has gathered so far, and to run a successful search it must have enough observations to build a probability density function over the hyperparameters. A good rule of thumb is to set the number of concurrent training jobs to N and the total number of training jobs to at least 3N (see the sketch below). Otherwise, the result of the tuning job might not be fine-tuned enough to produce the best possible outcome.
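
Here is a sketch of how those two limits map onto the SageMaker Python SDK's HyperparameterTuner via max_parallel_jobs and max_jobs. It assumes an already-configured estimator for the custom training image; the objective metric name, its regex, and the parameter ranges are hypothetical placeholders:

```python
from sagemaker.tuner import (
    HyperparameterTuner,
    ContinuousParameter,
    IntegerParameter,
    CategoricalParameter,
)

max_parallel = 4              # N concurrent training jobs
max_total = 3 * max_parallel  # at least 3N jobs overall, per the rule of thumb

# `estimator` is assumed to be a configured sagemaker.estimator.Estimator.
tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="validation:accuracy",  # hypothetical metric
    metric_definitions=[
        {"Name": "validation:accuracy", "Regex": "val_acc=([0-9\\.]+)"}
    ],
    hyperparameter_ranges={
        "learning_rate": ContinuousParameter(1e-4, 1e-1),  # hypothetical ranges
        "epochs": IntegerParameter(5, 50),
        "optimizer": CategoricalParameter(["adam", "sgd"]),
    },
    max_jobs=max_total,
    max_parallel_jobs=max_parallel,
)
# tuner.fit({"train": "s3://bucket/prefix/train"})  # hypothetical S3 input
```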

Although hyperparameter tuning might not improve the performance of the algorithm, as mentioned in the documentation, it is worth running it systematically. In my experience, hyperparameter tuning can yield up to a 2% improvement in the optimized metric. It is a tiny change, but for highly optimized algorithms the improvement is notable.