Hyperparameter tuning on SageMaker - fp16 parameter not responsive

Hi there!

I am currently running hyperparameter tuning on my models using the HyperparameterTuner. However, I can't get the fp16 parameter to vary between trials.

I define the search space with:

from sagemaker.tuner import CategoricalParameter, ContinuousParameter, IntegerParameter

## define hyperparameter space for tuning
hyperparameter_ranges = {
    "epochs": CategoricalParameter([3, 5, 10]),
    "train_batch_size": CategoricalParameter([16, 32, 64]),  # adjust depending on class balance
    "learning_rate": ContinuousParameter(1e-5, 1e-3),
    "warmup_ratio": ContinuousParameter(0, 0.2),
    "weight_decay": ContinuousParameter(0.0, 0.3),
    "fp16": CategoricalParameter([True, False]),
    "seed": IntegerParameter(0, 42),
    "class_weight_factor": ContinuousParameter(0.1, 1.0),
}

And pass that to the tuner.
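For reference, the tuner setup looks roughly like this (the estimator, objective metric, and job counts are simplified placeholders, not my exact configuration):

from sagemaker.tuner import HyperparameterTuner

tuner = HyperparameterTuner(
    estimator=huggingface_estimator,  # placeholder estimator
    objective_metric_name="eval_loss",  # placeholder metric
    hyperparameter_ranges=hyperparameter_ranges,
    metric_definitions=[{"Name": "eval_loss", "Regex": "'eval_loss': ([0-9.]+)"}],
    max_jobs=20,
    max_parallel_jobs=2,
    objective_type="Minimize",
)

tuner.fit({"train": training_input_path, "test": test_input_path})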

While this mostly runs fine, fp16 mixed precision does not actually change across the training jobs, even though the hyperparameter value reported for each job does change.

In the training script, I recover the parameters like this:

parser.add_argument("--fp16", type=bool, default=os.environ["SM_HP_FP16"])
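As far as I understand, SageMaker hands hyperparameters to the training container as strings, both as command-line arguments and as SM_HP_* environment variables, so the raw value can be inspected like this (illustrative snippet, not taken from my actual script):

import os

# SM_HP_FP16 is an environment variable, so its value is always a string,
# e.g. "True" or "False", never a Python bool
print(repr(os.environ.get("SM_HP_FP16")))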

and pass them to the training arguments like this:

from transformers import TrainingArguments

# define training args
core_args = {...}  # remaining TrainingArguments fields, elided here

extra_args = {}

# only set fp16 when the parsed flag is truthy
if args.fp16:
    extra_args["fp16"] = args.fp16

training_args = {**core_args, **extra_args}
training_args = TrainingArguments(**training_args)

which are then passed to the trainer.
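That is, roughly like this (the model and dataset variables are placeholders):

from transformers import Trainer

trainer = Trainer(
    model=model,                  # placeholder model
    args=training_args,
    train_dataset=train_dataset,  # placeholder dataset
    eval_dataset=eval_dataset,    # placeholder dataset
)

trainer.train()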

Any idea what the issue might be? I am out of ideas. Interestingly, it always runs with fp16, even when the tuner sets the value to "False".

Best,
Nico
