I've been trying to perform a simple hyperparameter search on `distilroberta-base`. When using Kaggle notebooks there is a 20 GB limit on outputs, and even when I run only 5 trials the output directory fills up. I am used to GridSearchCV, which only keeps track of the best parameters and retrains at the end, without saving individual models during the search.
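For reference, this is the scikit-learn behaviour I mean: GridSearchCV only records scores and parameters per candidate and (with the default `refit=True`) retrains a single final model at the end, so nothing is written to disk during the search. A minimal toy example:

```python
# GridSearchCV keeps only scores/params per candidate in memory and,
# with refit=True (the default), fits one final model on the best
# parameters at the end -- no per-candidate models are saved to disk.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1.0, 10.0]},
    cv=3,
)
search.fit(X, y)

print(search.best_params_)  # only the winning parameters are kept
```

This is the behaviour I'd like to reproduce with `Trainer.hyperparameter_search`.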
I have tried setting `overwrite_output_dir=True` in `TrainingArguments`, but this doesn't seem to reduce the output at all.
I apologize if this is covered somewhere in the documentation, but I have not been able to find it.
```python
model_training_arguments = TrainingArguments(
    "./model_output",
    evaluation_strategy="epoch",
    fp16=True,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    num_train_epochs=4,
    seed=2,
    load_best_model_at_end=True,
    overwrite_output_dir=True,
)

model_trainer = Trainer(
    model_init=model_init,
    args=model_training_arguments,
    train_dataset=train_dataset,
    eval_dataset=valid_dataset,
    tokenizer=tokenizer,
    compute_metrics=metrics,
)

model_trainer.hyperparameter_search(
    direction="minimize",
    backend="optuna",
    n_trials=5,
)
```
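In case it helps: if I understand the docs correctly, `save_total_limit` caps how many checkpoints are kept in `output_dir` (older ones get deleted), which sounds closer to what I need than `overwrite_output_dir`. An untested sketch of the arguments I'm considering (the specific values are my guesses):

```python
model_training_arguments = TrainingArguments(
    "./model_output",
    evaluation_strategy="epoch",
    save_strategy="epoch",   # checkpointing cadence must match evaluation
                             # when load_best_model_at_end=True
    save_total_limit=1,      # delete older checkpoints, keeping only the
                             # most recent (plus the best, when tracked)
    load_best_model_at_end=True,
    ...
)
```

Can anyone confirm whether this keeps the search within the disk quota, or whether each trial still writes its own checkpoints?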
Thanks in advance!