I've been trying to perform a simple hyperparameter search on `distilroberta-base`. When using Kaggle notebooks there is a 20 GB limit on outputs, and even when I run only 5 trials the output directory fills up. I am used to GridSearchCV, which only keeps track of the best parameters and retrains at the end, without saving individual models during the search.
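For reference, this is the scikit-learn behaviour I mean: GridSearchCV only records scores and parameters per candidate and (with the default `refit=True`) retrains a single final model at the end, so nothing is written to disk during the search. A minimal toy example:

```python
# GridSearchCV keeps only scores/params per candidate in memory and,
# with refit=True (the default), fits one final model on the best
# parameters at the end -- no per-candidate models are saved to disk.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1.0, 10.0]},
    cv=3,
)
search.fit(X, y)

print(search.best_params_)  # only the winning parameters are kept
```

This is the behaviour I'd like to reproduce with `Trainer.hyperparameter_search`.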
I have tried setting `overwrite_output_dir=True` in `TrainingArguments`, but this doesn't seem to reduce the output at all.
I apologize if this is covered somewhere in the documentation, but I have not been able to find it.
```python
model_training_arguments = TrainingArguments(
    "./model_output",
    evaluation_strategy="epoch",
    fp16=True,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    num_train_epochs=4,
    seed=2,
    load_best_model_at_end=True,
    overwrite_output_dir=True,
)

model_trainer = Trainer(
    model_init=model_init,
    args=model_training_arguments,
    train_dataset=train_dataset,
    eval_dataset=valid_dataset,
    tokenizer=tokenizer,
    compute_metrics=metrics,
)

model_trainer.hyperparameter_search(
    direction="minimize",
    backend="optuna",
    n_trials=5,
)
```
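In case it helps: if I understand the docs correctly, `save_total_limit` caps how many checkpoints are kept in `output_dir` (older ones get deleted), which sounds closer to what I need than `overwrite_output_dir`. An untested sketch of the arguments I'm considering (the specific values are my guesses):

```python
model_training_arguments = TrainingArguments(
    "./model_output",
    evaluation_strategy="epoch",
    save_strategy="epoch",   # checkpointing cadence must match evaluation
                             # when load_best_model_at_end=True
    save_total_limit=1,      # delete older checkpoints, keeping only the
                             # most recent (plus the best, when tracked)
    load_best_model_at_end=True,
    ...
)
```

Can anyone confirm whether this keeps the search within the disk quota, or whether each trial still writes its own checkpoints?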
Thanks in advance!