Save only best model in Trainer

I am using the code below:

  from transformers import Trainer, TrainingArguments

  args = TrainingArguments(
      output_dir=f"./out_fold{i}",
      overwrite_output_dir=True,
      evaluation_strategy="steps",
      eval_steps=40,
      logging_steps=40,
      learning_rate=5e-5,
      per_device_train_batch_size=8,
      per_device_eval_batch_size=8,
      num_train_epochs=10,
      seed=0,
      save_total_limit=1,
      # report_to="none",
      # logging_steps='epoch',
      load_best_model_at_end=True,
      save_strategy="no",
  )
  trainer = Trainer(
      model=model,
      args=args,
      train_dataset=train_dataset,
      eval_dataset=val_dataset,
      # compute_metrics=compute_metrics,
      # callbacks=[EarlyStoppingCallback(early_stopping_pa)],
  )
  trainer.train()
  trainer.save_model(f'out_fold{i}')

Here, even though save_strategy = "no", checkpoints are still being written to disk from the start of training (see the log below), and the disk eventually fills up. Can you suggest what's going wrong?

***** Running Evaluation *****
Num examples = 567
Batch size = 8
Saving model checkpoint to ./out_fold0/checkpoint-40
Configuration saved in ./out_fold0/checkpoint-40/config.json
Model weights saved in ./out_fold0/checkpoint-40/pytorch_model.bin
Deleting older checkpoint [out_fold0/checkpoint-760] due to args.save_total_limit
***** Running Evaluation *****
