Does Trainer load checkpoints from previous fold in k-fold Cross Validation?

I am running k-fold cross validation for fine-tuning a pre-trained model. I have set up my trainer with load_best_model_at_end=True and each fold runs for 30 epochs.

  1. Does training start - from the second fold onwards - from the best model obtained in the previous fold (i.e. the best checkpoint after that fold's 30 epochs), since load_best_model_at_end is set to True?

  2. How can I disable all Hugging Face caches? I don't want a fold to reuse any configuration or data - such as checkpoints - from the previous fold (there is a rough sketch of what I mean after the snippet). My sample code snippet is as follows:

from transformers import AutoConfig, Trainer, TrainingArguments

for fold in range(100):
	# config
	config = AutoConfig.from_pretrained(
		pretrained_model,
		num_labels=no_of_labels,  # AutoConfig expects num_labels
		label2id={label: i for i, label in enumerate(labels)},
		id2label={i: label for i, label in enumerate(labels)},
	)

	model = SomeModelClass.from_pretrained(
		pretrained_model,
		config=config,
		ignore_mismatched_sizes=True
	)

	model.to(device)

	training_args = TrainingArguments(
		output_dir=".\\checkpoints\\",
		per_device_train_batch_size=64,
		per_device_eval_batch_size=128,            
		gradient_accumulation_steps=2, 
		num_train_epochs=30,
		learning_rate=5e-5,
		weight_decay=0.0001,
		warmup_ratio=0.1,
		gradient_checkpointing=True,
		fp16=True,
		evaluation_strategy="epoch",
		save_strategy="epoch",
		logging_steps=500,
		report_to=["tensorboard"],
		logging_dir=".\\logs\\",
		load_best_model_at_end=True,
		metric_for_best_model="accuracy",
		greater_is_better=True,  # accuracy is a higher-is-better metric
		push_to_hub=False,
	)

	trainer = Trainer(
		model=model,
		args=training_args,
		train_dataset=train_dataset,
		eval_dataset=eval_dataset,
		compute_metrics=compute_metrics,
		tokenizer=feature_extractor,
	)

	train_result = trainer.train()
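
To make question 2 more concrete, here is a rough sketch of what I have in mind: give every fold its own checkpoint directory, delete it when the fold is done, and switch off the datasets cache. The run_fold helper, the fold_{fold} paths, and the trimmed-down TrainingArguments are just placeholders of mine, not something I have verified:

import shutil
from pathlib import Path

from datasets import disable_caching
from transformers import Trainer, TrainingArguments

# Switch off the datasets on-disk cache so preprocessed data is never reused between folds.
disable_caching()

def run_fold(fold, model, train_dataset, eval_dataset, compute_metrics, feature_extractor):
	# Each fold gets its own checkpoint directory so folds cannot see each other's checkpoints.
	fold_dir = Path("checkpoints") / f"fold_{fold}"

	training_args = TrainingArguments(
		output_dir=str(fold_dir),
		num_train_epochs=30,
		evaluation_strategy="epoch",
		save_strategy="epoch",
		load_best_model_at_end=True,  # as I understand it, this only restores the best checkpoint of this run
		metric_for_best_model="accuracy",
		greater_is_better=True,
	)

	trainer = Trainer(
		model=model,
		args=training_args,
		train_dataset=train_dataset,
		eval_dataset=eval_dataset,
		compute_metrics=compute_metrics,
		tokenizer=feature_extractor,
	)
	trainer.train()

	# Remove this fold's checkpoints afterwards so nothing survives into the next fold.
	shutil.rmtree(fold_dir, ignore_errors=True)

Is this the right approach, or is there a more complete way to clear every Hugging Face cache between folds?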