I restart training with this:
```python
trainer = transformers.Trainer(
    model=model,
    train_dataset=data["train"],
    args=transformers.TrainingArguments(
        save_steps=100,  # 500 -> 100 when restarting with the resume_from_checkpoint option
        ...
    ),
)
trainer.train(resume_from_checkpoint="/checkpoint_path")
```
I tried changing `save_steps` in TrainingArguments from 500 to 100, but it is not applied… Is it impossible?
Hello!
If I understand your question correctly, you are resuming from a checkpoint and you want to start saving checkpoints from there in increments of 100 instead of 500? Tell me if this example is what you want.
Example of your directory:
->checkpoint1 - 500
->checkpoint2 - 1000
->checkpoint3 - 1500
->checkpoint4 - 2000
Example of desired directory:
->checkpoint1 - 500
->checkpoint2 - 1000
->checkpoint3 - 1500
->checkpoint4 - 2000
->checkpoint5 - 2100
->checkpoint6 - 2200
->checkpoint7 - 2300 …
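The save schedule in the directory listing above can be illustrated with a small pure-Python sketch. Note that `checkpoint_steps` is a hypothetical helper for illustration only, not part of the Trainer API:

```python
# Hypothetical helper (not a Trainer API): compute the global steps at which
# checkpoints would be written after resuming from `resume_step` with a
# new `save_steps` interval, up to `max_steps`.
def checkpoint_steps(resume_step, save_steps, max_steps):
    steps = []
    step = resume_step + save_steps
    while step <= max_steps:
        steps.append(step)
        step += save_steps
    return steps

# Resuming from step 2000 with save_steps=100 gives the desired directory above:
print(checkpoint_steps(2000, 100, 2300))  # [2100, 2200, 2300]
```

The same logic applied to a resume at step 500 would give checkpoints at 600, 700, 800, … — which matches the result described below.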
The result I want:
->checkpoint1 - 500 (first training args)
->checkpoint2 - 600 (after restarting with save_steps changed)
->checkpoint3 - 700
->checkpoint4 - 800
…
but the `save_steps` option is still 500:
->checkpoint1 - 500
->checkpoint2 - 1000
…