I'm using Trainer to handle finetuning a GPT2 model. I see in TrainingArguments there is a max_steps argument that overrides num_train_epochs.
For a batch size of 32, is setting max_steps=1000000 the equivalent of setting num_train_epochs so that the model sees 1,000,000 × 32 = 32,000,000 examples?
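To spell out the arithmetic I'm assuming (the dataset size here is a made-up placeholder, not my actual data):

```python
# Hypothetical numbers to illustrate the steps <-> epochs conversion I have in mind.
dataset_size = 4_000_000   # placeholder; my real dataset differs
batch_size = 32
max_steps = 1_000_000

examples_seen = max_steps * batch_size            # total examples processed over training
steps_per_epoch = dataset_size // batch_size      # optimizer steps in one pass over the data
equivalent_epochs = examples_seen / dataset_size  # how many epochs max_steps corresponds to

print(examples_seen, steps_per_epoch, equivalent_epochs)
```

With these placeholder numbers, 1,000,000 steps would work out to 8 full epochs. Is that the right way to think about it?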
Also, what happens if I have a batch size of 6 but my number of training examples isn't divisible by 6? Does Trainer stop training after the nearest divisible whole number of examples, or does it change the batch size for the final batch?
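To make that second question concrete, here's the arithmetic with hypothetical numbers (100 examples is just for illustration):

```python
import math

# Hypothetical numbers: 100 examples with a batch size of 6 doesn't divide evenly.
dataset_size = 100
batch_size = 6

full_batches = dataset_size // batch_size   # complete batches of 6
remainder = dataset_size % batch_size       # examples left over after the full batches
# If the trailing partial batch is kept, there is one extra, smaller batch:
batches_if_partial_kept = math.ceil(dataset_size / batch_size)

print(full_batches, remainder, batches_if_partial_kept)
```

So with 100 examples I'd get 16 full batches plus 4 leftover examples, and I'm asking whether those 4 are trained on as a smaller batch or skipped.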
Thanks in advance!!