Continuing on this topic (@dblakely, thank you for your answers).
When I set max_steps high enough that multiple epochs of training (or, more precisely, finetuning starcoderbase-3b) should occur, only 1 epoch is reported at the end. It also doesn’t seem that this logger prints anything. Maybe that is because I am using wandb (as it was used in the finetune script from the starcoder repo)?
The output at the very end of training is this:
{'loss': 0.0569, 'learning_rate': 0.0, 'epoch': 1.0}
{'eval_loss': 0.49919337034225464, 'eval_runtime': 40.2624, 'eval_samples_per_second': 1.267, 'eval_steps_per_second': 1.267, 'epoch': 1.0}
{'train_runtime': 328146.1378, 'train_samples_per_second': 0.244, 'train_steps_per_second': 0.061, 'train_loss': 0.2867981531023979, 'epoch': 1.0}
followed by a summary from wandb. As per the previous post, I should get more than 10 epochs.
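
As a rough sanity check on these numbers, the logged throughput can be converted back into step and sample counts. This is only a sketch: it assumes the reported rates are averages over the whole run, and that one "sample" in the log corresponds to one dataset example (which may not hold if the script packs examples into fixed-length sequences). The `dataset_size` value is a hypothetical placeholder, not a number from my run.

```python
def estimate_run(train_runtime, steps_per_second, samples_per_second):
    """Recover approximate totals from the final Trainer log line."""
    total_steps = train_runtime * steps_per_second
    total_samples = train_runtime * samples_per_second
    samples_per_step = samples_per_second / steps_per_second
    return total_steps, total_samples, samples_per_step


def expected_epochs(total_samples, dataset_size):
    """Number of passes over the data, assuming one log 'sample' = one example."""
    return total_samples / dataset_size


# Values copied from the final log line above.
steps, samples, per_step = estimate_run(328146.1378, 0.061, 0.244)
print(f"steps ~ {steps:.0f}, samples ~ {samples:.0f}, samples/step ~ {per_step:.1f}")

# Hypothetical dataset size, for illustration only.
dataset_size = 8000
print(f"expected epochs ~ {expected_epochs(samples, dataset_size):.1f}")
```

With the logged rates this works out to roughly 20,000 steps at about 4 samples per step, so the expected epoch count depends entirely on the real dataset size plugged in.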