This is an appreciation post for the --resume_from_checkpoint feature,
which actually works with DeepSpeed
and iterates over the data up to the checkpointed iteration, so the resumed run mimics the original experiment.
We can even sync the loss in wandb.
A big thumbs up to @sgugger.
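
For anyone curious what "iterating the data up to a certain iteration" means, here is a toy sketch in plain Python. It is not the Trainer's actual code, and the names `batches`, `resume_batches`, and `steps_done` are made up for illustration. The idea, as I understand it, is that replaying the same seeded data order and discarding the batches already consumed lets the resumed run see the same batches the original run would have seen.

```python
import itertools
import random


def batches(seed, num_batches, batch_size=4):
    """Yield deterministic toy batches; the seed fixes the data order."""
    rng = random.Random(seed)
    for _ in range(num_batches):
        yield [rng.randint(0, 99) for _ in range(batch_size)]


def resume_batches(seed, num_batches, steps_done):
    """Replay the same stream and discard the batches seen before the checkpoint."""
    # Hypothetical helper: steps_done stands in for the step count stored in a checkpoint.
    return itertools.islice(batches(seed, num_batches), steps_done, None)


# An uninterrupted run versus a run resumed after 6 steps.
full_run = list(batches(seed=0, num_batches=10))
resumed = list(resume_batches(seed=0, num_batches=10, steps_done=6))
```

The resumed stream lines up with the tail of the uninterrupted one, which is why the loss curve can continue where it left off instead of restarting on already-seen data.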