This is an appreciation post for the --resume_from_checkpoint feature, which actually works with DeepSpeed and iterates through the data up to the checkpointed iteration, so the resumed run mimics the original experiment.
The loss even syncs in wandb.
A big thumbs up for @sgugger.