The original answer on SO referenced the correct example in the Hugging Face docs, but described it incorrectly. Namely, the parameter `resume_from_checkpoint` should be passed to the `train` call rather than to the init.
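For anyone landing here, a minimal sketch of what that looks like. The helper name `resume_training` is mine, not part of the library, and it assumes `trainer` is an already constructed `transformers.Trainer` (or anything with the same `train()` method):

```python
from typing import Any, Union


def resume_training(trainer: Any, checkpoint: Union[str, bool] = True) -> Any:
    """Resume a run by handing the checkpoint to train(), not to the init.

    `trainer` is assumed to be a transformers.Trainer that was built as usual.
    `checkpoint` is either a path to a checkpoint directory, or True to let
    the Trainer pick up the last checkpoint found in its output directory.
    """
    # The checkpoint goes here, as an argument of the train() call.
    return trainer.train(resume_from_checkpoint=checkpoint)
```

With a real trainer this would be called as `resume_training(trainer, "out/checkpoint-500")`, where the path is only a placeholder for your own checkpoint directory, or simply `resume_training(trainer)` to continue from the latest checkpoint.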