I used to use the checkpoint callback in Keras. Is there an alternative in Hugging Face?
If I re-run the training cell, it continues from the last loss, so does that mean it is saved automatically?
Could anyone explain how Hugging Face saves partial checkpoints, so that I can continue training later from that point?
Yes, you can control how checkpoints are handled through the Trainer
class. Have a read through the documentation, which should help you.
Thanks, BramVanroy
I think this documentation is for PyTorch, and I am currently using TensorFlow.
So does that mean there is no such solution in TensorFlow yet?
I am not familiar with the TF code base of the library, but it seems that some checkpointing is implemented:
Maybe someone else can chime in, who knows more about TF.
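One hedged workaround, assuming you are training a Hugging Face TF model with Keras's `fit()`: the TF model classes subclass `tf.keras.Model`, so the familiar Keras `ModelCheckpoint` callback should still apply. The tiny model and dummy data below are illustrative stand-ins, not anything from this thread:

```python
# Sketch: reusing the Keras ModelCheckpoint callback.
# The toy model stands in for a HF TF model such as one returned by
# TFAutoModelForSequenceClassification.from_pretrained(...).
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(2)])
model.compile(optimizer="adam", loss="mse")

# Saves weights after every epoch; the {epoch} pattern gives one file per epoch.
ckpt_cb = tf.keras.callbacks.ModelCheckpoint(
    filepath="tf_ckpt/weights.{epoch:02d}.weights.h5",
    save_weights_only=True,
)

x = np.zeros((4, 3), dtype="float32")
y = np.zeros((4, 2), dtype="float32")
model.fit(x, y, epochs=2, callbacks=[ckpt_cb], verbose=0)

# To continue later: rebuild the model, then restore the saved weights.
model.load_weights("tf_ckpt/weights.02.weights.h5")
```

Note that this only restores model weights, not the optimizer state, so it is a lighter-weight resume than the PyTorch Trainer's checkpointing.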