Could someone who has done this before please explain?

Here is what I did: I ran the training for 11.5 hours on Kaggle's free P100 GPU, saving checkpoints and keeping only the most recent one with save_total_limit=1.

The session ended, so I started a new session and loaded the saved checkpoint using:

import os
import transformers

last_checkpoint = None
if os.path.isdir(training_args.output_dir) and not training_args.overwrite_output_dir:
    last_checkpoint = transformers.trainer_utils.get_last_checkpoint(training_args.output_dir)
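
For context, here is a rough standard-library sketch of what I understand that lookup to do: pick the `checkpoint-<step>` folder with the highest step number inside the output directory. The function name `find_last_checkpoint` is my own and not part of transformers, and the folder naming pattern is an assumption based on how the Trainer names its checkpoints.

```python
import os
import re

# Assumption: checkpoint folders follow the Trainer's "checkpoint-<step>" naming.
_CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def find_last_checkpoint(output_dir):
    """Return the checkpoint folder with the highest step number, or None."""
    if not os.path.isdir(output_dir):
        return None
    candidates = []
    for name in os.listdir(output_dir):
        match = _CHECKPOINT_RE.match(name)
        # Keep only directories whose name matches the expected pattern.
        if match and os.path.isdir(os.path.join(output_dir, name)):
            candidates.append((int(match.group(1)), name))
    if not candidates:
        return None
    # Compare step numbers as integers so that 1500 sorts above 900.
    return os.path.join(output_dir, max(candidates)[1])
```

If this returns None in the new session, the output directory from the previous session is probably not where the script is looking, and training would start from scratch.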

I also set ignore_data_skip=True to skip straight to the checkpoint, since the Trainer docs say: "If set to True, the training will begin faster (as that skipping step can take a long time)".

Then I started the training from the saved checkpoint:


But now the model is taking exactly the same time to train. It didn't start any faster, and it also started from step 0.
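
One check that might help narrow this down is reading the step recorded inside the checkpoint folder. As far as I understand, the Trainer writes a `trainer_state.json` file with a `global_step` field into each checkpoint; both names are assumptions on my side, and `read_resume_step` is a hypothetical helper, not a transformers function.

```python
import json
import os


def read_resume_step(checkpoint_dir):
    """Return the global_step stored in the checkpoint's trainer_state.json, or None."""
    # Assumption: the state file sits directly inside the checkpoint folder.
    state_path = os.path.join(checkpoint_dir, "trainer_state.json")
    if not os.path.isfile(state_path):
        return None
    with open(state_path, encoding="utf-8") as f:
        state = json.load(f)
    # Returns None when the key is missing instead of raising.
    return state.get("global_step")
```

If this prints a non-zero step for the checkpoint but the progress bar still begins at 0, the checkpoint is most likely not being handed to the training call at all.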

So I'm confused. Am I doing something wrong, or does this flag not work as it should?