Checkpoints - still confused

Hi all, I have read the information about checkpoints, but I am still confused.

While training my model, I see the following messages:
Saving model checkpoint to repo-name/checkpoint-3500
Configuration saved in repo-name/checkpoint-3500/config.json
Model weights saved in repo-name/checkpoint-3500/pytorch_model.bin
Feature extractor saved in repo-name/checkpoint-3500/preprocessor_config.json
Deleting older checkpoint [repo-name/checkpoint-1500] due to args.save_total_limit

In my Hugging Face repository, I only see the following files:

.gitattributes
special_tokens_map.json
tokenizer_config.json
vocab.json

After the training was interrupted by a Colab disconnection, I tried to continue training from the last checkpoint, so I ran the following command:
trainer.train(resume_from_checkpoint=True)

I got an error stating that there is no valid checkpoint in the folder.
I was expecting to find folders like "checkpoint-XXX", but I do not see them anywhere. Could you please help me understand where the checkpoints are saved and how to retrieve them to continue training?
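For reference, this is a minimal sketch of how I am checking whether any checkpoint folders exist on the Colab disk (repo_name is the same string I pass as output_dir, and "checkpoint-" is the prefix I expect based on the log messages above):

import os

repo_name = "repo-name"  # placeholder for my actual output_dir value

# Look for checkpoint-* folders inside the output directory
if os.path.isdir(repo_name):
    entries = sorted(os.listdir(repo_name))
    checkpoints = [e for e in entries if e.startswith("checkpoint-")]
    print("All entries:", entries)
    print("Checkpoint folders:", checkpoints)  # in my case this prints an empty list
else:
    print(repo_name, "does not exist locally")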

Steps I followed:
training_args: I used the same definition for both training runs (same output_dir)

training_args = TrainingArguments(
    output_dir=repo_name,
    overwrite_output_dir=False,
    group_by_length=True,
    per_device_train_batch_size=8,
    evaluation_strategy="steps",
    num_train_epochs=30,
    fp16=True,
    gradient_checkpointing=True,
    save_steps=500,
    eval_steps=500,
    logging_steps=500,
    save_strategy="steps",
    learning_rate=1e-4,
    weight_decay=0.005,
    warmup_steps=1000,
    save_total_limit=2,
    logging_dir=repo_name,
)
Same definitions of the tokenizer and model in both training runs.

Same trainer:
from transformers import Trainer

trainer = Trainer(
    model=model,
    data_collator=data_collator,
    args=training_args,
    compute_metrics=compute_metrics,
    train_dataset=timit["train"],
    eval_dataset=timit["test"],
    tokenizer=processor.feature_extractor,
)

The only change was replacing the command "trainer.train()" used in the first run with
"trainer.train(resume_from_checkpoint=True)" to resume training from the checkpoint.
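From the Trainer documentation, I understand that resume_from_checkpoint can also take the path of a specific checkpoint folder instead of True, so once I can locate the folder I was planning to try something like this (a minimal sketch; "checkpoint-3500" is just an example number taken from the logs above):

# Resume from an explicit checkpoint directory instead of passing True.
# The folder has to actually exist under output_dir for this to work.
last_checkpoint = f"{repo_name}/checkpoint-3500"
trainer.train(resume_from_checkpoint=last_checkpoint)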

Could you please help me understand how to continue training from a checkpoint, and where the checkpoints are saved?
Thank you