Getting total train_runtime even if training stopped in the middle

Hi.
I need to record the total train_runtime of each model I train, preferably using only extensions of the Trainer class.

At first I customized the train function and saved its output, which includes the “train_runtime” key. However, in some cases training is terminated because of time limits, and I only restart it later. Training resumes correctly from the last saved checkpoint, but the “train_runtime” value reported at the end covers only the time since the restart and does not include the time the model trained before that.
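To make the desired behavior concrete, here is a minimal sketch of the bookkeeping I have in mind: a running total of training seconds that is written to a small JSON file and reloaded after a restart. The class name `RuntimeAccumulator`, the file layout, and the `total_train_runtime` key are all hypothetical and not part of the transformers library. It only illustrates the idea and is not a tested integration with the Trainer.

```python
import json
import time
from pathlib import Path


class RuntimeAccumulator:
    """Hypothetical helper: sums training time across restarts via a JSON file."""

    def __init__(self, path, clock=time.monotonic):
        self.path = Path(path)
        self.clock = clock            # injectable so the logic can be tested
        self.previous = self._load()  # seconds accumulated by earlier runs
        self.started_at = None        # set when the current run starts

    def _load(self):
        # Reuse the stored total if an earlier run left one behind.
        if self.path.exists():
            return json.loads(self.path.read_text())["total_train_runtime"]
        return 0.0

    def start(self):
        # Mark the beginning of the current run.
        self.started_at = self.clock()

    def total(self):
        # Earlier runs plus whatever the current run has spent so far.
        elapsed = 0.0 if self.started_at is None else self.clock() - self.started_at
        return self.previous + elapsed

    def save(self):
        # Persist the running total so a restarted process can pick it up.
        self.path.write_text(json.dumps({"total_train_runtime": self.total()}))
```

Presumably `start()` could be called from the `on_train_begin` hook of a `TrainerCallback` subclass and `save()` from `on_save` and `on_train_end`, so the stored total is updated whenever a checkpoint is written. Time spent after the last checkpoint would not be counted if the process is killed, which matches the training progress that is lost in that case.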

Could you please assist?
Thank you.
