Restarting gpt-2 finetuning after power failure

rgwatwormhill · November 10, 2020, 11:49am

[I am assuming that gpt-2 saving works in the same way as BERT saving. I am not an expert.]

Did you save the optimizer state dictionary?

To resume a previous training run, you need both the saved model state and the state of the optimizer. (The optimizer state takes up a surprisingly large amount of space: with an Adam-type optimizer it is roughly twice the size of the model weights, because it keeps two running averages for every parameter.)
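Here is a minimal sketch of saving and restoring both pieces in plain PyTorch. The tiny `nn.Linear` is only a stand-in for GPT-2, and the helper names `save_checkpoint` / `load_checkpoint` and the file name `checkpoint.pt` are made up for illustration. The Hugging Face Trainer uses its own checkpoint layout, which this does not try to reproduce.

```python
import os
import tempfile

import torch
from torch import nn


def save_checkpoint(path, model, optimizer, step):
    """Save everything needed to resume: weights, optimizer state, step count."""
    torch.save(
        {
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
            "step": step,
        },
        path,
    )


def load_checkpoint(path, model, optimizer):
    """Restore weights and optimizer state in place; return the saved step."""
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    return ckpt["step"]


# Stand-in model (assumption): a real run would build the GPT-2 model here.
model = nn.Linear(4, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# One training step so the optimizer actually has state to save.
loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()

ckpt_path = os.path.join(tempfile.mkdtemp(), "checkpoint.pt")
save_checkpoint(ckpt_path, model, optimizer, step=1)

# After a crash: rebuild the same objects, then load the saved state into them.
resumed_model = nn.Linear(4, 2)
resumed_optimizer = torch.optim.AdamW(resumed_model.parameters(), lr=5e-5)
resumed_step = load_checkpoint(ckpt_path, resumed_model, resumed_optimizer)
```

If the first run only called `model.save_pretrained(...)` or saved `model.state_dict()` on its own, the `"optimizer"` entry above is the part that is missing.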

If you don't have the optimizer state dict, you can still load the saved model from the model checkpoint, but you will need to start a new training run. You will probably need to estimate how far along the first run was and what learning rate it had reached.
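As a rough sketch of that estimate: assuming the first run used linear warmup followed by linear decay to zero (check what your script actually used), the learning rate at a given step can be worked out by hand. The function name and all the numbers below are hypothetical placeholders, not values from the original run.

```python
def linear_schedule_lr(step, peak_lr, warmup_steps, total_steps):
    """Learning rate at `step` for linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase: ramp up from 0 to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: ramp down from peak_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical numbers: the run crashed around step 4000 of a planned 10000,
# with a peak learning rate of 5e-5 and no warmup.
estimated_lr = linear_schedule_lr(step=4000, peak_lr=5e-5,
                                  warmup_steps=0, total_steps=10000)
print(estimated_lr)  # roughly 3e-05
```

The new run could then start from the saved weights with about this learning rate and only the remaining steps. The optimizer's running averages still start from zero, so the first few steps may behave a little differently from the interrupted run.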

This thread might help:
