How to continue pre-training GPT-2?

Hello, guys!
I want to further pre-train GPT-2 on a few specific texts, and the library provides scripts for this.

However, I found that the model in the scripts is loaded with AutoModelForCausalLM, while in my case the model is GPT2LMHeadModel. Can that model be used for continued pre-training as well?
If it can, should I use the script, or train my model directly?
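For example, I currently load the model like this, while run_clm.py uses the Auto class (a minimal sketch with the standard gpt2 checkpoint):

```python
from transformers import AutoModelForCausalLM, GPT2LMHeadModel

# How I load GPT-2 at the moment:
my_model = GPT2LMHeadModel.from_pretrained("gpt2")

# How the example script (run_clm.py) loads it:
script_model = AutoModelForCausalLM.from_pretrained("gpt2")
```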

Thanks a lot!

Hi @xiaoyaoyou :hugs: Of course you can use run_clm.py to fine-tune on your specific texts. Just point --model_name_or_path at gpt2 (plus whatever other options you need), and you can train the model for your downstream task. I'd suggest using this script to train the model, because it is easy to run with DDP (distributed data parallel).
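A typical invocation would look something like this (flags may vary slightly between versions of the example script; the corpus file, GPU count, and hyperparameters are placeholders):

```bash
# Continue pre-training from the released gpt2 checkpoint on your own text file.
# --model_name_or_path loads the pretrained weights; --model_type alone would
# train a model from scratch instead. torchrun launches one process per GPU (DDP).
torchrun --nproc_per_node=4 run_clm.py \
    --model_name_or_path gpt2 \
    --train_file my_domain_corpus.txt \
    --do_train \
    --per_device_train_batch_size 4 \
    --num_train_epochs 3 \
    --block_size 1024 \
    --output_dir gpt2-continued-pretrain
```

For what it's worth, AutoModelForCausalLM in the script simply resolves to GPT2LMHeadModel for a GPT-2 checkpoint, so it is the same model class you already have.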

In the case of a GPT-style model, is pre-training the same as fine-tuning, since the objective is the same: next-token prediction? Correct me if I am wrong. If I just use unlabeled data and continue training my GPT model, would the result be called a continued pre-trained model?
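To make my question concrete, this is the objective I have in mind; as far as I understand, the model returns the same next-token cross-entropy loss whether the text is called "pre-training data" or "fine-tuning data" (a minimal sketch; the input string is just a placeholder):

```python
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Unlabeled text is all that is needed: the labels are the input ids themselves,
# shifted by one position inside the model to form the next-token targets.
batch = tokenizer("Some raw text from my domain corpus.", return_tensors="pt")
outputs = model(input_ids=batch["input_ids"], labels=batch["input_ids"])

print(outputs.loss)  # causal LM cross-entropy, the same loss used in pre-training
```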