How to train gpt-2 from scratch? (no fine-tuning)

lewtun · January 23, 2021, 8:24am

Hi @iamnotapenguin, I would start by adapting the following script for causal language modelling to your dataset: transformers/run_clm.py at master · huggingface/transformers · GitHub

This script lets you specify both the tokenizer and the model architecture, and it supports multi-GPU training, which is advisable if you’re training from scratch.
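In case it's useful, here is a rough sketch of what "from scratch" looks like in code: the model is built from a config rather than loaded with `from_pretrained`, so the weights start out random. The tiny sizes below (`vocab_size=1000`, 2 layers, etc.) are made up so it runs quickly on CPU, and the random token ids are only a stand-in for a real tokenized dataset, so treat it as an illustration rather than a training recipe.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Hypothetical, deliberately tiny sizes (assumption: chosen only so this runs fast on CPU).
config = GPT2Config(
    vocab_size=1000,   # should match your own tokenizer's vocabulary size
    n_positions=128,   # maximum sequence length
    n_embd=64,         # hidden size
    n_layer=2,         # number of transformer blocks
    n_head=2,          # attention heads per block
)

# Instantiating from a config (no from_pretrained) gives randomly initialised weights.
model = GPT2LMHeadModel(config)

# Dummy batch of token ids standing in for real tokenized text.
input_ids = torch.randint(0, config.vocab_size, (2, 16))

# For causal LM training the labels are the inputs; the model shifts them internally.
outputs = model(input_ids=input_ids, labels=input_ids)
loss = outputs.loss
logits_shape = tuple(outputs.logits.shape)
num_params = sum(p.numel() for p in model.parameters())

print(f"parameters: {num_params}, loss: {loss.item():.3f}, logits: {logits_shape}")
```

With a real dataset you would swap in your own tokenizer and a proper training loop (or let the script above handle that part).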

Hope that helps!
