Size of GPT-2 in fine tuning tutorial

bosllm · March 23, 2023, 11:09pm

Hi! I’m following this tutorial to fine-tune GPT-2 on wikitext, but the tutorial doesn’t specify which size of GPT-2 is being fine-tuned. How can I find the size of the model, and is there a way to fine-tune a different size?

dblakely · March 27, 2023, 2:23pm

Hi there,

The --model_name_or_path=gpt2 argument passed to the script means it’s loading the default gpt2 model from Hugging Face. That would be this one, whose model card says “This is the smallest version of GPT-2, with 124M parameters.”

To change the size of the GPT-2 model you’re using, you can pass any of these GPT-2 model names to that argument:

gpt2
gpt2-large
gpt2-medium
gpt2-xl
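
If you want to sanity-check the sizes yourself, here is a rough sketch that estimates the parameter count from the architecture alone. It assumes the layer counts and hidden sizes published in each checkpoint's config (n_layer, n_embd), a vocabulary of 50257 tokens, 1024 positions, and an output layer tied to the token embeddings. The helper name count_gpt2_parameters is just something I made up for illustration; it isn't part of any library.

```python
# Assumed values: taken from the published GPT-2 configs, not from the tutorial.
VOCAB_SIZE = 50257   # size of the GPT-2 BPE vocabulary
N_POSITIONS = 1024   # maximum sequence length (learned position embeddings)

GPT2_SIZES = {
    "gpt2": {"n_layer": 12, "n_embd": 768},
    "gpt2-medium": {"n_layer": 24, "n_embd": 1024},
    "gpt2-large": {"n_layer": 36, "n_embd": 1280},
    "gpt2-xl": {"n_layer": 48, "n_embd": 1600},
}


def count_gpt2_parameters(n_layer, n_embd,
                          vocab_size=VOCAB_SIZE, n_positions=N_POSITIONS):
    """Estimate the parameter count of a GPT-2 style decoder (hypothetical helper)."""
    d = n_embd
    # Token and position embeddings; the output layer reuses the token embeddings.
    embeddings = vocab_size * d + n_positions * d
    # Attention: fused query/key/value projection plus output projection, with biases.
    attention = (d * 3 * d + 3 * d) + (d * d + d)
    # Feed-forward: expand to 4*d and project back to d, with biases.
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)
    # Two layer norms per block, each with a weight and a bias vector.
    layer_norms = 2 * (2 * d)
    per_block = attention + mlp + layer_norms
    # One final layer norm after the last block.
    final_layer_norm = 2 * d
    return embeddings + n_layer * per_block + final_layer_norm


for name, cfg in GPT2_SIZES.items():
    total = count_gpt2_parameters(**cfg)
    print(f"{name}: {total / 1e6:.0f}M parameters")
```

For gpt2 this comes out at about 124M, which matches the model card. If you have transformers installed, you can cross-check against the real thing by loading a checkpoint and calling model.num_parameters() on it.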

In general, the models available from Hugging Face have a short name/ID such as “gpt2”, “t5-3b”, etc., and you can use that name to look up documentation about that particular model version (how big it is, how it was trained, and so on) on huggingface.co.

That being said, I noticed that you linked to the Hugging Face documentation for version 2.0.0, which is several years old and contains a lot of outdated material. I’d recommend using the documentation and GitHub code from the latest release (4.27.3 at the time of writing). That’ll give you access to more models and better features, and it’ll be easier for people in the community to assist you.
