How do GPT2 pretrained models allow custom hyperparams?

gglass · March 10, 2021, 6:38pm

When creating a pretrained GPT2 model such as GPT2LMHeadModel(config).from_pretrained('gpt2'),
we can specify a config with custom parameters such as vocab_size, n_positions, n_embd, activation_function, and n_head.

How is it possible to use custom values for these on a pretrained model? For example, how could we choose the number of attention heads after the model has already been trained? Are there simply a large number of pretrained models, one for each config?
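
To make the n_head part of the question concrete, here is a minimal sketch in plain Python (no library calls). It assumes, as in the Hugging Face GPT-2 implementation as far as I understand it, that the attention projection weights are sized by n_embd alone and that the heads are just a reshape of that projection. The helper names gpt2_attention_shapes and head_dim are hypothetical, made up for illustration. The values 768 and 12 are the n_embd and n_head of the smallest 'gpt2' checkpoint.

```python
def gpt2_attention_shapes(n_embd, n_head):
    """Illustrative weight shapes for one GPT-2 attention block.

    Assumption: the fused query/key/value projection and the output
    projection depend only on n_embd; n_head only controls how the
    projected vector is split into heads.
    """
    if n_embd % n_head != 0:
        raise ValueError("n_embd must be divisible by n_head")
    return {
        "c_attn.weight": (n_embd, 3 * n_embd),  # fused q, k, v projection
        "c_attn.bias": (3 * n_embd,),
        "c_proj.weight": (n_embd, n_embd),      # output projection
        "c_proj.bias": (n_embd,),
    }


def head_dim(n_embd, n_head):
    """Width of each head after the projection is split."""
    return n_embd // n_head


default = gpt2_attention_shapes(n_embd=768, n_head=12)
fewer_heads = gpt2_attention_shapes(n_embd=768, n_head=6)
wider = gpt2_attention_shapes(n_embd=1024, n_head=16)

print(default == fewer_heads)                  # shapes are unaffected by n_head
print(default == wider)                        # changing n_embd changes the shapes
print(head_dim(768, 12), head_dim(768, 6))     # only the per-head width changes
```

Under that assumption, a different n_head would not by itself produce a shape mismatch with the stored weights, whereas a different n_embd would. Matching shapes say nothing about quality, though: the weights were trained with 12 heads, so splitting them differently changes what the model computes.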
