Hi,
It is really embarrassing that the process of fine-tuning is still not clean, so let me share what I mean by that.
If we follow the tutorial in this link (Fine-tune a pretrained model) but fine-tune "openai-gpt" instead of BERT, we first get an error in the tokenization step asking us to add a pad token, because the original tokenizer doesn't have one. That's not a big issue, since I just add a pad token (tokenizer.pad_token = pad_token where pad_token = "[pad]"). All goes well with tokenization.
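For the record, here is roughly what the tokenizer side looks like. This is a minimal sketch; I use add_special_tokens here, which is a slightly more explicit variant of the plain pad_token assignment above, and the "[pad]" string is just my choice of token:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openai-gpt")

# openai-gpt ships without a pad token, so register one explicitly.
# This grows the tokenizer's vocabulary by one entry.
tokenizer.add_special_tokens({"pad_token": "[pad]"})

# Padding now works during tokenization.
batch = tokenizer(
    ["a short example", "a slightly longer example sentence"],
    padding=True,
    truncation=True,
    return_tensors="pt",
)
```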
Here comes the crazy part. When I create the sequence classification model with model = AutoModelForSequenceClassification.from_pretrained("openai-gpt", num_labels=2), I again get an error during training that there is no padding token! So again I add it via model.config.pad_token_id = tokenizer.pad_token_id, but it seems that adding the token this way does not add an embedding for it!
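And here is a sketch of what seems to be needed on the model side on top of the config line above. The resize_token_embeddings call is my understanding of how the newly added token actually gets an embedding row, so take it as an assumption rather than an official recipe:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openai-gpt")
tokenizer.add_special_tokens({"pad_token": "[pad]"})  # grows the vocab by 1

model = AutoModelForSequenceClassification.from_pretrained(
    "openai-gpt", num_labels=2
)

# Tell the model which id is the padding id (the config line above) ...
model.config.pad_token_id = tokenizer.pad_token_id

# ... and also grow the embedding matrix so the new "[pad]" id actually
# has a row; without this the pad id points past the embedding table.
model.resize_token_embeddings(len(tokenizer))
```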
Now here is my simple question: since GPT models are autoregressive, I am not sure we really need [pad] tokens to learn, do we? If we really do, then is it too much to ask of the HuggingFace community to provide a blog post about these nuances in fine-tuning? Otherwise, all these blog posts on the relatively easier case of fine-tuning BERT are of no (in fact negative) use if someone has to spend many hours trying to figure out the small details.