I am using the Hugging Face fine-tuning notebook that was made for BERT, but I'm trying to convert it to use GPT-2. The training data is a chat log from Facebook, so there are no special tokens; each message is on its own line. I didn't add special tokens because I wanted the generated text to retain continuity, much like a book.
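In case it helps, here is roughly how I'm thinking about the data prep: concatenate all messages into one continuous stream and cut it into fixed-length blocks, so the model trains on the log as uninterrupted text rather than one message per example. This is just a sketch — whitespace "tokens" stand in for real GPT-2 BPE token ids, and the function name and block size are placeholders, not anything from the actual notebook:

```python
# Sketch of the data prep: join chat messages into one continuous
# stream, then cut it into fixed-length blocks for causal LM training.
# Whitespace splitting stands in for the real GPT-2 tokenizer here.

def chunk_messages(messages, block_size=8):
    """Concatenate messages and split the token stream into equal blocks."""
    stream = " ".join(messages).split()  # one continuous token stream
    # Drop the trailing remainder so every block is exactly block_size long.
    n_blocks = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size]
            for i in range(n_blocks)]

messages = [
    "hey are you coming tonight",
    "yeah should be there around eight",
    "cool see you then",
]
blocks = chunk_messages(messages, block_size=4)
```

With a real tokenizer you would tokenize first and chunk the ids the same way; the point is that no separator token is inserted between messages, which is what I mean by keeping continuity.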
Here is a link to the notebook
Open to any tips, thanks.