I am using the Hugging Face fine-tuning notebook that was made for BERT, but I'm trying to convert it to use GPT-2. The training data is a chat log from Facebook, so there are no special tokens; each message is on its own line. I didn't add special tokens because I wanted the generated text to retain continuity, much like a book.
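In case it helps, here is roughly how I'm thinking about the data prep: concatenate all messages into one continuous stream and cut it into fixed-length blocks, so the model trains on the log as uninterrupted text rather than one message per example. This is just a sketch — whitespace "tokens" stand in for real GPT-2 BPE token ids, and the function name and block size are placeholders, not anything from the actual notebook:

```python
# Sketch of the data prep: join chat messages into one continuous
# stream, then cut it into fixed-length blocks for causal LM training.
# Whitespace splitting stands in for the real GPT-2 tokenizer here.

def chunk_messages(messages, block_size=8):
    """Concatenate messages and split the token stream into equal blocks."""
    stream = " ".join(messages).split()  # one continuous token stream
    # Drop the trailing remainder so every block is exactly block_size long.
    n_blocks = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size]
            for i in range(n_blocks)]

messages = [
    "hey are you coming tonight",
    "yeah should be there around eight",
    "cool see you then",
]
blocks = chunk_messages(messages, block_size=4)
```

With a real tokenizer you would tokenize first and chunk the ids the same way; the point is that no separator token is inserted between messages, which is what I mean by keeping continuity.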
Here is a link to the notebook
Open to any tips, thanks.