Finetuning GPT2 with user defined loss

No. With a data collator, your dataset just returns plain texts (and labels), and the collator function receives a list of those samples, so it can encode all the texts of a batch together. That function can then build and return the proper dictionary for the model.
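Roughly, that could look like the sketch below. It is only an illustration, not the exact code you need: `make_collate_fn`, `toy_encode` and the `(text, label)` sample layout are names and assumptions I made up here, and `toy_encode` is a stand-in so the snippet runs without downloading anything. With a real tokenizer you would pass something like `lambda texts: tokenizer(texts, padding=True, truncation=True, return_tensors="pt")` instead (GPT2 has no pad token by default, so you would first set `tokenizer.pad_token = tokenizer.eos_token`).

```python
from typing import Callable, Dict, List, Tuple


def toy_encode(texts: List[str]) -> Dict[str, List[List[int]]]:
    """Hypothetical stand-in for a tokenizer: one id per word, padded with 0."""
    rows = [[len(word) for word in text.split()] for text in texts]
    width = max(len(row) for row in rows)
    return {
        "input_ids": [row + [0] * (width - len(row)) for row in rows],
        "attention_mask": [[1] * len(row) + [0] * (width - len(row)) for row in rows],
    }


def make_collate_fn(encode_batch: Callable[[List[str]], Dict]) -> Callable:
    """Build a collate function that encodes all texts of a batch together."""

    def collate(samples: List[Tuple[str, int]]) -> Dict:
        # The dataset yields raw (text, label) pairs; split them apart here.
        texts = [text for text, _ in samples]
        labels = [label for _, label in samples]
        # Encode the whole batch in one call so padding is per batch.
        batch = dict(encode_batch(texts))
        # Return one dictionary holding everything the model and loss need.
        batch["labels"] = labels
        return batch

    return collate


collate = make_collate_fn(toy_encode)
samples = [("hello world", 0), ("a longer example text", 1)]
batch = collate(samples)
```

You would then hand `collate` to `torch.utils.data.DataLoader(dataset, collate_fn=collate)` (or as `data_collator` to the `Trainer`), and in the real version convert the labels to a tensor before returning them.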