How to train GPT-2 for text summarization?

There’s a mention that you could simply add “TL;DR” at the end of the input text to get summaries. I didn’t try it myself, but check it out.

Although GPT-2 was trained as an auto-regressive language model, you can make it generate summaries by appending “TL;DR” to the end of the input text.
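For reference, here’s a minimal sketch of what that prompting trick could look like with the transformers library. The model size, the sample text, and the generation settings are illustrative choices, not anything from the original posts:

```python
# Zero-shot "TL;DR" summarization sketch with GPT-2 (illustrative settings).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

article = "Your input text goes here."  # placeholder article
prompt = article + "\nTL;DR:"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=60,                    # cap the summary length
    do_sample=True,                       # sampling often beats greedy decoding here
    top_k=50,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token by default
)
# Decode only the tokens generated after the prompt.
summary = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(summary)
```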

Please note that GPT-2 is a decoder-only model, not an encoder-decoder one, so the architecture is probably not the best fit for generating summaries. For the same reason, I think you cannot use the Seq2Seq trainer or data collator for fine-tuning it.
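If you still want to fine-tune GPT-2 directly, one common workaround (my addition, not from the replies above) is to concatenate the article and the summary into a single “article TL;DR: summary” string and train with the ordinary causal-LM collator instead of the Seq2Seq one. A rough sketch, where the dataset, column names, and hyperparameters are all assumptions for illustration:

```python
# Hedged sketch: fine-tuning GPT-2 as a plain causal LM on summarization pairs.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Assumed dataset with "article" and "highlights" columns (CNN/DailyMail layout).
dataset = load_dataset("cnn_dailymail", "3.0.0", split="train[:1%]")

def to_lm_example(batch):
    # Concatenate input and target into one next-token-prediction training string.
    texts = [
        f"{article}\nTL;DR: {summary}{tokenizer.eos_token}"
        for article, summary in zip(batch["article"], batch["highlights"])
    ]
    return tokenizer(texts, truncation=True, max_length=512)

tokenized = dataset.map(to_lm_example, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-tldr", per_device_train_batch_size=2),
    train_dataset=tokenized,
    # mlm=False gives the plain causal (next-token) objective GPT-2 expects.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Note that this naive setup also computes the loss over the article tokens, not just the summary, which is one more reason an encoder-decoder model tends to be a cleaner fit.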

Here’s one article on fine-tuning the model:

…but for summarization you are probably better off fine-tuning, for example, Flan-T5 (such as “google/flan-t5-small”) by Google; a rough sketch of what that could look like is below. Good luck :slight_smile:
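As a sketch of the Flan-T5 route, this is one way the Seq2Seq trainer and data collator mentioned earlier could be used. Again, the dataset, the “summarize:” prefix, and the hyperparameters are placeholder assumptions:

```python
# Minimal sketch: fine-tuning Flan-T5 with the Seq2Seq tooling (illustrative setup).
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")

# Assumed dataset with "article" and "highlights" columns (CNN/DailyMail layout).
dataset = load_dataset("cnn_dailymail", "3.0.0", split="train[:1%]")

def preprocess(batch):
    # T5-style task prefix; the encoder sees the article, the decoder the summary.
    inputs = tokenizer(
        ["summarize: " + a for a in batch["article"]],
        truncation=True, max_length=512,
    )
    labels = tokenizer(
        text_target=batch["highlights"], truncation=True, max_length=128
    )
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="flan-t5-summarizer", per_device_train_batch_size=4
    ),
    train_dataset=tokenized,
    # Pads inputs and labels separately, as encoder-decoder training expects.
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```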