LM few-shot learning and fine-tuning on a summarization task

Hello there,

I am new to the forum and to NLP in general. I am starting this topic to understand more about language models and how Hugging Face can be used for few-shot learning and fine-tuning.

I am interested in the text summarization task. I know there are already pre-trained models such as BART, T5, and Pegasus that perform summarization quite well, and I have already played with them using the Hugging Face Transformers library. I also know there are community notebooks such as this and that, and GitHub issue #4406 (I could not link it since I am new), on how to fine-tune these models for more specific summarization tasks that differ from the common CNN/DailyMail corpus.
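For context, this is roughly how I have been playing with the pre-trained models so far (the checkpoint name is just the BART one I happened to try; as far as I understand, T5 and Pegasus checkpoints can be used the same way through the pipeline):

```python
from transformers import pipeline

# summarization pipeline with one of the pre-trained checkpoints I tried
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

article = "..."  # placeholder: the long document to summarize goes here
result = summarizer(article, max_length=130, min_length=30, do_sample=False)
print(result[0]["summary_text"])
```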

What I would like to understand is how to use a generic LM for text summarization. T5 and BART have a ForConditionalGeneration class; however, models like BERT, FlauBERT, GPT, GPT-2, and XLM do not have this class, only an LM head. I have read that the GPT-3 LM can perform any given task by just "looking" at some examples, and I am wondering if there is a way to do the same with the LMs I just cited. Moreover, most of the summarization discussions focus on BART and T5, and I could not find any guide on how to actually fine-tune generic LM models (BERT, FlauBERT, GPT, GPT-2, XLM, etc.) on such a task.
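To make the question concrete, this is roughly what I imagine few-shot summarization with GPT-2 would look like: I build a GPT-3-style prompt out of a couple of (article, summary) pairs and let the LM head continue it. The prompt format and the generation settings here are just my guess, so I am not sure this is the intended usage:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# GPT-3-style prompt: a couple of worked examples, then the article I care about
prompt = (
    "Article: <first example article>\nSummary: <first example summary>\n\n"
    "Article: <second example article>\nSummary: <second example summary>\n\n"
    "Article: <the article I actually want summarized>\nSummary:"
)

inputs = tokenizer(prompt, return_tensors="pt")
prompt_len = inputs["input_ids"].shape[1]

output_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_length=prompt_len + 60,           # leave ~60 tokens for the summary
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,  # silence the missing-pad-token warning
)

# keep only what was generated after the prompt
print(tokenizer.decode(output_ids[0, prompt_len:], skip_special_tokens=True))
```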

I have used GPT-2 for text summarization by feeding it the article followed by the string "TL;DR:" (essentially the prompt above without the in-context examples), but the results are quite bad.

:sweat_smile: TL;DR: in short, my questions are: how can I do few-shot learning on summarization with LM models such as GPT-2 that only have an LM head class, using Hugging Face? And how can I properly fine-tune them on such a task?
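For the fine-tuning part, here is a rough sketch of what I imagine it could look like for GPT-2: treat article + " TL;DR: " + summary as one sequence and train with the usual causal LM objective. The toy dataset and hyperparameters are made up, and I am not sure whether the loss should also be masked on the article tokens (e.g. by setting those labels to -100), so corrections are very welcome:

```python
import torch
from torch.optim import AdamW
from torch.utils.data import Dataset, DataLoader
from transformers import GPT2LMHeadModel, GPT2Tokenizer

class TldrDataset(Dataset):
    """Turns (article, summary) pairs into single 'article TL;DR: summary' sequences."""
    def __init__(self, pairs, tokenizer, max_length=512):
        self.examples = []
        for article, summary in pairs:
            text = article + " TL;DR: " + summary + tokenizer.eos_token
            enc = tokenizer(text, truncation=True, max_length=max_length, return_tensors="pt")
            self.examples.append(enc["input_ids"].squeeze(0))

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        ids = self.examples[idx]
        # labels == input_ids: plain next-token prediction over the whole sequence
        # (not sure if the article part should be masked out with -100 instead)
        return {"input_ids": ids, "labels": ids.clone()}

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# toy (article, summary) pairs, only to show the shape of the data
pairs = [
    ("some long article text ...", "a short summary ..."),
    ("another long article ...", "another summary ..."),
]

# batch_size=1 so I do not have to worry about padding in this sketch
loader = DataLoader(TldrDataset(pairs, tokenizer), batch_size=1, shuffle=True)
optimizer = AdamW(model.parameters(), lr=5e-5)

model.train()
for epoch in range(1):
    for batch in loader:
        outputs = model(input_ids=batch["input_ids"], labels=batch["labels"])
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```

Is this roughly the right direction, or is there a recommended recipe for this kind of model?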

P.S. The Hugging Face Transformers library is amazing; I hope the community keeps thriving!
