Summarization task, looking for clarifications before getting started

Hi @neuralpat, no bother at all :slight_smile:

If I understood your original aim correctly, you’d like to perform summarization, right? As far as I know, you won’t be able to use the xlm-r model fine-tuned on token classification, since what you really need is a language modelling head to generate the summary.

How long are your documents? Depending on time / cost, I would still be tempted to run an experiment with the encoder-decoder approach just to get a feel for how well this baseline performs on your dataset. For example, the CNN / DailyMail dataset has articles that are longer than most Transformer models’ context size, yet the summaries are not so bad.
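If you want to try that baseline quickly, here’s a minimal sketch with the transformers summarization pipeline - the distilbart-cnn checkpoint is just an example I picked, swap in whatever seq2seq model suits you:

```python
# Minimal baseline sketch, assuming `transformers` is installed.
# The checkpoint below is only an example of a CNN/DailyMail-tuned model.
from transformers import pipeline

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

document = "..."  # replace with one of your documents

# truncation=True clips inputs that exceed the model's context size,
# which is exactly the limitation discussed above for long documents
summary = summarizer(document, max_length=128, min_length=30, truncation=True)
print(summary[0]["summary_text"])
```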

If length is really an issue, then you might want to check out the Longformer Encoder-Decoder (LED) model: allenai/led-base-16384 · Hugging Face, which can process up to 16k tokens :exploding_head:
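Just to show the mechanics, here’s a rough sketch of running LED on a long document. Keep in mind the base checkpoint isn’t fine-tuned for summarization, so you’d still want to fine-tune it on your data before judging the output; the generation parameters are only illustrative:

```python
# Rough sketch for LED on long inputs, assuming `transformers` and `torch`.
# allenai/led-base-16384 accepts inputs up to 16k tokens, but note it is a
# pretrained (not summarization-fine-tuned) checkpoint.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("allenai/led-base-16384")
model = AutoModelForSeq2SeqLM.from_pretrained("allenai/led-base-16384")

long_document = "..."  # a document far longer than the usual 512/1024 tokens

inputs = tokenizer(long_document, max_length=16384, truncation=True, return_tensors="pt")

# LED uses local attention by default; putting global attention on the first
# token is the usual recommendation for summarization-style tasks
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    **inputs,
    global_attention_mask=global_attention_mask,
    max_length=256,
    num_beams=4,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```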

There’s also a long thread here with a discussion related to your issue, so you might find some relevant ideas there: Summarization on long documents

What is generally true is that pretrained checkpoints can be fine-tuned on a variety of downstream tasks via transfer learning - perhaps this is what you had in mind?
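If that’s the route you take, a fine-tuning loop with Seq2SeqTrainer could look roughly like this - the toy dataset, column names and hyperparameters below are just placeholders for your own data:

```python
# Hedged fine-tuning sketch, assuming `transformers` (>=4.21 for text_target)
# and `datasets` are installed. Dataset contents and hyperparameters are toys.
from datasets import Dataset, DatasetDict
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
)

checkpoint = "allenai/led-base-16384"  # or any other seq2seq checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Tiny toy dataset just to make the sketch self-contained; replace with your data
dataset = DatasetDict({
    "train": Dataset.from_dict(
        {"document": ["long article text ..."], "summary": ["short summary ..."]}
    ),
    "validation": Dataset.from_dict(
        {"document": ["another article ..."], "summary": ["another summary ..."]}
    ),
})

def preprocess(batch):
    # Tokenize inputs and target summaries; adjust max_length to your documents
    model_inputs = tokenizer(batch["document"], max_length=4096, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=256, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True)

args = Seq2SeqTrainingArguments(
    output_dir="summarization-finetuned",
    per_device_train_batch_size=1,
    num_train_epochs=3,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```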
