Longformer for text summarization

Hello! :slight_smile: Does anyone know how to summarize long documents/news articles using Longformer? I am aware that with T5 the input limit is 512 tokens.

I would really appreciate any help in this area! Thank you :slight_smile:

Hi, it’s possible to use Longformer for summarization. The way it’s done now is to take a BART model and replace its self-attention with Longformer’s sliding-window attention so that it can take longer sequences. Check these two issues, first, second, and this branch of the longformer repo.
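To make the sliding-window part concrete, here's a small illustration using the plain encoder-only Longformer checkpoint that is already on the hub (the linked branch applies the same kind of attention inside BART's encoder layers). The checkpoint name and printed values are just the public allenai/longformer-base-4096 defaults, not something from the branch itself:

```python
from transformers import LongformerModel, LongformerTokenizer

tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

# Each layer attends over a fixed local window instead of the full sequence,
# which is what lets the model handle inputs of up to 4096 tokens.
print(model.config.attention_window)         # per-layer window sizes, e.g. 512 each
print(model.config.max_position_embeddings)  # 4096 plus RoBERTa's reserved positions
```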

Hi, I followed the example from that branch of the longformer repo, but it seems that the final output is a tensor instead of words/text. How can I convert it into words?

Note that these models are not yet fine-tuned for long-document summarization; you’ll need to fine-tune them yourself or wait until someone does. And yes, the model returns a tensor; to generate text you’ll need to use the generate method.

Here’s a nice blog post on the generate method.
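As a minimal sketch, here's how the tensor returned by generate is turned back into text. It uses a regular BART summarization checkpoint (facebook/bart-large-cnn) purely for illustration; the same decode step applies to the long-input variants once you have them:

```python
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

inputs = tokenizer("A long news article ...", return_tensors="pt", truncation=True)

# generate() returns a tensor of token IDs, not text.
summary_ids = model.generate(inputs["input_ids"], num_beams=4, max_length=128)

# Decode the IDs back into a string.
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])
```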

the links are broken :confused:

Hey @valhalla. Hope you’re well. In your earlier comment here you mention that Longformer for summarisation takes the BART model and replaces its self-attention. I was under the impression that this model was based on RoBERTa. Can you confirm whether there is a Longformer model based on BART and, if so, where it is on the hub?

Hi,

Longformer is an encoder-only Transformer (similar to BERT/RoBERTa); it only differs in its attention mechanism, which allows it to be used on longer sequences.

The authors also released LED (Longformer Encoder-Decoder), which is a seq2seq model (like BART or T5) but with Longformer as the encoder, allowing it to summarize long documents or translate long texts, for instance.

Weights are on the hub: Models - Hugging Face
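For reference, here's a rough sketch of running one of those LED checkpoints on a long document. The checkpoint name (allenai/led-base-16384) is the released base model, which isn't fine-tuned for summarization, so treat the output as a starting point; putting global attention on the first token follows the suggestion in the LED model card:

```python
import torch
from transformers import LEDTokenizer, LEDForConditionalGeneration

tokenizer = LEDTokenizer.from_pretrained("allenai/led-base-16384")
model = LEDForConditionalGeneration.from_pretrained("allenai/led-base-16384")

long_document = "..."  # your long article, up to 16384 tokens
inputs = tokenizer(long_document, return_tensors="pt", truncation=True, max_length=16384)

# Sliding-window attention everywhere, global attention on the first token.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,
    num_beams=4,
    max_length=256,
)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])
```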

Amazing, that’s very helpful, thank you. I can see on that link that Allen AI’s LED model is based on bart-base, which is ideal. If I were to try to convert bart-large to an LED, would this notebook still be the right approach, or is it only for encoder-only models?

I don’t have access to this notebook.

okay no problem. Thanks for your help!

Any updates? Can someone share the relevant code snippet from that private notebook publicly?
