Bart summarization

Good morning/evening
I am trying to understand how DistilBART generates summaries. What is the logic behind fine-tuning it with texts and their reference summaries? How does it learn to summarize to a specified length, using new words? The way I see it: I feed a text into the model, it gets encoded and then decoded with only the tokens containing the important information? How does the model spot the good sentence tokens?


You should read more about “Sequence to Sequence”.

BART is a seq2seq model: the input text is encoded with attention, and then the output text is generated token by token, with attention over the input and the output generated so far. Since the output is generated token by token, we can choose how many tokens we want to generate.
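To make the "token by token" idea concrete, here is a toy sketch of that generation loop. The `encode` and `next_token` functions are stand-ins (hypothetical, not the real BART encoder/decoder), but the structure is the same: encode the input once, then generate one token at a time, stopping at an end-of-sequence token or at the length limit.

```python
# Toy sketch of seq2seq generation. `encode` and `next_token` are
# stand-ins for the real encoder and decoder + softmax; a real model
# scores the whole vocabulary with attention over the encoded input
# and the tokens generated so far.

def encode(input_tokens):
    # Stand-in for BART's encoder: a real model returns hidden states.
    return {"encoded_input": input_tokens}

def next_token(encoded, generated):
    # Stand-in for the decoder: here we just pick the next input token
    # not yet emitted, to keep the example deterministic and runnable.
    remaining = [t for t in encoded["encoded_input"] if t not in generated]
    return remaining[0] if remaining else "<eos>"

def generate(input_tokens, max_length):
    encoded = encode(input_tokens)        # encoder runs once
    generated = []
    while len(generated) < max_length:    # the length limit is just a loop bound
        tok = next_token(encoded, generated)
        if tok == "<eos>":                # model can also stop early
            break
        generated.append(tok)
    return generated

print(generate(["the", "cat", "sat"], max_length=2))  # ['the', 'cat']
```

This is why you can request a summary of a given length: the loop simply stops once `max_length` tokens have been produced (in the `transformers` library this corresponds to the `max_length`/`min_length` generation arguments).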


Yes, as @colanim said. And learning from the masters is one of the best ways :slight_smile:
Here's Andrew Ng's video series on seq2seq models:

Even though it's not a Transformer, the big-picture concepts are applicable.


Thank you very much, I will!
