Is attention_mask needed for training Bart?

Hi, I’m experimenting with fine-tuning Bart for a summarization task.

I tried both “with attention_mask” and “without attention_mask”, and both seemed to work.

Could someone explain when to use attention_mask and why?

Thanks in advance.

This should be of help: Glossary — transformers 4.3.0 documentation
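
In short, attention_mask matters whenever your batch contains padding: it tells the model which positions are real tokens (1) and which are padding (0), so padded positions are ignored by self-attention. If every sequence in a batch has the same length (or you train with batch size 1 and no padding), leaving it out happens to give the same result, which is likely why both of your runs worked. Here is a minimal sketch, assuming the facebook/bart-base checkpoint as an example:

```python
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# Two documents of different lengths: padding brings them to the same length,
# and attention_mask marks real tokens (1) vs padding (0).
docs = ["A short article.", "A much longer article that needs more tokens to encode."]
summaries = ["Short.", "Longer summary."]

inputs = tokenizer(docs, padding=True, truncation=True, return_tensors="pt")
labels = tokenizer(summaries, padding=True, truncation=True, return_tensors="pt").input_ids

# With attention_mask, the encoder ignores padded positions; without it,
# padding tokens are attended to and can change the result for padded batches.
outputs = model(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    labels=labels,
)
print(outputs.loss)
```

(In a real training loop you would usually also replace padded label positions with -100 so they are excluded from the loss.)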
