I have a few questions about the BART paper.
In what ways can this paper be said to have “generalized” BERT and GPT?
Exactly what role does the encoder play in the BART pretraining structure?
In Section 2.2 (Pre-training), the paper says BART computes the cross-entropy between the decoder output and the original document.
Since the decoder input is provided via teacher forcing, doesn't this objective only train the decoder?
Or is there a separate loss for the encoder, computed between the decoder output and the masked input?
I wonder how the encoder learns anything about the masking from this setup.
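My current understanding is that a single cross-entropy loss on the decoder output is enough to train the encoder too, because the decoder conditions on the encoder's representations, so the gradient flows back through that connection. Here is a toy numerical check of that idea (this is my own illustrative sketch, not the paper's architecture: the "encoder" is just a mean of embeddings, the "decoder" a single linear layer, and all weight names are made up):

```python
import numpy as np

# Toy seq2seq: the encoder maps corrupted input tokens to a context vector;
# the teacher-forced decoder predicts each next token from
# [context, previous gold token embedding]. We check numerically that the
# single cross-entropy loss on the decoder output has a nonzero gradient
# w.r.t. the ENCODER weights, i.e. the encoder is trained by the same
# objective with no separate encoder-side loss.

rng = np.random.default_rng(0)
V, d = 5, 4                           # toy vocab size, hidden size
W_enc = rng.normal(size=(V, d))       # encoder embedding weights (hypothetical)
W_emb = rng.normal(size=(V, d))       # decoder input embeddings (hypothetical)
W_dec = rng.normal(size=(2 * d, V))   # decoder output projection (hypothetical)

src = [1, 3]                          # corrupted/masked input tokens
tgt_in, tgt_out = [0, 2], [2, 4]      # teacher-forced shift of the original doc

def loss(W_enc):
    ctx = W_enc[src].mean(axis=0)     # "encoder": mean of input embeddings
    total = 0.0
    for t_in, t_out in zip(tgt_in, tgt_out):
        h = np.concatenate([ctx, W_emb[t_in]])   # decoder sees encoder context
        logits = h @ W_dec
        p = np.exp(logits - logits.max())
        p /= p.sum()
        total += -np.log(p[t_out])    # cross-entropy vs. the original document
    return total

# Numerical gradient of the decoder's CE loss w.r.t. one encoder weight:
eps = 1e-5
W_perturbed = W_enc.copy()
W_perturbed[1, 0] += eps
g = (loss(W_perturbed) - loss(W_enc)) / eps
print(abs(g) > 1e-8)   # the encoder weight receives gradient from the decoder loss
```

If this picture is right, the encoder never needs its own loss against the masked input; it learns to encode the corrupted document usefully purely because the decoder's reconstruction loss depends on its output. Please correct me if I'm misreading Section 2.2.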
Section 3.3 contains the statement, "In both of these tasks, information is copied from the input but manipulated, …"
What does “manipulated” mean here?