Hi, I'm trying to fine-tune multilingual BART (mBART) on Korean to generate some text.
While passing my data into the model, I can't understand why the model's output shape differs from what I expected.
Settings: I used MBartTokenizer and BartForConditionalGeneration.
For batching, I used prepare_translation_batch to turn my data into batches of input_ids and target_ids.
I also need decoder_input_ids (in the form [tgt_lang_code, sequence, eos]), so I built them myself, as in the sketch below.
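Here is roughly what my setup looks like. This is just a minimal sketch: the checkpoint name, language code, and example sentences are placeholders rather than my real data, and the exact keys returned by prepare_translation_batch may differ between transformers versions.

```python
import torch
from transformers import MBartTokenizer, BartForConditionalGeneration

# Placeholder checkpoint; my actual fine-tuning setup is loaded the same way.
tokenizer = MBartTokenizer.from_pretrained("facebook/mbart-large-cc25")
model = BartForConditionalGeneration.from_pretrained("facebook/mbart-large-cc25")

src_texts = ["예시 입력 문장입니다."]   # example source sentence (placeholder)
tgt_texts = ["예시 목표 문장입니다."]   # example target sentence (placeholder)

# prepare_translation_batch builds the encoder inputs (input_ids, attention_mask)
# and the tokenized targets.
batch = tokenizer.prepare_translation_batch(
    src_texts=src_texts, src_lang="ko_KR",
    tgt_texts=tgt_texts, tgt_lang="ko_KR",
)

# decoder_input_ids built by hand as [tgt_lang_code, sequence, eos]
seq = tokenizer.encode("예시 목표 문장입니다.", add_special_tokens=False)
decoder_input_ids = torch.tensor(
    [[tokenizer.lang_code_to_id["ko_KR"], *seq, tokenizer.eos_token_id]]
)
```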
Here’s the problem.
BartForConditionalGeneration's forward pass requires input_ids. According to the BART docs, if I pass only input_ids to the model (optionally with attention_mask), the decoder has no input of its own, so it derives its input from input_ids.
In that case the shape of the returned prediction_scores should be (batch_size, seq_len, vocab_size), and it indeed comes out that way.
But when I pass input_ids and decoder_input_ids together, the shape of prediction_scores is always (batch_size, 1, vocab_size).
I'd expect it to be (batch_size, decoder_input_seq_len, vocab_size) in that case.
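This is roughly the forward call that shows the behavior (again a sketch, not my exact training code):

```python
# Case 1: encoder inputs only – the decoder input is derived from input_ids internally.
out = model(input_ids=batch["input_ids"], attention_mask=batch["attention_mask"])
print(out[0].shape)  # (batch_size, seq_len, vocab_size) – as expected

# Case 2: pass the hand-made decoder_input_ids as well.
out = model(
    input_ids=batch["input_ids"],
    attention_mask=batch["attention_mask"],
    decoder_input_ids=decoder_input_ids,
)
print(out[0].shape)  # what I get:      (batch_size, 1, vocab_size)
                     # what I expected: (batch_size, decoder_input_seq_len, vocab_size)
```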
I don't know why this happens. Maybe I've misunderstood the model entirely.
I opened this topic because I'd like a clearer view of this problem.
Any advice would be appreciated.
Thank you.