[Bart] Question about BART model output shape

Hi, I’m trying to fine-tune multilingual BART (mBART) on Korean to generate text.
When I pass my data into the model, I can’t understand why the output shape is different from what I expected.

Settings: I used MBartTokenizer and BartForConditionalGeneration.

For batching, I used prepare_translation_batch to build batches of input_ids and target ids.
I also need decoder_input_ids (of the form [tgt_lang_code, sequence, eos]), so I built them myself, as in the sketch below.
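Roughly, the setup looks like this (a minimal sketch assuming an older transformers release where MBartTokenizer still exposes prepare_translation_batch; the checkpoint name and the example sentences are just placeholders):

```python
import torch
from transformers import MBartTokenizer, BartForConditionalGeneration

# Placeholder checkpoint; substitute whichever mBART checkpoint you fine-tune.
tokenizer = MBartTokenizer.from_pretrained("facebook/mbart-large-cc25")
model = BartForConditionalGeneration.from_pretrained("facebook/mbart-large-cc25")

src_texts = ["placeholder source sentence"]   # Korean source text goes here
tgt_texts = ["placeholder target sentence"]   # Korean target text goes here

# Source/target ids via the (older) prepare_translation_batch API
batch = tokenizer.prepare_translation_batch(
    src_texts=src_texts,
    tgt_texts=tgt_texts,
    src_lang="ko_KR",
    tgt_lang="ko_KR",
)

# Manually build decoder_input_ids in the [tgt_lang_code, sequence, eos] layout
# (a single example here, so no padding is needed)
tgt_ids = tokenizer(tgt_texts, add_special_tokens=False)["input_ids"]
decoder_input_ids = torch.tensor(
    [[tokenizer.lang_code_to_id["ko_KR"]] + ids + [tokenizer.eos_token_id]
     for ids in tgt_ids]
)
```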

Here’s the problem.
BartForConditionalGeneration’s forward pass always requires input_ids. According to the BART docs, if I pass only input_ids (optionally with attention_mask), the decoder has no input of its own, so it falls back to using input_ids as its input.
In that case the returned prediction_scores have shape (batch_size, seq_len, vocab_size), and that worked as expected.
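For example, a call like this gives the expected full-length logits (continuing the sketch above):

```python
# Encoder input only: the decoder falls back to input_ids, and the logits
# cover the whole source sequence.
outputs = model(input_ids=batch["input_ids"],
                attention_mask=batch["attention_mask"])
print(outputs[0].shape)  # (batch_size, seq_len, vocab_size)
```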

But when I pass input_ids and decoder_input_ids together, the prediction_scores always come out with shape (batch_size, 1, vocab_size).
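That is, a call like this:

```python
# Same call as above, but with the manually built decoder_input_ids added.
outputs = model(input_ids=batch["input_ids"],
                attention_mask=batch["attention_mask"],
                decoder_input_ids=decoder_input_ids)
print(outputs[0].shape)  # (batch_size, 1, vocab_size) -- not what I expected
```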

I would expect that when I pass both inputs together, the prediction_scores should have shape (batch_size, decoder_input_seq_len, vocab_size).

I don’t know why this happens; maybe I’ve misunderstood the model entirely.
I opened this topic because I need a clearer view of the problem.
Any advice would be appreciated.
Thank you.

Pass use_cache=False to forward. Confusing, I know.
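Something along these lines (tensor names follow the sketch in the question):

```python
outputs = model(input_ids=batch["input_ids"],
                attention_mask=batch["attention_mask"],
                decoder_input_ids=decoder_input_ids,
                use_cache=False)  # disable the decoder cache
print(outputs[0].shape)  # (batch_size, decoder_input_seq_len, vocab_size)
```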

It worked. Thank you very much!