Hi, I'm trying to fine-tune multilingual BART (mBART) on Korean to generate some text.
While passing my data into the model, I can't understand why the model's output shape differs from what I expected.
Settings: I used MBartTokenizer and BartForConditionalGeneration.
For batching, I used prepare_translation_batch to turn my data into batches of input_ids and target_ids.
I also need decoder_input_ids (in the form [tgt_lang_code, sequence, eos]), so I built them myself, as in the sketch below.
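Here is roughly what my setup looks like. This is just a minimal sketch: the checkpoint name, language code, and example sentences are placeholders rather than my real data, and the exact keys returned by prepare_translation_batch may differ between transformers versions.

```python
import torch
from transformers import MBartTokenizer, BartForConditionalGeneration

# Placeholder checkpoint; my actual fine-tuning setup is loaded the same way.
tokenizer = MBartTokenizer.from_pretrained("facebook/mbart-large-cc25")
model = BartForConditionalGeneration.from_pretrained("facebook/mbart-large-cc25")

src_texts = ["예시 입력 문장입니다."]   # example source sentence (placeholder)
tgt_texts = ["예시 목표 문장입니다."]   # example target sentence (placeholder)

# prepare_translation_batch builds the encoder inputs (input_ids, attention_mask)
# and the tokenized targets.
batch = tokenizer.prepare_translation_batch(
    src_texts=src_texts, src_lang="ko_KR",
    tgt_texts=tgt_texts, tgt_lang="ko_KR",
)

# decoder_input_ids built by hand as [tgt_lang_code, sequence, eos]
seq = tokenizer.encode("예시 목표 문장입니다.", add_special_tokens=False)
decoder_input_ids = torch.tensor(
    [[tokenizer.lang_code_to_id["ko_KR"], *seq, tokenizer.eos_token_id]]
)
```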
Here’s the problem.
BartForConditionalGeneration's forward pass requires input_ids. According to the BART docs, if I pass only input_ids to the model (optionally with attention_mask), the decoder has no input of its own, so it derives its input from input_ids.
In that case the shape of the returned prediction_scores should be (batch_size, seq_len, vocab_size), and it indeed comes out that way.
But when I pass input_ids and decoder_input_ids together, the shape of prediction_scores is always (batch_size, 1, vocab_size).
I'd expect it to be (batch_size, decoder_input_seq_len, vocab_size) in that case.
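This is roughly the forward call that shows the behavior (again a sketch, not my exact training code):

```python
# Case 1: encoder inputs only – the decoder input is derived from input_ids internally.
out = model(input_ids=batch["input_ids"], attention_mask=batch["attention_mask"])
print(out[0].shape)  # (batch_size, seq_len, vocab_size) – as expected

# Case 2: pass the hand-made decoder_input_ids as well.
out = model(
    input_ids=batch["input_ids"],
    attention_mask=batch["attention_mask"],
    decoder_input_ids=decoder_input_ids,
)
print(out[0].shape)  # what I get:      (batch_size, 1, vocab_size)
                     # what I expected: (batch_size, decoder_input_seq_len, vocab_size)
```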
I don't know why this happens. Maybe I've misunderstood the model entirely.
I opened this topic because I'd like a clearer view of this problem.
Any advice would be appreciated.
Thank you.