Hi,
I am fine-tuning an mt5-small model on a custom dataset for a downstream task.
The model loss appears to be converging, but the output from `model.generate()` always starts with one specific token, e.g. `a`. For each input, I'm getting outputs like: "a <Generated Output for sentence 1>", "a <Generated Output for sentence 2>", …
I'm using beam search for decoding and have tried different beam widths and other parameters (like `repetition_penalty`, etc.), but the output from `generate()` still always starts with that same token `a`.
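For concreteness, these are roughly the decoding settings I've been varying (the values here are just illustrative, not my exact config), using the standard `transformers` `GenerationConfig`:

```python
from transformers import GenerationConfig

# Illustrative beam-search settings (example values, not my exact config)
gen_config = GenerationConfig(
    num_beams=4,              # have tried several beam widths
    repetition_penalty=1.2,   # and other sampling/penalty parameters
    early_stopping=True,
    max_new_tokens=64,
)

# This gets passed as model.generate(**inputs, generation_config=gen_config);
# the first generated token is the same regardless of these settings.
print(gen_config.num_beams)
```

Changing any of these parameters has no effect on the first token of the output.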
Is there any known reason why this might be happening?