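I think the change to the generation config that happens here might be the reason. The two dumps below differ in eos_token_id (a list of three IDs in the first, the single value 32000 in the second):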
GenerationConfig {
"bos_token_id": 1,
"eos_token_id": [
32000,
32001,
32007
],
"pad_token_id": 32000
}
(Pdb) n
> /opt/conda/lib/python3.10/site-packages/transformers/generation/utils.py(1284)_prepare_generation_config()
-> if new_generation_config != self.generation_config:
(Pdb) p new_generation_config
GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 32000,
"output_hidden_states": true,
"pad_token_id": 32000,
"return_dict_in_generate": true
}
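
To make the difference between the two dumps easier to see, here is a minimal sketch that diffs them as plain dictionaries. The values are copied from the pdb output above. `diff_configs` is a hypothetical helper written for this comment, not part of transformers, and the names `first_dump` and `new_generation_config_dump` are just labels for the two printed configs.

```python
def diff_configs(old, new):
    """Return {key: (old_value, new_value)} for every key whose value differs.

    A key that is absent on one side is reported as None on that side
    (fine here, since none of the dumped values is None).
    """
    changes = {}
    for key in sorted(set(old) | set(new)):
        old_value = old.get(key)
        new_value = new.get(key)
        if old_value != new_value:
            changes[key] = (old_value, new_value)
    return changes


# First config printed in the pdb session above.
first_dump = {
    "bos_token_id": 1,
    "eos_token_id": [32000, 32001, 32007],
    "pad_token_id": 32000,
}

# Output of `p new_generation_config` in the pdb session above.
new_generation_config_dump = {
    "bos_token_id": 1,
    "eos_token_id": 32000,
    "output_hidden_states": True,
    "pad_token_id": 32000,
    "return_dict_in_generate": True,
}

changes = diff_configs(first_dump, new_generation_config_dump)
for key, (old_value, new_value) in changes.items():
    print(f"{key}: {old_value!r} -> {new_value!r}")
```

Besides the two extra output flags, the only changed key is eos_token_id: 32001 and 32007 no longer appear in the second config.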