Packing issue, SFTTrainer

Hi,
I’m fine-tuning llama-v2 using SFTTrainer. When I set packing=False my model overall performance gets better but on inference it just cant stop, it generates words until it hits max_new_tokens. This as he could not learn eos token…
Does anyone have an idea what could be causing this?

Thanks
Tomek