- Use a different token for padding. For instance the eos_token that I already set the tokenizer to use in the snippet above.
I see. You can pass pad_token_id to the generate call to do this. For example:
generated_ids = model.generate(**encoding, pad_token_id=tokenizer.eos_token_id)
Or you can set it in the model’s generation config:
model.generation_config.pad_token_id = tokenizer.eos_token_id
As for this part:
- Change the padding side to right.
I’m still a bit confused about what you’re asking. During generation, the padding is already on the right in the outputs you posted. Or are you saying you want the tokenizer to pad the inputs on the right?
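To make sure we mean the same thing by "padding side", here is a toy sketch in plain Python of what left and right padding do to a batch. This is not the transformers implementation: pad_batch and the token ids are made up for illustration.

```python
def pad_batch(sequences, pad_id, side="right"):
    """Pad every sequence to the length of the longest one (hypothetical helper)."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    max_len = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        padding = [pad_id] * (max_len - len(seq))
        # Right padding appends pad tokens; left padding prepends them.
        padded.append(seq + padding if side == "right" else padding + seq)
    return padded


# Made-up token ids; 0 stands in for the pad token.
batch = [[5, 6, 7], [8, 9]]
right_padded = pad_batch(batch, pad_id=0, side="right")
left_padded = pad_batch(batch, pad_id=0, side="left")
print(right_padded)  # [[5, 6, 7], [8, 9, 0]]
print(left_padded)   # [[5, 6, 7], [0, 8, 9]]
```

If what you want is for the tokenizer to pad the inputs this way, the tokenizer has a padding_side attribute that can be set to "right" or "left". Keep in mind that decoder-only models usually need left padding for batched generation, so that each prompt ends right where the new tokens begin.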