Why does the falcon QLoRA tutorial code use eos_token as pad_token?

It seems like you know a lot about how this works. So, if setting tokenizer.pad_token = tokenizer.eos_token causes Falcon to keep generating text until it hits the cutoff length, how do you stop this from happening? Do you have time to provide a code snippet? All I can think of is:

raw_pad_token = "<pad>"
processed_token = tokenizer(raw_pad_token)
tokenizer.pad_token = processed_token

But based on this thread, that alone doesn't seem to be enough to make it work.
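
Would something along these lines be needed instead? This is just a sketch assuming the standard transformers add_special_tokens / resize_token_embeddings API, with tiiuae/falcon-7b used only as a placeholder checkpoint:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tiiuae/falcon-7b"  # placeholder checkpoint, not necessarily the one from the tutorial
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Register a dedicated pad token instead of reusing eos_token,
# so eos can still mark "stop generating" during training.
tokenizer.add_special_tokens({"pad_token": "<pad>"})

# Grow the embedding matrix to cover the new token and tell the model which id is padding.
model.resize_token_embeddings(len(tokenizer))
model.config.pad_token_id = tokenizer.pad_token_id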
