T5 decoder predicting tokens even after hitting the end-of-sequence token, i.e. </s>

No, that seems correct: the model has generated the end-of-sequence token (with ID=1), after which generation stops. One usually also passes skip_special_tokens=True to the batch_decode method in order to skip special tokens (like the end-of-sequence or padding tokens) in the decoded text:

generated_text = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
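For context, here's a minimal sketch of the full flow (the t5-small checkpoint and the translation prompt are just illustrative choices) showing that generation stops at </s> and what skip_special_tokens changes:

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Illustrative checkpoint; any T5 model works the same way
tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

input_ids = tokenizer(
    "translate English to German: The house is wonderful.",
    return_tensors="pt",
).input_ids

# generate() stops once the model emits the end-of-sequence token (</s>, ID=1)
generated_ids = model.generate(input_ids)

# Without skip_special_tokens, </s> (and any padding) appears in the output
print(tokenizer.batch_decode(generated_ids))

# With skip_special_tokens=True, only the actual text remains
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])
```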