Hello everyone,
I'm using T5 for summarization tasks in English; however, strange symbols are always generated at the end of the outputs.
This is the code:
summary_ids = model.generate(tokenized_text,
                             num_beams=3,
                             no_repeat_ngram_size=2,
                             min_length=300,
                             max_length=600)
output = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print("\n\nSummarized text:\n", output)
At the end of the output, there are symbols like:
á là és ê e †ô óà unà uneestànèàêm–asôsè1á2al
Is this normal? Is there a way to prevent it?