what happens if you specify model_max_length=512
when you load the tokenizer? i'd try that and then do a sanity check with tokenizer(text, truncation=True)
to make sure the truncation is working as expected
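
something like this, as a minimal sketch — the checkpoint name ("bert-base-uncased") and the sample text are just placeholders for whatever you're actually using:

```python
from transformers import AutoTokenizer

# placeholder checkpoint; swap in your own model
tokenizer = AutoTokenizer.from_pretrained(
    "bert-base-uncased",
    model_max_length=512,  # cap the sequence length at load time
)

# placeholder text that should exceed 512 tokens
text = "some long document " * 1000

# sanity check: with truncation enabled, the output is capped at model_max_length
encoded = tokenizer(text, truncation=True)
print(len(encoded["input_ids"]))  # expect 512
```

note that you still need truncation=True in the call itself — setting model_max_length alone will only warn about long sequences, not cut them.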