Token indices sequence length is longer than the specified maximum sequence length

Hi, when running the run_t5_mlm_flax.py script I am getting this warning:

Token indices sequence length is longer than the specified maximum sequence length for this model (523 > 512). Running this sequence through the model will result in indexing errors.

I have specified model_max_length=512 within the tokenizer,
and passed --max_seq_length="512" to the run_t5_mlm_flax.py script.

Unfortunately I still get the same warning.

Hi @antoine2323231 , can you try the following code to see if it works?

tokenizer(batch_sentences, padding='max_length', truncation=True)
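For context, the message is emitted at tokenization time whenever a sequence exceeds the tokenizer's model_max_length, and truncation=True is what caps it. A plain-Python sketch of that behaviour (illustrative only; the real logic lives inside `transformers`, and `encode` here is a hypothetical stand-in):

```python
MODEL_MAX_LENGTH = 512  # assumed model limit, as in the warning

def encode(token_ids, truncation=False, max_length=MODEL_MAX_LENGTH):
    """Return token ids, warning if the sequence exceeds the model limit."""
    if len(token_ids) > max_length:
        if truncation:
            # with truncation=True the tokenizer silently cuts the sequence
            return token_ids[:max_length]
        print(f"Token indices sequence length is longer than the specified "
              f"maximum sequence length ({len(token_ids)} > {max_length}).")
    return token_ids

ids = list(range(523))  # 523 tokens, as in the reported warning
assert len(encode(ids, truncation=True)) == 512   # truncated
assert len(encode(ids)) == 523                    # only warns, ids unchanged
```

Note that without truncation the warning is harmless at tokenization time; the indexing error only occurs if the over-length sequence is actually fed to the model.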

Hey lianghsun, I tried that but I'm getting the same result. It is strange…

I am also getting a similar error. Did you resolve this?

My workaround is to reduce the length of the prompt. For example, if you are doing question answering, the context + question should be shorter than 512 tokens.
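That budgeting step can be sketched like this (a hypothetical helper; `fit_qa_inputs` and the variable names are illustrative, not part of any library):

```python
MAX_LEN = 512  # model limit from the warning above

def fit_qa_inputs(context_ids, question_ids, max_len=MAX_LEN):
    """Trim the context so that context + question fits in the model limit.

    Keeps the question intact and cuts the context, since the question
    is usually short and must survive whole.
    """
    budget = max_len - len(question_ids)
    if budget < 0:
        raise ValueError("question alone exceeds the model limit")
    return context_ids[:budget], question_ids

# e.g. a 600-token context plus a 50-token question
ctx, q = fit_qa_inputs(list(range(600)), list(range(50)))
assert len(ctx) + len(q) <= 512
```

A smarter variant would trim at sentence boundaries or keep the passage most relevant to the question, but simple head-truncation is often enough to silence the warning.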