Repetitions after pre-training T5X

Hello, I am pre-training T5_1_1 with the t5x pretraining script on a large corpus of text for translation into Japanese. After training, I tried to translate a simple “Hello”, but the model repeats the Japanese “Hello” several times, output as escaped Unicode sequences. The number of repetitions matches the task feature lengths I have defined.

  1. Is there a setting I can tweak to reduce repetitions, similar to the repetition controls in CTranslate2?
  2. In the preprocessors for my training task, I append the EOS token automatically as follows:
    preprocessors=[
        seqio.preprocessors.tokenize,
        seqio.preprocessors.append_eos_after_trim,
    ],
  3. Any other tips on how to reduce repetitions?
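
For context on question 1: in t5x, decoding behavior is usually configured through gin overrides rather than runtime flags. Below is a sketch of the kind of override I have been experimenting with; the exact bindings (e.g. `decode_fn`, `decoding.beam_search.alpha`) are assumptions based on typical t5x infer/eval configs and should be checked against the t5x version in use:

```gin
# Sketch: gin overrides to adjust decoding (names assumed from
# common t5x configs; verify against your installed t5x version).
from t5x import decoding
from t5x import models

# Use beam search as the decode function for the model.
models.EncoderDecoderModel.decode_fn = @decoding.beam_search

# Length-penalty coefficient for beam search; shorter outputs are
# favored less as alpha grows.
decoding.beam_search.alpha = 0.6
```

I have not found a direct equivalent of CTranslate2-style repetition penalties exposed here, which is part of what I am asking about.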