While LongT5 uses the same tokenizer as T5, its paper (and GitHub code) uses the PEGASUS objective for pretraining. PEGASUS defines two mask tokens (MASK_1 and MASK_2), which are assigned token IDs 2 and 3 respectively.
Since LongT5 uses the same pretraining masking strategy, one would assume that token IDs 2 and 3 are the ones used for masking. However, in the LongT5 tokenizer these IDs map to '' and '▁'. Were those IDs changed when the model was ported to HF? If so, what are the mask tokens for the LongT5 models?
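A minimal sketch of how the ID-to-token mapping can be inspected with the `transformers` tokenizer API (the checkpoint name `google/long-t5-tglobal-base` is just one example; other LongT5 checkpoints should behave the same):

```python
from transformers import AutoTokenizer

# Load a LongT5 checkpoint (example: the tglobal-base variant).
tok = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")

# Inspect what token IDs 2 and 3 map to — in PEGASUS these would be
# MASK_1 and MASK_2.
print(tok.convert_ids_to_tokens([2, 3]))

# For comparison, list the special tokens the tokenizer actually registers.
print(tok.all_special_tokens)
```

The T5-style tokenizers also expose sentinel tokens (`<extra_id_0>`, `<extra_id_1>`, ...) at the top of the vocabulary, which is why it is unclear whether IDs 2 and 3 still play any masking role here.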