I want to further pre-train a T5 model, and while looking around I came across run_t5_mlm_flax.py on the main branch of the huggingface/transformers GitHub repository.
I am not sure which part of this huge script I need for my data preparation. I have an input file that I want to process, and I am trying to find the masking function that adds the extra ID (sentinel) tokens, e.g. `<extra_id_0>`.
Would it be better to run the whole script on my input, rather than trying to pull out just the data-processing part?
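To make clear what I mean by the masking step, here is a minimal sketch of my understanding of T5-style span corruption. It is not taken from the script: `span_corrupt` is a hypothetical helper, the spans are chosen by hand instead of sampled randomly, and it works on whitespace-split words rather than real tokenizer IDs.

```python
def span_corrupt(tokens, spans):
    """Replace each (start, end) span with a sentinel token.

    Hypothetical helper for illustration only: returns the corrupted
    input and the target that lists each sentinel followed by the
    tokens it replaced. Spans are assumed not to overlap.
    """
    inputs, targets = [], []
    cursor = 0
    for sentinel_index, (start, end) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{sentinel_index}>"
        # Keep the unmasked tokens before this span, then the sentinel.
        inputs.extend(tokens[cursor:start])
        inputs.append(sentinel)
        # The target gets the same sentinel plus the dropped tokens.
        targets.append(sentinel)
        targets.extend(tokens[start:end])
        cursor = end
    # Keep whatever follows the last masked span.
    inputs.extend(tokens[cursor:])
    return inputs, targets


tokens = "Thank you for inviting me to your party last week".split()
inputs, targets = span_corrupt(tokens, [(2, 4), (8, 9)])
print(" ".join(inputs))
print(" ".join(targets))
```

As far as I understand, the real script does the equivalent on token IDs, picks the spans at random, and also appends an end-of-sequence token, so this only shows the shape of the input/target pair I am trying to produce.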
Thank you, I would appreciate any help!