How to convert new t5x models to Hugging Face Transformers

Hey there, so I have pretrained a T5_1_1 base model using the t5x pretraining script, and I was looking to convert it to HF, but sadly I couldn't do it via the conversion script.

The config file for the model (config.gin) is in .gin format, whereas the conversion script expects the config to be in JSON format.

@patrickvonplaten @valhalla could you please let me know how to get past this?

Hey @StephennFernandes,

You will have to create the T5_1_1 config.json file yourself. You could simply try to use an existing matching config from the T5_1_1 models on the Hub: https://huggingface.co/models?search=t5-v1
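For example, if your pretraining kept the default T5 1.1 base hyperparameters, a minimal sketch would be to pull an existing config and save it next to your checkpoint (google/t5-v1_1-base is assumed to match your gin settings, so double-check the sizes):

from transformers import T5Config

# Load the stock T5 v1.1 base config from the Hub and write it out as
# config.json so the conversion step has a JSON config to work with
config = T5Config.from_pretrained("google/t5-v1_1-base")
config.save_pretrained("/path/to/your/t5x/checkpoint/dir")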

@patrickvonplaten thanks a ton for replying. I have trained my own SentencePiece model for my T5 model, and I believe the conversion script only converts the model weights. How do I convert the tokenizer to Hugging Face?

You should be able to just load it into PreTrainedTokenizerFast:

from transformers import PreTrainedTokenizerFast

# Load the tokenizer from the local directory, then push it to the Hub
tok = PreTrainedTokenizerFast.from_pretrained("/path/to/your/trained/tokenizer/directory")
tok.push_to_hub("your-username/your-tokenizer-repo")  # pass a repo ID, not a URL
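
If you only have the raw SentencePiece .model file rather than a saved tokenizer directory, one option (a rough sketch, assuming the file is named spiece.model and the sentencepiece package is installed) is to wrap it in T5TokenizerFast first:

from transformers import T5TokenizerFast

# Build a T5-style fast tokenizer directly from the SentencePiece model file,
# then save it so it can be loaded or pushed like any other HF tokenizer
tok = T5TokenizerFast(vocab_file="/path/to/spiece.model")
tok.save_pretrained("/path/to/tokenizer/dir")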

@patrickvonplaten I've pretrained the scalable mT5 base variant t5x/examples/scalable_t5/mt5/base.gin using t5x, but I wasn't able to find the appropriate config.json for the model.

I then tried several efficient-t5-base config.json files along with a couple of mT5 config.json files, but nothing worked for me.

Could you please tell me where I can find the appropriate config.json files for the scalable mT5 and ByT5 variants (small to xxl) to convert these models to Hugging Face?
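
For reference, the kind of thing I have been trying is to start from the stock mT5 base config and save it next to the checkpoint; a rough sketch (the hyperparameters come from google/mt5-base and are only my assumption of what the scalable base.gin uses):

from transformers import MT5Config

# Start from the stock mT5 base config and override anything my run changes;
# these values are assumptions and need to be checked against base.gin
config = MT5Config.from_pretrained("google/mt5-base")
config.vocab_size = 250112  # mT5 default; replace with your own SentencePiece vocab size
config.save_pretrained("/path/to/scalable_mt5_base/checkpoint/dir")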