Hey there, I have pretrained a T5_1_1 base model using the t5x pretraining script and was hoping to convert it to HF, but sadly I couldn't do it via the conversion script.
The config file for the model (config.gin) is in .gin format, while the conversion script expects the config file to be in JSON format.
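To illustrate the mismatch: a .gin file is a list of Python-style `binding = value` lines rather than JSON, so `json.load()` chokes on it. A minimal sketch below parses a couple of made-up bindings (the field names imitate t5x's style but are illustrative, not copied from a real config.gin):

```python
import json

# Illustrative bindings in the style of a t5x config.gin (not a real file)
gin_text = """\
# comment lines start with '#'
network.T5Config.emb_dim = 768
network.T5Config.num_heads = 12
"""

def parse_simple_gin(text):
    """Very naive parser for flat 'key = value' gin bindings (illustration only)."""
    bindings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip comments and whitespace
        if "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            bindings[key] = value  # values stay as raw strings here
    return bindings

bindings = parse_simple_gin(gin_text)
print(json.dumps(bindings, indent=2))
```

A real gin file also has imports, macros, and scoped bindings, so this sketch is only meant to show why the two formats are not interchangeable.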
@patrickvonplaten @valhalla could you please let me know how to get past this?
You will have to create the T5_1_1 config.json file yourself. You could simply try to use an existing matching config from the T5_1_1 models on the Hub: https://huggingface.co/models?search=t5-v1
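If you'd rather write the config.json by hand, a minimal sketch follows. The values mirror the `google/t5-v1_1-base` config on the Hub as I understand it; verify every field against your own training gin file before using this:

```python
import json

# Hyperparameters matching T5 1.1 base (verify against your own gin config!)
config = {
    "architectures": ["T5ForConditionalGeneration"],
    "model_type": "t5",
    "d_model": 768,
    "d_kv": 64,
    "d_ff": 2048,          # T5 1.1 base uses 2048, not the 3072 of the original t5-base
    "num_layers": 12,
    "num_decoder_layers": 12,
    "num_heads": 12,
    "vocab_size": 32128,
    "dropout_rate": 0.1,
    "layer_norm_epsilon": 1e-06,
    "feed_forward_proj": "gated-gelu",  # the key T5 1.1 change vs. original T5
    "tie_word_embeddings": False,       # T5 1.1 does not share input/output embeddings
    "is_encoder_decoder": True,
    "relative_attention_num_buckets": 32,
}

# Write it next to the checkpoint you want to convert
with open("config.json", "w") as f:
    json.dump(config, f, indent=2)
```

The conversion script should then pick up this config.json from the model directory.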
@patrickvonplaten thanks a ton for replying. I have trained my own SentencePiece model for my T5 model. I believe the conversion script only converts the model weights, so how do I convert the tokenizer to Hugging Face?
You should be able to just load it with PreTrainedTokenizerFast:
from transformers import PreTrainedTokenizerFast
tok = PreTrainedTokenizerFast.from_pretrained("/path/to/your/trained/tokenizer/directory")
@patrickvonplaten I've pretrained the scalable mT5 base variant
(t5x/examples/scalable_t5/mt5/base.gin) using t5x, but I wasn't able to find the appropriate config.json for the model.
I then tried several efficient-t5-base config.json files along with a couple of mT5 config.json files, but nothing worked for me.
Could you please tell me where I could find the appropriate config.json files for the scalable mT5 and ByT5 variants (small to xxl) to convert these models to Hugging Face?
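One way forward, if no ready-made config.json matches, is to translate the hyperparameters from your gin file into the field names the Hugging Face config expects. The sketch below shows a possible mapping; the gin-side names follow t5x's `network.T5Config` as I recall them, and the example values are hypothetical for an mT5-base-like model, so double-check both against your own base.gin:

```python
# Assumed mapping from t5x network.T5Config gin fields to HF config.json fields
GIN_TO_HF = {
    "emb_dim": "d_model",
    "num_heads": "num_heads",
    "head_dim": "d_kv",
    "mlp_dim": "d_ff",
    "num_encoder_layers": "num_layers",
    "num_decoder_layers": "num_decoder_layers",
    "vocab_size": "vocab_size",
    "dropout_rate": "dropout_rate",
}

# Hypothetical values for an mT5-base-like model (verify against your gin file)
gin_values = {
    "emb_dim": 768,
    "num_heads": 12,
    "head_dim": 64,
    "mlp_dim": 2048,
    "num_encoder_layers": 12,
    "num_decoder_layers": 12,
    "vocab_size": 250112,
    "dropout_rate": 0.1,
}

hf_config = {GIN_TO_HF[key]: value for key, value in gin_values.items()}
hf_config["model_type"] = "mt5"
print(hf_config)
```

With the fields translated like this, you can fill in a config.json for any size variant by reading the corresponding values out of its gin file.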