Convert mT5 to HF weights?

congcongwang · October 24, 2020, 7:32pm

Hi,

I have been attempting to convert the mT5 weights available here to HF weights for TFT5ForConditionalGeneration or T5ForConditionalGeneration. Any ideas on how to do this?

Zack · October 24, 2020, 7:52pm

I have a question: can this new model be used for summarization in languages other than English without fine-tuning it?

congcongwang · October 24, 2020, 8:57pm

If the model is pre-trained in a multi-task learning way, then the answer is yes. However, further fine-tuning on a specific downstream task such as summarization may give better performance. And yes, unlike the original T5, mT5 supports more than 100 languages (for example, T5Tokenizer cannot tokenize Chinese).

stykat · October 26, 2020, 1:04pm

Hi @congcongwang!

How have you been trying to do it, if I may ask?
There is a way of doing it for other models, as shown here, but T5 is not among them.

Thank you!

Jung · October 27, 2020, 4:42am

Hi guys,

It seems mT5 employs the T5.1.1 architecture (not the original T5 architecture), as you can see from the names T5-XL and T5-XXL instead of T5-3B and T5-11B.

In this case, HF doesn't have an implementation of T5.1.1 yet. Please see: https://github.com/huggingface/transformers/issues/6285

UPDATE (Nov 17, 2020): it will be released soon by the amazing Patrick, see https://github.com/huggingface/transformers/pull/8552

congcongwang · October 27, 2020, 9:18am

Thanks for the info. They are similar but different in some places.
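
For anyone wondering where the two architectures differ: one difference documented for T5.1.1 is the feed-forward block, which uses a gated-GELU activation with two input projections instead of a single ReLU projection. That is one reason a conversion written for the original T5 layout would not map the weights one-to-one. Below is a minimal pure-Python sketch of the two blocks. The function names and the `wi`, `wi_0`, `wi_1`, `wo` parameter names are only illustrative, and this is not the actual HF or Mesh TensorFlow implementation.

```python
import math


def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))


def matvec(w, x):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def ffn_relu(x, wi, wo):
    # Original T5 style: one input projection followed by ReLU.
    hidden = [max(0.0, h) for h in matvec(wi, x)]
    return matvec(wo, hidden)


def ffn_gated_gelu(x, wi_0, wi_1, wo):
    # T5.1.1 style: a GELU-activated projection gates a second, linear projection.
    gate = [gelu(h) for h in matvec(wi_0, x)]
    linear = matvec(wi_1, x)
    hidden = [g * l for g, l in zip(gate, linear)]
    return matvec(wo, hidden)


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    x = [1.0, -1.0]
    print(ffn_relu(x, identity, identity))
    print(ffn_gated_gelu(x, identity, identity, identity))
```

The gated block carries one extra weight matrix per layer, so a checkpoint in that layout has parameters with no counterpart in the original T5 feed-forward block.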

patrickvonplaten · November 17, 2020, 2:41pm

Improved T5 models (small to large) and mT5 models (small to large) are in the model hub. Will upload the 3b and 11b versions in the coming days.
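
With the checkpoints on the hub, no manual weight conversion should be needed. Here is a rough sketch of loading one through the Auto classes in `transformers`. The checkpoint id `google/mt5-small` is an assumption on my side (the exact ids are not spelled out in this thread), and `load_mt5` is just a hypothetical helper name. It needs `transformers`, `sentencepiece`, a backend such as PyTorch, and network access for the first download.

```python
# Assumed checkpoint id; it is not given in this thread, so check the model hub.
MT5_CHECKPOINT = "google/mt5-small"


def load_mt5(checkpoint: str = MT5_CHECKPOINT):
    """Load a tokenizer and seq2seq model for the given hub checkpoint."""
    # Imported lazily so this file can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_mt5()
    # Show the model class that the Auto class resolved to.
    print(type(model).__name__)
```

Keep in mind that a freshly loaded pre-trained checkpoint will most likely still need fine-tuning on the downstream task before its generations are useful.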
