How is T5 pretrained?

Hi all.

I’m creating a pretrained T5 model with:


How is this model pretrained? It seems to me that the model weights I get here were trained at least on the GLUE dataset (and probably others).

I’d like it to only be pretrained on C4. Are those weights around somewhere? How do I get a model pretrained that way?


This is quite a general question. You should find everything you need in their paper.

[1910.10683] Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Yes, I’ve read the paper. My question is about the Huggingface implementation.

This is the same model as the one released by the authors of that paper.
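
On the C4-only part of the question: as far as I know, the original checkpoints were pretrained on a multi-task mixture (unsupervised C4 plus supervised tasks), while the later T5 v1.1 checkpoints on the Hub are described in their model cards as pretrained on C4 only. Here is a minimal sketch of how one might load one of those, assuming `transformers` and `sentencepiece` are installed. The helper names `c4_only_checkpoint` and `load_c4_only_t5` are made up for illustration; only the `google/t5-v1_1-*` model IDs and the `from_pretrained` calls come from the library and the Hub.

```python
# Hub IDs of T5 v1.1 checkpoints. Per their model cards these were
# pretrained on C4 only, with no supervised tasks mixed in.
T5_V1_1_CHECKPOINTS = {
    "small": "google/t5-v1_1-small",
    "base": "google/t5-v1_1-base",
    "large": "google/t5-v1_1-large",
}


def c4_only_checkpoint(size: str = "small") -> str:
    """Return the Hub ID of the T5 v1.1 checkpoint for `size` (hypothetical helper)."""
    if size not in T5_V1_1_CHECKPOINTS:
        raise ValueError(
            f"unknown size {size!r}; expected one of {sorted(T5_V1_1_CHECKPOINTS)}"
        )
    return T5_V1_1_CHECKPOINTS[size]


def load_c4_only_t5(size: str = "small"):
    """Download and return (tokenizer, model) for a T5 v1.1 checkpoint."""
    # Imported lazily so c4_only_checkpoint works without transformers installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    name = c4_only_checkpoint(size)
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = T5ForConditionalGeneration.from_pretrained(name)
    return tokenizer, model
```

Calling `load_c4_only_t5("small")` downloads the weights on first use. Since these checkpoints have not seen any supervised task, they would need fine-tuning before being useful on something like GLUE.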