How to use the T5 architecture without pretrained weights

I would like to study the effect of pretraining, so I want to test the T5 model with and without pretrained weights. Using pretrained weights is straightforward, but I cannot figure out how to use the T5 architecture from Hugging Face without the weights.


You can instantiate a Hugging Face model in two ways:

  • from a config object, in which case all weights are randomly initialized
  • with the from_pretrained method, which loads pretrained weights.

For T5, you can instantiate it with randomly initialized weights as follows:

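from transformers import T5Config, T5ForConditionalGeneration

# Default hyperparameters; pass arguments (e.g. d_model, num_layers) to change the size
config = T5Config()
# Building the model from a config initializes all weights randomly
model = T5ForConditionalGeneration(config)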

Thank you so much! That worked perfectly!