Initializing T5Encoder model

arunwzd · May 14, 2022, 10:51pm

Hi, I want to get the final hidden state representations of a few sentences, so I decided to use T5EncoderModel, with code based on this example from Hugging Face: T5

When I initialize the T5EncoderModel, I get a warning saying: Some weights of T5EncoderModel were not initialized from the model checkpoint at t5-small and are newly initialized: ['encoder.embed_tokens.weight']. I am worried that this means the embedding matrix of size (vocab_size, model_hidden_size) is being newly initialized.

I want to get the best representations for sentences and if the embeddings were newly initialized, I guess the hidden_state representations will not be the best ones to use.

I want to know if:

1. The above warning actually means that a new embedding matrix is initialized
2. What's the best model/method to get sentence representations?

Thanks in advance.

arunwzd · June 8, 2022, 3:41pm

@sgugger can you help me understand the issue?

dblakely · June 20, 2022, 1:10pm

Hi,

It looks to me like the issue is that when you initialize a T5EncoderModel from t5-small, the loading code doesn’t know how to initialize the encoder’s embedding layer, because that layer is shared between the encoder and the decoder. In the code, it’s this variable. Because it’s shared, it’s first initialized in T5Model, not inside the encoder or decoder. So my guess is that inside the serialized t5-small model (the file on disk), the input embedding weights are saved only under T5Model.shared, and not under the encoder’s own embedding (encoder.embed_tokens, the key named in the warning). Then when you call T5EncoderModel.from_pretrained("t5-small"), the code looks for encoder.embed_tokens.weight in the checkpoint, but it doesn’t exist there.
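Whether the encoder really ends up with freshly initialized embeddings may depend on the library version, so it seems worth checking directly. Below is a minimal sketch of such a check. To run offline it uses a tiny, randomly initialized T5Config (the sizes are arbitrary assumptions, not those of t5-small) and a temporary directory in place of the Hub checkpoint; the same comparison could be made against t5-small itself.

```python
import tempfile

import torch
from transformers import T5Config, T5EncoderModel, T5Model

# Tiny random model so the sketch needs no download.
# These sizes are illustrative assumptions, not the t5-small values.
config = T5Config(
    vocab_size=64, d_model=16, d_kv=8, d_ff=32,
    num_layers=1, num_decoder_layers=1, num_heads=2,
)
full = T5Model(config)

with tempfile.TemporaryDirectory() as ckpt_dir:
    # Save a full encoder-decoder checkpoint to disk.
    full.save_pretrained(ckpt_dir)
    # Load only the encoder from that full checkpoint
    # (warnings about unused decoder weights may be printed).
    encoder = T5EncoderModel.from_pretrained(ckpt_dir)

# Compare the encoder's embedding matrix with the one in the full model.
embeddings_match = torch.equal(
    encoder.encoder.embed_tokens.weight, full.shared.weight
)
print("encoder embeddings match the full model:", embeddings_match)
```

If the comparison prints True, the embedding values were loaded from the checkpoint even if a warning was shown; if it prints False, they were re-initialized.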

However, I noticed that T5EncoderModel has a set_input_embeddings method. This feels a little hacky, but it seems that at least one way to solve the problem is to do something like:

from transformers import T5Model, T5EncoderModel

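# Load the full model, whose `shared` embedding holds the pretrained weights
model = T5Model.from_pretrained("t5-small")
# Load the encoder only (this is the call that emits the warning)
encoder = T5EncoderModel.from_pretrained("t5-small")
# Point the encoder at the pretrained embedding, then save an encoder-only checkpoint
encoder.set_input_embeddings(model.shared)
encoder.save_pretrained("encoder-checkpoint")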

# Loads without any warnings
encoder = T5EncoderModel.from_pretrained("encoder-checkpoint")

I’d be curious to learn if there’s a less circuitous solution. For one, there might be better checkpoints out there than t5-small, ones intended specifically for use as a T5 encoder.
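On the second question in the original post (sentence representations), one common approach, though not necessarily the best one, is to average the encoder's final hidden states over the non-padding tokens. The sketch below shows that pooling step only; `mean_pool` is a hypothetical helper name, and the small tensors are dummy stand-ins for a real `last_hidden_state` and `attention_mask`.

```python
import torch


def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors per sentence, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)  # (batch, seq, 1)
    summed = (last_hidden_state * mask).sum(dim=1)                   # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1.0)                          # avoid division by zero
    return summed / counts


# Dummy data: 2 sentences, 3 token positions, hidden size 2.
hidden = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]],
                       [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]]])
mask = torch.tensor([[1, 1, 0],   # last position of the first sentence is padding
                     [1, 1, 1]])

sentence_vectors = mean_pool(hidden, mask)
print(sentence_vectors.shape)
```

With a real model, `hidden` would be the `last_hidden_state` returned by the encoder and `mask` the `attention_mask` produced by the tokenizer when called with `padding=True` and `return_tensors="pt"`.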
