Hi, I want to get the final hidden-state representations of a few sentences, so I decided to use the T5EncoderModel
and based my code on this example from Hugging Face: T5
When I initialize the T5EncoderModel, I get this warning: Some weights of T5EncoderModel were not initialized from the model checkpoint at t5-small and are newly initialized: ['encoder.embed_tokens.weight']
I am worried that this means the embedding matrix of size (vocab_size, model_hidden_size) is being newly initialized.
I want the best possible sentence representations, and if the embeddings are newly initialized, I assume the resulting hidden-state representations will not be good ones to use.
I want to know:
- Does the above warning actually mean that a new embedding matrix is being initialized?
- What is the best model/method to get sentence representations?
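
The first question can be probed directly. Below is a minimal sketch, not the code from the linked example, assuming the `torch` and `transformers` packages are installed. The helper name `embeddings_are_tied` and the tiny config sizes are hypothetical, chosen so the sketch runs without downloading anything. It checks whether `encoder.embed_tokens.weight` uses the same storage as the model's `shared` embedding, and then reads out the final hidden states.

```python
import torch
from transformers import T5Config, T5EncoderModel


def embeddings_are_tied(model: T5EncoderModel) -> bool:
    """Return True if encoder.embed_tokens and the shared embedding use the same weight storage."""
    shared_weight = model.get_input_embeddings().weight
    encoder_weight = model.encoder.embed_tokens.weight
    return shared_weight.data_ptr() == encoder_weight.data_ptr()


# Tiny, randomly initialized config (hypothetical sizes) so this runs offline.
# For the real checkpoint, swap in:
#   model = T5EncoderModel.from_pretrained("t5-small")
config = T5Config(vocab_size=100, d_model=32, d_kv=8, d_ff=64, num_layers=2, num_heads=4)
model = T5EncoderModel(config)
model.eval()

print("embeddings tied:", embeddings_are_tied(model))

# Dummy token ids standing in for a tokenized sentence (batch of 1, 4 tokens).
input_ids = torch.tensor([[5, 6, 7, 1]])
with torch.no_grad():
    hidden = model(input_ids=input_ids).last_hidden_state

# One vector of size d_model per input token.
print(hidden.shape)
```

If the check returns True on the loaded `t5-small` model, the tensor named in the warning is an alias of `shared.weight` rather than a separate matrix, so whatever was loaded into `shared.weight` is also what the encoder uses.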
Thanks in advance.