Do we need to load a model twice to get embeddings and probabilities?

Hello, dream team!

I have fine-tuned a BERT model for sentence classification.

Everything works correctly but I need to both classify a sentence and extract its embeddings (based on the CLS token).

Right now I am doing something (likely) very inefficient which is to load the model twice:

myinput = 'huggingface is great but I am learning every day'

model_for_embeddings = TFAutoModel.from_pretrained(r"Z:\mymodel")

# get the embeddings, each of dimension 768
input_ids = tf.constant(tokenizer.encode(myinput))[None, :]
outputs = model_for_embeddings(input_ids)
outputs[0][0]

And now I also load the same model to get a classification prediction :sweat_smile:

model_for_classification = TFAutoModelForSequenceClassification.from_pretrained(r"Z:\mymodel")

encoding = tokenizer([myinput],
                     max_length=280,
                     truncation=True,
                     padding=True,
                     return_tensors="tf")
# forward pass
outputs = model_for_classification(encoding)
logits = outputs.logits
# transform to array with probabilities
probs = tf.nn.softmax(logits, axis=1).numpy()

This seems extremely inefficient. Can I load the model just once and do both tasks?
Thanks! :100:


Yes, you can get both outputs with a single forward pass. In HuggingFace Transformers, you can pass output_hidden_states=True when calling a model. This gives you both the logits (for classification) and the hidden states of all layers of the model.

from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained(r"Z:\mymodel")
model = TFAutoModelForSequenceClassification.from_pretrained(r"Z:\mymodel")

text = "hello world"
encoding = tokenizer(text, return_tensors="tf")

# forward pass
outputs = model(encoding, output_hidden_states=True)

# get the logits
logits = outputs.logits
# get the hidden states
hidden_states = outputs.hidden_states

Note that hidden_states is a tuple of tensors: it contains the output of the embedding layer as well as the hidden states of every layer. This means you can get the last layer's hidden states as hidden_states[-1].
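To get both of the things you asked for from this single pass, the indexing looks like the sketch below. NumPy arrays stand in for the real outputs.logits and outputs.hidden_states here, with BERT-base shapes assumed (12 layers plus the embedding layer, hidden size 768, 2 labels), so the slicing is explicit:

```python
import numpy as np

# Stand-ins for outputs.hidden_states and outputs.logits:
# 13 tensors (embedding layer + 12 layers), batch 1, seq len 10, hidden 768
rng = np.random.default_rng(0)
hidden_states = tuple(rng.standard_normal((1, 10, 768)) for _ in range(13))
logits = rng.standard_normal((1, 2))

# CLS embedding: first token of the last layer's hidden states
cls_embedding = hidden_states[-1][:, 0, :]   # shape (1, 768)

# class probabilities: softmax over the logits
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
```

With the real model outputs, cls_embedding plays the role of outputs[0][0] in your two-model version, and probs replaces the separate tf.nn.softmax call.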


thanks @nielsr , this is super helpful!

@nielsr I hope all is well. May I follow up on this? What would be the equivalent of what you suggested, in PyTorch?

tensorflow:

# forward pass
outputs = model(encoding, output_hidden_states=True)

# get the logits
logits = outputs.logits
# get the hidden states
hidden_states = outputs.hidden_states

Thanks!
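For what it's worth, the PyTorch API is essentially the same: use AutoModelForSequenceClassification, tokenize with return_tensors="pt", and pass output_hidden_states=True in the forward call. A minimal sketch of the tensor handling, with random tensors standing in for the real model outputs (BERT-base shapes assumed: 13 hidden-state tensors, hidden size 768, 2 labels):

```python
import torch

# With a PyTorch checkpoint, the transformers calls mirror the TF version:
#   encoding = tokenizer(text, return_tensors="pt")
#   with torch.no_grad():
#       outputs = model(**encoding, output_hidden_states=True)
# Random tensors stand in for outputs.logits / outputs.hidden_states below.
logits = torch.randn(1, 2)
hidden_states = tuple(torch.randn(1, 10, 768) for _ in range(13))

probs = torch.softmax(logits, dim=1)        # class probabilities
last_hidden = hidden_states[-1]             # last layer, shape (1, 10, 768)
cls_embedding = last_hidden[:, 0, :]        # CLS token embedding, (1, 768)
```

The torch.no_grad() context is worth keeping for inference, since it avoids building the autograd graph.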