Hello, dream team!
I have fine-tuned a BERT model for sentence classification.
Everything works correctly but I need to both classify a sentence and extract its embeddings (based on the CLS
token).
Right now I am doing something (likely) very inefficient which is to load the model twice:
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModel
myinput = 'huggingface is great but I am learning every day'
tokenizer = AutoTokenizer.from_pretrained(r"Z:\mymodel")
model_for_embeddings = TFAutoModel.from_pretrained(r"Z:\mymodel")
# get the token embeddings, each of dimension 768
input_ids = tf.constant(tokenizer.encode(myinput))[None, :]
outputs = model_for_embeddings(input_ids)
outputs[0][0]  # last_hidden_state for the sentence; the CLS vector is outputs[0][0][0]
And then I load the same model a second time to get a classification prediction:
from transformers import TFAutoModelForSequenceClassification
model_for_classification = TFAutoModelForSequenceClassification.from_pretrained(r"Z:\mymodel")
encoding = tokenizer([myinput],
                     max_length=280,
                     truncation=True,
                     padding=True,
                     return_tensors="tf")
# forward pass
outputs = model_for_classification(encoding)
logits = outputs.logits
# transform to array with probabilities
probs = tf.nn.softmax(logits, axis=1).numpy()
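(For anyone skimming: that last softmax step just normalizes each row of logits into probabilities that sum to 1. A minimal NumPy illustration with made-up logit values:)

```python
import numpy as np

# toy logits for one sentence and two classes (illustrative values only)
logits = np.array([[2.0, 0.5]])

# softmax: exponentiate, then normalize each row to sum to 1
# (subtracting the row max first is the standard numerical-stability trick)
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exp / exp.sum(axis=1, keepdims=True)
```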
This seems extremely inefficient. Can I load the model just once and do both tasks?
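If it helps, here is the kind of single-load setup I am hoping for (an untested sketch: I am assuming that passing output_hidden_states=True to from_pretrained makes the classification model also return the encoder's hidden states, so the CLS embedding can be read off the last layer):

import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained(r"Z:\mymodel")
# load once, asking the model to also return every layer's hidden states
model = TFAutoModelForSequenceClassification.from_pretrained(r"Z:\mymodel",
                                                             output_hidden_states=True)

encoding = tokenizer(['huggingface is great but I am learning every day'],
                     max_length=280, truncation=True, padding=True, return_tensors="tf")
outputs = model(encoding)

probs = tf.nn.softmax(outputs.logits, axis=1).numpy()  # classification probabilities
cls_embedding = outputs.hidden_states[-1][:, 0, :]     # CLS token of the last layer, shape (1, 768)

Is this the right way to do it, or is there a cleaner API for sharing one loaded model between the two tasks?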
Thanks!