Generating sentence embeddings from a pretrained transformer model

Hi, I have a pretrained BERT-based model hosted on Hugging Face.

How do I generate sentence vectors using this model? I have explored Sentence-BERT, but it doesn't allow you to use custom trained models. I have also looked at bert-as-service; it works, but for my current scenario I was wondering whether sentences could be converted to vectors without running a server.

I think the best approach is to go with Sentence-BERT. It actually does let you use your own model: just reproduce what they do in the paper.
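For instance, here's a minimal sketch of wrapping a custom checkpoint in sentence-transformers (assuming `pip install sentence-transformers`; the checkpoint name `my-username/my-bert-model` is a placeholder for your own model):

```python
from sentence_transformers import SentenceTransformer, models

# Load your own Hugging Face checkpoint as the word-embedding backbone.
word_embedding_model = models.Transformer("my-username/my-bert-model", max_seq_length=256)

# Add a pooling layer on top to turn token embeddings into one sentence vector.
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),
    pooling_mode="mean",
)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
embeddings = model.encode(["First sentence.", "Second sentence."])
print(embeddings.shape)  # (2, hidden_size)
```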
You add a pooling layer on top of your model's token outputs (the `models.Pooling` module in the snippet above). From the paper:

We experiment with three pooling strategies: Using the output of the CLS-token, computing the mean of all output vectors (MEAN-strategy), and computing a max-over-time of the output vectors (MAX-strategy). The default configuration is MEAN.
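If you'd rather skip the extra dependency entirely, the MEAN strategy is easy to reproduce with plain transformers and PyTorch. A rough sketch (the checkpoint name is again a placeholder):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("my-username/my-bert-model")
model = AutoModel.from_pretrained("my-username/my-bert-model")

sentences = ["First sentence.", "Second sentence."]
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    token_embeddings = model(**encoded).last_hidden_state  # (batch, seq_len, hidden)

# MEAN strategy: average the token vectors, using the attention mask
# so that padding tokens don't contribute to the sentence embedding.
mask = encoded["attention_mask"].unsqueeze(-1).float()      # (batch, seq_len, 1)
sentence_embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
```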

Finally, you might also want to fine-tune your model, so that the embeddings become meaningful under cosine similarity:

In order to fine-tune BERT / RoBERTa (your model), we create siamese and triplet networks (Schroff et al., 2015) to update the weights such that the produced sentence embeddings are semantically meaningful and can be compared with cosine-similarity.
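With sentence-transformers this is just its standard training loop. Here's a rough sketch of the siamese setup with cosine-similarity loss, reusing `model` from the first snippet; the sentence pairs and labels are made-up placeholders, and you'd supply your own data (for the triplet variant there is `losses.TripletLoss`):

```python
from torch.utils.data import DataLoader
from sentence_transformers import InputExample, losses

# Toy training pairs with similarity labels in [0, 1] -- placeholders only.
train_examples = [
    InputExample(texts=["A man is eating food.", "A man is eating a meal."], label=0.9),
    InputExample(texts=["A man is eating food.", "The weather is cold today."], label=0.1),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)

# Siamese setup: both sentences pass through the same model, and the loss
# pushes the cosine similarity of their embeddings toward the label.
train_loss = losses.CosineSimilarityLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=100)
```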

Hope this helps!