Which loss function is used for paraphrase-multilingual-MiniLM-L12-v2

JeanMaridor · May 31, 2022, 7:00am

Hello!

I would like to know which loss function was used to train this model, and how I could have found that out without asking here! The Hugging Face page for this model is rather sparse on details.

As I understand it from the SBERT paper, there are two possible training objectives. One directly uses the cosine similarity between the two sentence embeddings; the other concatenates the vectors of the two sentences and passes them to a softmax classifier over the classes 0, 1, -1.
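To make the two objectives concrete, here is a minimal sketch of how I read them, in plain Python with no libraries. It is only an illustration and not the actual training code of this model: the toy embeddings, the all-zero weight matrix `W`, and the function names are made up for the example. The classification objective follows the SBERT paper in concatenating `u`, `v` and `|u - v|` before the softmax layer, and the classes are indexed 0 to 2 here rather than 0, 1, -1.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def regression_loss(u, v, target):
    """Objective 1: squared error between cos(u, v) and a gold similarity score."""
    return (cosine_similarity(u, v) - target) ** 2

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classification_loss(u, v, weights, label):
    """Objective 2: cross-entropy of softmax(W . [u, v, |u - v|]) for a class index."""
    features = list(u) + list(v) + [abs(a - b) for a, b in zip(u, v)]
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    probs = softmax(logits)
    return -math.log(probs[label])

# Toy 2-dimensional "sentence embeddings" (made-up values).
u = [1.0, 0.0]
v = [0.0, 1.0]

# Hypothetical untrained classifier: 3 classes x 6 features, all zeros.
W = [[0.0] * 6 for _ in range(3)]

print(regression_loss(u, v, target=0.0))
print(classification_loss(u, v, W, label=1))
```

In both cases the two sentences go through the same encoder; only the head on top of the embeddings and the loss differ.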

Which one was used to train this model?

Thank you in advance
