Which loss function is used to train paraphrase-multilingual-MiniLM-L12-v2?

I would like to know which loss function was used to train this model, and how I could have found that out without asking here. The Hugging Face page for this model is rather brief.

According to the SBERT paper, there seem to be two possible training objectives. One directly uses the cosine similarity between the embeddings of the two sentences; the other concatenates the vectors of the two sentences and uses a softmax classifier to assign the pair to one of the classes 0, 1, -1.
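To make my reading of the two objectives concrete, here is a minimal sketch in plain Python. It is only my understanding of the paper, not the library's actual code. The function names, the toy vectors, and the use of a class index 0..2 instead of the labels 0, 1, -1 are my own assumptions. I also assume, as I read the paper, that the classifier sees the concatenation (u, v, |u - v|) and that the cosine objective is trained with a squared error against a gold similarity score.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two sentence embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def regression_objective(u, v, gold_score):
    """Objective 1 (as I understand it): squared error between
    cos(u, v) and a gold similarity score for the sentence pair."""
    return (cosine_similarity(u, v) - gold_score) ** 2


def classification_objective(u, v, weights, label_index):
    """Objective 2 (as I understand it): softmax over W * (u, v, |u - v|),
    trained with cross-entropy. `weights` has one row per class;
    `label_index` is a hypothetical 0-based class index."""
    features = list(u) + list(v) + [abs(a - b) for a, b in zip(u, v)]
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    # Numerically stable softmax.
    max_logit = max(logits)
    exps = [math.exp(z - max_logit) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -math.log(probs[label_index])


# Toy 2-dimensional "embeddings", purely illustrative.
u = [1.0, 0.0]
v = [0.0, 1.0]
loss_regression = regression_objective(u, v, gold_score=0.5)

# Untrained classifier with all-zero weights: 3 classes, 6 input features.
zero_weights = [[0.0] * 6 for _ in range(3)]
loss_classification = classification_objective(u, v, zero_weights, label_index=0)
```

In both cases the sentence vectors u and v would come from the same encoder; only the head and the loss on top differ.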

Which one was used to train this model?

