Hi there,
I'm a new fan of Hugging Face (and of NLP / LLMs in general); all of this is brand new to me. I'm currently trying out the SentenceTransformers library.
I copied the SCRIPT below from the Semantic Textual Similarity page of the Sentence-Transformers documentation. The code generates an embedding for each sentence, but I only get the cosine_scores, not the similar-sentence content shown on the page. Can anybody give me some clues on how to do that? Thank you in advance.
SCRIPT:
from sentence_transformers import SentenceTransformer, util
model = SentenceTransformer('/Users/xxx/xxx/all-MiniLM-L6-v2')
sentences = ['The cat sits outside',
             'A man is playing guitar',
             'I love pasta',
             'The new movie is awesome',
             'The cat plays in the garden',
             'A woman watches TV',
             'The new movie is so great',
             'Do you like pizza?']
#Compute embeddings
embeddings = model.encode(sentences, convert_to_tensor=True)
#Compute cosine-similarities for each sentence with each other sentence
cosine_scores = util.cos_sim(embeddings, embeddings)
#Find the pairs with the highest cosine similarity scores
pairs = []
for i in range(len(cosine_scores) - 1):
    for j in range(i + 1, len(cosine_scores)):
        pairs.append({'index': [i, j], 'score': cosine_scores[i][j]})
#Sort scores in decreasing order
pairs = sorted(pairs, key=lambda x: x['score'], reverse=True)
for pair in pairs[0:10]:
    i, j = pair['index']
    print("{} \t\t {} \t\t Score: {:.4f}".format(sentences[i], sentences[j], pair['score']))
    print(pair)
RESULTS:
The new movie is awesome The new movie is so great Score: 0.9286
{'index': [3, 6], 'score': tensor(0.9286)}
… …
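For context, here is a minimal sketch of one way I thought the sentence pairs and their scores could be retrieved directly. This is not from the documentation page above; it assumes the same local model path and sentence list, and uses the util.paraphrase_mining helper from sentence-transformers:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('/Users/xxx/xxx/all-MiniLM-L6-v2')  # same local model path as above
sentences = ['The cat sits outside',
             'A man is playing guitar',
             'I love pasta',
             'The new movie is awesome',
             'The cat plays in the garden',
             'A woman watches TV',
             'The new movie is so great',
             'Do you like pizza?']

# paraphrase_mining encodes all sentences and returns [score, i, j] triples,
# already sorted by cosine similarity in decreasing order
pairs = util.paraphrase_mining(model, sentences)
for score, i, j in pairs[:5]:
    print("{} \t\t {} \t\t Score: {:.4f}".format(sentences[i], sentences[j], score))

Is something like this the intended way to get the similar sentences themselves, or is there a better approach?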