Sentence similarity discrimination power

tavakolih · September 15, 2022, 8:35pm

Hi,
As a beginner in this field, I have a hypothetical question. For the sentence similarity task, I expected that a better model would discriminate better between sentences. For instance, I have a particular sentence and would like to extract the most similar sentences out of 100 examples. A general model like all-MiniLM-L6-v2 gave me ten sentences with a score above 0.9, but after I fine-tuned it on the specific topic (50k samples), the discrimination power decreased, and now all 100 sentences have a score above 0.99.
Could you please let me know why that has happened?
Regards
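
To make "discrimination power" concrete, here is a minimal sketch of how I think of it: the spread between the highest and lowest cosine score for one query. The 3-d vectors are hypothetical toy values, not real model output, and `score_spread` and the constant `offset` are names I made up for illustration. Adding the same offset to every vector is only an assumed way to imitate scores bunching up near 1; it is not a claim about what fine-tuning actually did to my model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def score_spread(query, candidates):
    """Return scores sorted from best to worst, plus the max-min spread."""
    scores = sorted((cosine(query, c) for c in candidates), reverse=True)
    return scores, scores[0] - scores[-1]


# Hypothetical toy embeddings (assumption: stand-ins for real sentence vectors).
query = [1.0, 0.0, 0.0]
candidates = [[0.9, 0.1, 0.0], [0.5, 0.5, 0.0], [0.0, 1.0, 0.0]]

scores, spread = score_spread(query, candidates)

# Assumption for illustration only: shift every vector by the same large
# offset so that all of them point in nearly the same direction.
offset = [10.0, 10.0, 10.0]


def shift(v):
    return [a + b for a, b in zip(v, offset)]


collapsed_scores, collapsed_spread = score_spread(
    shift(query), [shift(c) for c in candidates]
)

print("original :", [round(s, 4) for s in scores], "spread", round(spread, 4))
print("collapsed:", [round(s, 4) for s in collapsed_scores],
      "spread", round(collapsed_spread, 4))
```

In this toy case every collapsed score is above 0.99 and the spread shrinks sharply, which is the pattern I see after fine-tuning. The candidates still come out in the same order, so the ranking may remain usable even when the absolute scores are no longer informative.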
