Sentence similarity discrimination power

As a beginner in this field, I have a hypothetical question. For the sentence similarity task, I expected that a better model would discriminate better between similar and dissimilar sentences. For instance, I have a particular sentence and would like to extract the most similar sentences out of 100 examples. A general model like all-MiniLM-L6-v2 gave me ten sentences with a score above 0.9. But when I fine-tuned it on my specific topic (50k samples), the discrimination power decreased, and now all 100 sentences have a score above 0.99.
Could you please let me know why that has happened?
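
To make the comparison concrete, here is a minimal sketch of how the "discrimination power" described above could be measured: count how many candidates score above a threshold and look at the spread between the highest and lowest score. It uses plain Python and made-up 2-D toy vectors in place of real model embeddings (an assumption for illustration only), and the helper names `cosine` and `score_summary` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def score_summary(query, candidates, threshold=0.9):
    """Summarize how well the scores separate the candidates."""
    scores = sorted((cosine(query, c) for c in candidates), reverse=True)
    return {
        "above_threshold": sum(s > threshold for s in scores),
        "spread": scores[0] - scores[-1],  # max score minus min score
        "scores": scores,
    }

# Toy vectors, NOT real embeddings: they only mimic the two situations.
query = [1.0, 0.0]
spread_out = [[1.0, 0.1], [0.5, 0.5], [0.0, 1.0]]     # like the general model
collapsed = [[1.0, 0.01], [1.0, 0.02], [1.0, 0.03]]   # like the fine-tuned model

general = score_summary(query, spread_out)
finetuned = score_summary(query, collapsed)
print(general["above_threshold"], round(general["spread"], 3))
print(finetuned["above_threshold"], round(finetuned["spread"], 3))
```

In the second case every candidate clears the threshold and the spread is close to zero, which matches what I see after fine-tuning: the ranking may still exist, but the absolute scores no longer separate the sentences.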