I am trying to extract the meaning of a sentence and compare it with the meaning of another. The normal transformer ‘similarity’ models tend to compare words in both sentences and claim similarity if both sentences have similar words.
I am trying to compare not word similarity, but meaning. As an example:
1 - Jack went to the movies
2 - Jack did not go to the movies.
These two sentences may be considered ‘similar’ because they use similar words, but they mean entirely different things. ChatGPT does understand the difference, and that is exactly what I am looking for.
I would be grateful for your guidance.
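To make the problem concrete, here is a minimal sketch of a purely word-based score (Jaccard overlap of word sets). It is not what any particular transformer model does internally; it only illustrates how shared words can dominate a score. The third sentence and the helper names `tokens` and `jaccard` are made up for this illustration.

```python
import re

def tokens(sentence):
    """Lowercase the sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", sentence.lower()))

def jaccard(a, b):
    """Word-overlap score: shared words divided by all distinct words."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0  # two empty sentences are treated as identical
    return len(ta & tb) / len(ta | tb)

s1 = "Jack went to the movies"
s2 = "Jack did not go to the movies."   # opposite meaning, mostly the same words
s3 = "Jack saw a film at the cinema"    # hypothetical paraphrase, mostly different words

score_negation = jaccard(s1, s2)
score_paraphrase = jaccard(s1, s3)
print(score_negation, score_paraphrase)  # 0.5 0.2
```

The negated sentence scores higher than the paraphrase, which is the opposite of what a meaning-based comparison should give.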
Were you able to find anything similar to this? I am working on a task to find the duplicate pairs, and a pre-trained model would speed things up significantly. Thanks!
While many models are trained for text generation, I was unable to find a model that was specifically trained for semantic comparison, although my search was by no means exhaustive. My suggestion would be to take a relatively large language model (LLM), summarize both texts, compare the summaries with a measure such as cosine similarity between their embeddings, and see whether that gives you what you need. If not, it would be much easier to fine-tune such an LLM than to find or build a dedicated model.
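The comparison step suggested above could be sketched roughly as follows. This is only an outline under assumptions: `embed` is a hypothetical callable that turns a text (for example, an LLM-produced summary) into a vector, and the `threshold` of 0.8 is an arbitrary placeholder that would need tuning on real duplicate pairs. Neither comes from a specific library.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]

def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_u * norm_v)

def same_meaning(
    text_a: str,
    text_b: str,
    embed: Callable[[str], Vector],  # hypothetical: text -> embedding vector
    threshold: float = 0.8,          # assumed value, tune on labelled pairs
) -> bool:
    """Treat two texts as duplicates if their embeddings are close enough."""
    return cosine_similarity(embed(text_a), embed(text_b)) >= threshold
```

Whether this separates "went" from "did not go" depends entirely on the embedder plugged in as `embed`. If the scores for negated pairs stay high, that is the point at which fine-tuning on labelled pairs becomes the next thing to try.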