How to compare the meaning of documents

nader01 · May 26, 2023, 5:45pm

I am trying to extract the meaning of a sentence and compare it with the meaning of another. The normal transformer ‘similarity’ models tend to compare words in both sentences and claim similarity if both sentences have similar words.

I am trying to compare meaning, not word similarity. As an example:

1 - Jack went to the movies
2 - Jack did not go to the movies.

These two sentences may be considered ‘similar’ because they use similar words, but they mean entirely different things. ChatGPT does understand the difference, and that is exactly what I am looking for.

I would be grateful for your guidance.
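To make the problem concrete, here is a minimal sketch that measures only word overlap between the two example sentences. It is not how a transformer similarity model scores the pair; the helper names `word_set` and `jaccard_overlap` are made up for this illustration, and Jaccard overlap is just one simple lexical measure.

```python
import re


def word_set(sentence: str) -> set[str]:
    """Lowercase the sentence and return the set of words it contains."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


def jaccard_overlap(a: str, b: str) -> float:
    """Shared words divided by all distinct words across both sentences."""
    words_a, words_b = word_set(a), word_set(b)
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


s1 = "Jack went to the movies"
s2 = "Jack did not go to the movies."

shared = sorted(word_set(s1) & word_set(s2))
overlap = jaccard_overlap(s1, s2)
print(f"shared words: {shared}")
print(f"Jaccard overlap: {overlap:.2f}")
```

Four of the five words in the first sentence also appear in the second, so any score driven mostly by shared vocabulary will rate the pair as close even though the second sentence negates the first.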

ParasRupani · April 3, 2024, 3:46am

Were you able to find anything similar to this? I am working on a task to find the duplicate pairs, and a pre-trained model would speed things up significantly. Thanks!

nader01 · April 3, 2024, 12:24pm

While many models are trained for text generation, I was unable to find one that was specifically trained for semantic comparison, although my search was by no means exhaustive. My suggestion would be to take a relatively large language model (LLM), summarize both texts with it, compare the summaries with a measure such as cosine similarity, and see whether that gives you what you need. If it does not, fine-tuning such an LLM would be much easier than either finding a purpose-built model or building one from scratch.
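The suggestion above could be sketched roughly as follows. This is only an outline under stated assumptions: `summarize` and `embed` are hypothetical callables that you would back with whatever LLM and embedding model you choose, and the toy stand-ins below exist only so the snippet runs on its own. They do not capture meaning.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def compare_meaning(
    text_a: str,
    text_b: str,
    summarize: Callable[[str], str],  # hypothetical: wraps an LLM call
    embed: Callable[[str], Vector],   # hypothetical: wraps an embedding model
) -> float:
    """Summarize both texts, embed the summaries, and compare the vectors."""
    vec_a = embed(summarize(text_a))
    vec_b = embed(summarize(text_b))
    return cosine_similarity(vec_a, vec_b)


# Toy stand-ins so the sketch is runnable; replace them with real model calls.
def toy_summarize(text: str) -> str:
    return text.strip()


def toy_embed(text: str) -> list[float]:
    # Letter-frequency vector: purely lexical, NOT a semantic embedding.
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


score = compare_meaning(
    "Jack went to the movies",
    "Jack did not go to the movies.",
    toy_summarize,
    toy_embed,
)
print(f"similarity: {score:.3f}")
```

Passing the model calls in as arguments keeps the comparison logic separate from the choice of model, so you can try it with a pre-trained model first and swap in a fine-tuned one later if negated pairs like the one above still score too high.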
