Basic question on cosine similarity

Hi there! I am new to NLP techniques. I am currently using the cosine_similarity() function from the sklearn package in Python, and I am wondering whether it takes sentence structure and sentence length into account. I can't find any documentation on this topic. Could someone explain how these characteristics are taken into account when the cosine similarity is calculated? Do sentence embeddings tend to be very different when the grammatical construction of the sentences differs? If so, why? I would greatly appreciate your answer :smiley:
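
For context, here is a minimal sketch of the kind of call I mean. The three vectors are made-up stand-ins for sentence embeddings (an assumption for illustration; in practice they would come from whatever embedding model is used). As far as I can tell, cosine_similarity() only ever receives these numeric vectors and never the original sentences.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical "sentence embeddings": made-up numbers, not output of a real model.
emb_a = np.array([[1.0, 2.0, 3.0]])
emb_b = np.array([[2.0, 4.0, 6.0]])  # same direction as emb_a, twice the magnitude
emb_c = np.array([[3.0, 2.0, 1.0]])  # same magnitude as emb_a, different direction

# cosine_similarity expects 2D arrays of shape (n_samples, n_features)
# and returns an (n_samples_X, n_samples_Y) matrix of similarities.
sim_ab = cosine_similarity(emb_a, emb_b)[0, 0]
sim_ac = cosine_similarity(emb_a, emb_c)[0, 0]

print(f"a vs b: {sim_ab:.4f}")
print(f"a vs c: {sim_ac:.4f}")
```

In this toy case the score seems to depend only on the angle between the vectors and not on their magnitude, so I assume anything about structure or length would have to be encoded in the embeddings themselves.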
