Is cosine similarity the only way to measure similarity?

In most word embedding models like word2vec or GloVe, we use cosine similarity when we want to evaluate the model. However, suppose we have two vectors pointing in the same direction but with different magnitudes, such as [10,0] and [1,0]: under cosine similarity they are identical, but under another measure like Euclidean distance they are far apart.
What is the best way to measure the performance of an embedding model?
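A small sketch with numpy makes the contrast concrete (the vectors [10,0] and [1,0] are from the question; the helper function is just for illustration):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity: dot product normalized by the vectors' magnitudes,
    # so scaling either vector does not change the result.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([10.0, 0.0])
b = np.array([1.0, 0.0])

print(cosine_similarity(a, b))  # 1.0 — same direction, so maximally similar
print(np.linalg.norm(a - b))    # 9.0 — Euclidean distance sees them as far apart
```

So the two measures can disagree strongly: cosine only looks at direction, while Euclidean distance also accounts for magnitude.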


It depends on what property you are interested in.

A nice property of cosine similarity is its scale invariance, as the example you give demonstrates.

Yes, that is true. So how can I make sure that in feature space, two words which are not similar to each other end up as vectors in the same direction but with different magnitudes, like [1,0] and [10,0]? They could be placed anywhere in the space.

Many of the more than 20 distance measures defined in scipy can probably be converted to similarity measures easily. For cosine, for example (Y = cdist(XA, XB, 'cosine')), is 1 - cdist(XA, XB, 'cosine') the cosine similarity of XA and XB?
