Cosine Similarity

Cosine similarity measures the cosine of the angle between two multi-dimensional vectors. The smaller the angle, the higher the cosine similarity. Unlike Euclidean distance, cosine similarity captures the orientation of the vectors that represent the documents, not their magnitude.

For example, if a word appears 30 times in one document and 5 times in another, Euclidean distance places the two documents far apart. If the words occur in similar proportions in both documents, however, their vectors point in nearly the same direction, so cosine similarity finds a small angle between them and still registers the documents as similar.
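
To make the contrast concrete, here is a small sketch in Python. The word counts are hypothetical values chosen for illustration (the first word appears 30 and 5 times, as in the example above, and a second word is assumed to appear 6 times and 1 time), and the names doc_a and doc_b are not from any library.

```python
import math

# Hypothetical counts of two words in each document (illustrative values only).
doc_a = (30, 6)   # longer document
doc_b = (5, 1)    # shorter document, same 5:1 proportion of the two words

# Euclidean distance looks at how far apart the two points are.
euclidean = math.dist(doc_a, doc_b)

# Cosine similarity looks only at the angle between the two vectors.
dot = doc_a[0] * doc_b[0] + doc_a[1] * doc_b[1]
cosine = dot / (math.hypot(*doc_a) * math.hypot(*doc_b))

print(f"Euclidean distance: {euclidean:.2f}")
print(f"Cosine similarity:  {cosine:.2f}")
```

Because the two count vectors differ only in scale, the Euclidean distance is large while the cosine similarity is at its maximum.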

To use this method, first convert the two objects into vectors (A and B), then compute the cosine similarity (CS) with the formula CS = (A · B) / (||A|| ||B||), where A · B is the dot product of the two vectors and ||A|| and ||B|| are their magnitudes. The steps are:

  • Calculate the dot product between A and B.
  • Calculate the magnitude of vector A.
  • Calculate the magnitude of vector B.
  • Calculate the cosine similarity by dividing the dot product by the product of the two magnitudes.
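
The four steps above can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the function name cosine_similarity is chosen here for clarity, and raising an error for zero-length vectors is an assumption, since the formula is undefined when either magnitude is 0.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return (A . B) / (||A|| ||B||) for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")

    # Step 1: dot product of A and B.
    dot = sum(x * y for x, y in zip(a, b))

    # Step 2: magnitude of A.
    norm_a = math.sqrt(sum(x * x for x in a))

    # Step 3: magnitude of B.
    norm_b = math.sqrt(sum(y * y for y in b))

    # Assumption: treat a zero vector as an error, because the angle is undefined.
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")

    # Step 4: divide the dot product by the product of the magnitudes.
    return dot / (norm_a * norm_b)
```

Calling cosine_similarity([3, 4], [3, 4]) gives 1 for identical directions, while perpendicular vectors such as [1, 0] and [0, 1] give 0. For large arrays, a vectorized library routine would normally be preferable to this loop-based version.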
Euclidean Distance & Cosine Similarity