Cosine similarity measures the cosine of the angle between two multi-dimensional vectors. The smaller the angle, the higher the cosine similarity. Unlike Euclidean distance, cosine similarity captures the orientation of the vectors that represent the documents, not their magnitude.
For example, if a word appears 30 times in one document and 5 times in another, Euclidean distance places the two documents far apart because the raw counts differ so much. Cosine similarity compares only the direction of the two vectors, so if the documents use their words in similar proportions, the angle between them is small and they are judged similar.
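The contrast can be illustrated with a small sketch. The term counts below are made up for illustration: `doc_a` uses each of three words six times as often as `doc_b`, so the two vectors point in the same direction but differ in length. The names `doc_a`, `doc_b`, and `cosine` are assumptions, not part of any library.

```python
import math

# Hypothetical counts of the same three words in two documents.
# doc_a uses every word six times as often as doc_b.
doc_a = [30, 12, 6]
doc_b = [5, 2, 1]

def cosine(a, b):
    # Dot product divided by the product of the two magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    mag_a = math.sqrt(sum(x * x for x in a))
    mag_b = math.sqrt(sum(y * y for y in b))
    return dot / (mag_a * mag_b)

# Straight-line distance between the two count vectors.
euclidean = math.sqrt(sum((x - y) ** 2 for x, y in zip(doc_a, doc_b)))
similarity = cosine(doc_a, doc_b)

print(f"Euclidean distance: {euclidean:.2f}")
print(f"Cosine similarity:  {similarity:.2f}")
```

With these counts the Euclidean distance is large (about 27.4), while the cosine similarity is 1 up to floating-point rounding, because the two vectors differ only in length.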
To use this method, first convert the two objects into vectors (A and B), then compute the cosine similarity (CS) with the formula CS = (A · B) / (||A|| ||B||), where A · B is the dot product of the two vectors and ||A|| and ||B|| are their magnitudes. The steps are:
- Calculate the dot product between A and B.
- Calculate the magnitude of vector A.
- Calculate the magnitude of vector B.
- Calculate the cosine similarity by dividing the dot product by the product of the two magnitudes.
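The four steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `cosine_similarity` and the sample vectors `A` and `B` are assumptions, and the checks for mismatched lengths and zero vectors are an added safeguard, since the formula is undefined when either magnitude is zero.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")

    # Step 1: dot product between A and B.
    dot = sum(x * y for x, y in zip(a, b))
    # Step 2: magnitude of vector A.
    mag_a = math.sqrt(sum(x * x for x in a))
    # Step 3: magnitude of vector B.
    mag_b = math.sqrt(sum(y * y for y in b))

    if mag_a == 0 or mag_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")

    # Step 4: cosine similarity.
    return dot / (mag_a * mag_b)

# Illustrative vectors (not from the text).
A = [1, 2, 3]
B = [4, 5, 6]
print(cosine_similarity(A, B))
```

For these sample vectors the dot product is 32 and the magnitudes are the square roots of 14 and 77, so the result is roughly 0.97. Perpendicular vectors such as `[1, 0]` and `[0, 1]` give 0.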