Understanding Vectors

Vectors have dimensionality and a direction. For example, the following image depicts a two-dimensional vector in the Cartesian coordinate system, pictured as an arrow.

The head of the vector $\vec{a}$ is at the point $(a_1, a_2)$. The x coordinate value is $a_1$ and the y coordinate value is $a_2$. The coordinates are also referred to as the components of the vector.

Similarity

Several mathematical formulas can be used to determine if two vectors are similar. One of the most intuitive to visualize and understand is cosine similarity. Consider the following images that show three sets of graphs:

The vectors $\vec{A}$ and $\vec{B}$ are considered similar when they point in nearly the same direction, as in the first diagram. They are considered unrelated when they are perpendicular to each other, and opposite when they point away from each other.

The angle between them, $\theta$, is a good measure of their similarity. How can the angle $\theta$ be computed?

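(Figure: Pythagorean triangle)

We are all familiar with the Pythagorean Theorem, which relates the legs $a$ and $b$ of a right triangle to its hypotenuse $c$:

$$a^2 + b^2 = c^2$$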

What about when the angle between a and b is not 90 degrees?

Enter the Law of Cosines.

Law of Cosines

$$a^2 + b^2 - 2ab\cos\theta = c^2$$
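As a quick numeric check, the Law of Cosines can be rearranged to recover the angle between two sides from the three side lengths. The following is a minimal sketch; the function name `angle_from_sides` is hypothetical and chosen only for illustration.

```python
import math


def angle_from_sides(a: float, b: float, c: float) -> float:
    """Return the angle in degrees between sides a and b, where c is the opposite side.

    Hypothetical helper for illustration only.
    """
    # Law of Cosines rearranged: cos(theta) = (a^2 + b^2 - c^2) / (2ab)
    cos_theta = (a * a + b * b - c * c) / (2 * a * b)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


right_angle = angle_from_sides(3, 4, 5)  # the 3-4-5 right triangle
equilateral = angle_from_sides(1, 1, 1)  # all sides equal
```

For the 3-4-5 triangle the cosine term vanishes, so the formula reduces to the Pythagorean Theorem and the angle is 90 degrees; for the equilateral triangle the angle is 60 degrees.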

The following image shows this approach as a vector diagram:

(Figure: Law of Cosines as a vector diagram)

The magnitude of a vector $\vec{A}$ is defined in terms of its components as:

Magnitude

$$\vec{A} \cdot \vec{A} = \|\vec{A}\|^2 = A_1^2 + A_2^2$$
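The magnitude definition (the square root of the sum of the squared components) can be sketched in a few lines. The function name `magnitude` is an assumption made for this example.

```python
import math
from typing import Sequence


def magnitude(v: Sequence[float]) -> float:
    """Return the length of vector v: the square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in v))


# A two-dimensional vector with components 3 and 4 has length 5.
length = magnitude([3.0, 4.0])
```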

The dot product between two vectors $\vec{A}$ and $\vec{B}$ is defined in terms of their components as:

Dot Product

$$\vec{A} \cdot \vec{B} = A_1 B_1 + A_2 B_2$$
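The dot product multiplies matching components and adds the results. A minimal sketch follows; the function name `dot` is hypothetical, and the length check is an added safeguard rather than part of the definition.

```python
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the dot product of a and b: the sum of products of matching components."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    return sum(x * y for x, y in zip(a, b))


product = dot([1.0, 2.0], [3.0, 4.0])        # 1*3 + 2*4
perpendicular = dot([1.0, 0.0], [0.0, 1.0])  # the axes are perpendicular
```

Perpendicular vectors have a dot product of zero, which is what makes the dot product useful for measuring the angle between vectors.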

Rewriting the Law of Cosines with vector magnitudes and dot products gives the following:

Law of Cosines in Vector form

$$\|\vec{A}\|^2 + \|\vec{B}\|^2 - 2\|\vec{A}\|\|\vec{B}\|\cos\theta = \|\vec{C}\|^2$$

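Replacing $\|\vec{C}\|^2$ with $\|\vec{B} - \vec{A}\|^2$ gives the following:

Law of Cosines in Vector form only in terms of $\vec{A}$ and $\vec{B}$

$$\|\vec{A}\|^2 + \|\vec{B}\|^2 - 2\|\vec{A}\|\|\vec{B}\|\cos\theta = \|\vec{B} - \vec{A}\|^2$$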

Expanding this out gives us the formula for Cosine Similarity.
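The intermediate steps of the expansion can be written out as follows, assuming the third side of the triangle is the difference vector $\vec{B} - \vec{A}$ and using the fact that a vector dotted with itself equals its squared magnitude:

```latex
\begin{aligned}
\|\vec{B} - \vec{A}\|^2
  &= (\vec{B} - \vec{A}) \cdot (\vec{B} - \vec{A}) \\
  &= \|\vec{A}\|^2 + \|\vec{B}\|^2 - 2\,\vec{A} \cdot \vec{B} \\[4pt]
\|\vec{A}\|^2 + \|\vec{B}\|^2 - 2\|\vec{A}\|\|\vec{B}\|\cos\theta
  &= \|\vec{A}\|^2 + \|\vec{B}\|^2 - 2\,\vec{A} \cdot \vec{B} \\[4pt]
\|\vec{A}\|\|\vec{B}\|\cos\theta
  &= \vec{A} \cdot \vec{B}
\end{aligned}
```

The squared magnitudes cancel on both sides, and dividing by $\|\vec{A}\|\|\vec{B}\|$ isolates $\cos\theta$.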

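Cosine Similarity

$$\text{similarity}(\vec{A}, \vec{B}) = \cos\theta = \frac{\vec{A} \cdot \vec{B}}{\|\vec{A}\|\|\vec{B}\|}$$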
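To connect the formula back to the three cases described earlier (similar, unrelated, and opposite), here is a minimal two-dimensional sketch. The function name `cosine_similarity_2d` and the sample vectors are assumptions made for illustration.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]


def cosine_similarity_2d(a: Vec2, b: Vec2) -> float:
    """Return cos(theta) for two non-zero two-dimensional vectors."""
    dot_product = a[0] * b[0] + a[1] * b[1]
    mag_a = math.hypot(a[0], a[1])
    mag_b = math.hypot(b[0], b[1])
    return dot_product / (mag_a * mag_b)


similar = cosine_similarity_2d((1.0, 2.0), (2.0, 4.0))     # same direction
unrelated = cosine_similarity_2d((1.0, 0.0), (0.0, 3.0))   # perpendicular
opposite = cosine_similarity_2d((1.0, 2.0), (-1.0, -2.0))  # reversed direction
```

The result is close to 1 for vectors pointing the same way, 0 for perpendicular vectors, and close to -1 for vectors pointing in opposite directions. The lengths of the vectors do not affect the result, only their directions.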

This formula also works in more than two or three dimensions, although such vectors are hard to visualize and can be visualized only to a limited extent. It is common for vectors in AI/ML applications to have hundreds or even thousands of dimensions.

The similarity function in higher dimensions, written in terms of the vector components, is shown below. It extends the two-dimensional definitions of Magnitude and Dot Product given previously to N dimensions by using summation notation.

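Cosine Similarity with vector components

$$\text{similarity}(\vec{A}, \vec{B}) = \cos\theta = \frac{\sum_{i=1}^{N} A_i B_i}{\sqrt{\sum_{i=1}^{N} A_i^2}\,\sqrt{\sum_{i=1}^{N} B_i^2}}$$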
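The N-dimensional formula translates almost directly into code, with each summation becoming a loop over the components. The sketch below is illustrative: the function name `cosine_similarity` and the error handling for mismatched lengths and zero vectors are assumptions, not part of the formula itself.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity of two N-dimensional vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    # Numerator: sum of A_i * B_i over all dimensions.
    dot_product = sum(x * y for x, y in zip(a, b))
    # Denominator: product of the two magnitudes.
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: treat the zero vector as an error, since the angle is undefined.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product / (norm_a * norm_b)


score = cosine_similarity([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

The same function works unchanged for vectors with hundreds or thousands of dimensions, since nothing in it depends on N.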

This is the key formula used in the simple implementation of a vector store and can be found in the SimpleVectorStore implementation.
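To illustrate how a simple vector store might use this formula, the following toy sketch ranks stored vectors by their cosine similarity to a query vector. It is not the SimpleVectorStore code: the function `top_k`, the document identifiers, and the sample vectors are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity of two non-zero vectors of equal length."""
    dot_product = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot_product / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    documents: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k stored entries most similar to the query, best match first.

    Hypothetical helper: a brute-force scan over every stored vector.
    """
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in documents.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical three-dimensional "embeddings" keyed by document id.
documents = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
results = top_k([1.0, 0.0, 0.0], documents, k=2)
```

A brute-force scan like this compares the query against every stored vector, which is reasonable for small collections.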