Class ElasticsearchVectorStore

java.lang.Object
org.springframework.ai.vectorstore.observation.AbstractObservationVectorStore
org.springframework.ai.vectorstore.elasticsearch.ElasticsearchVectorStore
All Implemented Interfaces:
Consumer<List<Document>>, DocumentWriter, VectorStore, org.springframework.beans.factory.InitializingBean

public class ElasticsearchVectorStore extends AbstractObservationVectorStore implements org.springframework.beans.factory.InitializingBean
Elasticsearch-based vector store implementation using the dense_vector field type.

The store uses an Elasticsearch index to persist vector embeddings along with their associated document content and metadata. The implementation leverages Elasticsearch's k-NN search capabilities for efficient similarity search operations.

Features:

  • Automatic schema initialization with configurable index creation
  • Support for multiple similarity functions: Cosine, L2 Norm, and Dot Product
  • Metadata filtering using Elasticsearch query strings
  • Configurable similarity thresholds for search results
  • Batch processing support with configurable strategies
  • Observation and metrics support through Micrometer

Basic usage example:


 ElasticsearchVectorStore vectorStore = ElasticsearchVectorStore.builder(restClient, embeddingModel)
     .initializeSchema(true)
     .build();

 // Add documents
 vectorStore.add(List.of(
     new Document("content1", Map.of("key1", "value1")),
     new Document("content2", Map.of("key2", "value2"))
 ));

 // Search with filters
 List<Document> results = vectorStore.similaritySearch(
     SearchRequest.builder()
         .query("search text")
         .topK(5)
         .similarityThreshold(0.7)
         .filterExpression("key1 == 'value1'")
         .build()
 );
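
The combined effect of a top-K limit and a similarity threshold can be pictured with a small standalone sketch. It is a conceptual illustration only, not the store's code: TopKThresholdSketch, Scored, and select are hypothetical names, and the sketch assumes scores where higher means more similar.

```java
import java.util.Comparator;
import java.util.List;

// Conceptual sketch only: all names here are hypothetical and not part of Spring AI.
final class TopKThresholdSketch {

    // A candidate hit paired with a similarity score (assumed: higher is more similar).
    record Scored(String id, double score) {}

    // Keeps candidates scoring at or above the threshold, best first, at most topK of them.
    static List<Scored> select(List<Scored> candidates, int topK, double threshold) {
        return candidates.stream()
            .filter(c -> c.score() >= threshold)
            .sorted(Comparator.comparingDouble(Scored::score).reversed())
            .limit(topK)
            .toList();
    }

    public static void main(String[] args) {
        List<Scored> candidates = List.of(
            new Scored("a", 0.90),
            new Scored("b", 0.65),
            new Scored("c", 0.80),
            new Scored("d", 0.75));
        // With topK = 2 and threshold = 0.7, "b" is dropped by the threshold
        // and "d" is dropped by the top-K limit.
        select(candidates, 2, 0.7).forEach(System.out::println);
    }
}
```

In the real store the ranking is performed by Elasticsearch; the sketch only shows how the two settings narrow a result set.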
 

Advanced configuration example:


 ElasticsearchVectorStoreOptions options = new ElasticsearchVectorStoreOptions();
 options.setIndexName("custom_vectors");
 options.setSimilarity(SimilarityFunction.dot_product);
 options.setDimensions(1536);

 ElasticsearchVectorStore vectorStore = ElasticsearchVectorStore.builder(restClient, embeddingModel)
     .options(options)
     .initializeSchema(true)
     .batchingStrategy(new TokenCountBatchingStrategy())
     .build();
 

Requirements:

  • Elasticsearch 8.0 or later
  • Index mapping with id (string), content (text), metadata (object), and embedding (dense_vector) fields
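
As a rough illustration of the mapping requirement above, the following sketch renders an index-mapping JSON body with the four fields. It rests on assumptions: IndexMappingSketch and mappingJson are hypothetical names, mapping the id field as "keyword" is one possible reading of "string", and the mapping the store creates when initializeSchema is enabled may differ.

```java
// Hypothetical helper, not part of Spring AI: renders a mapping body for illustration.
final class IndexMappingSketch {

    // Builds a mapping with id, content, metadata, and a dense_vector embedding field.
    // "keyword" for id is an assumption; dims and similarity are supplied by the caller.
    static String mappingJson(int dimensions, String similarity) {
        return """
            {
              "mappings": {
                "properties": {
                  "id":        { "type": "keyword" },
                  "content":   { "type": "text" },
                  "metadata":  { "type": "object" },
                  "embedding": {
                    "type": "dense_vector",
                    "dims": %d,
                    "index": true,
                    "similarity": "%s"
                  }
                }
              }
            }
            """.formatted(dimensions, similarity);
    }

    public static void main(String[] args) {
        // 1536 matches the dimension used in the advanced configuration example.
        System.out.println(mappingJson(1536, "cosine"));
    }
}
```

The dims value has to match the output dimension of the embedding model in use, otherwise indexing the vectors will fail.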

Similarity Functions:

  • cosine: Default, suitable for most use cases. Measures cosine similarity between vectors.
  • l2_norm: Euclidean distance between vectors. Lower values indicate higher similarity.
  • dot_product: Requires vectors normalized to unit length (e.g., OpenAI embeddings); for such vectors it is an optimized equivalent of cosine similarity.
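
The three measures can be compared with a small standalone sketch. It computes the raw mathematical quantities only; Elasticsearch derives its relevance scores from them with its own formulas, which are not reproduced here. SimilaritySketch and its methods are hypothetical names used for illustration.

```java
import java.util.Arrays;

// Illustrative math only: hypothetical class, not part of Spring AI or Elasticsearch.
final class SimilaritySketch {

    // Dot product: sum of element-wise products.
    static double dot(double[] a, double[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Euclidean length of a vector.
    static double norm(double[] a) {
        return Math.sqrt(dot(a, a));
    }

    // Cosine similarity: dot product divided by the product of the lengths.
    static double cosine(double[] a, double[] b) {
        return dot(a, b) / (norm(a) * norm(b));
    }

    // L2 (Euclidean) distance: smaller values mean the vectors are closer.
    static double l2Distance(double[] a, double[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Scales a non-zero vector to unit length.
    static double[] normalize(double[] a) {
        double n = norm(a);
        return Arrays.stream(a).map(x -> x / n).toArray();
    }

    private static void requireSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
    }

    public static void main(String[] args) {
        double[] a = {3.0, 4.0};
        double[] b = {4.0, 3.0};
        System.out.println("cosine      = " + cosine(a, b));
        System.out.println("l2 distance = " + l2Distance(a, b));
        // For unit-length vectors the dot product matches the cosine similarity.
        System.out.println("dot (unit)  = " + dot(normalize(a), normalize(b)));
    }
}
```

The last line of main shows why dot_product is a natural fit for normalized embeddings: once both vectors have unit length, the division in the cosine formula is unnecessary.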
Since:
1.0.0
Author:
Jemin Huh, Wei Jiang, Laura Trotta, Soby Chacko, Christian Tzolov, Thomas Vitale, Ilayaperumal Gopinathan, Jonghoon Park