Class Neo4jVectorStore

All Implemented Interfaces:
Consumer<List<Document>>, DocumentWriter, VectorStore, VectorStoreRetriever, org.springframework.beans.factory.InitializingBean

public class Neo4jVectorStore extends AbstractObservationVectorStore implements org.springframework.beans.factory.InitializingBean

Neo4j-based vector store implementation using Neo4j's vector search capabilities.

The store persists vector embeddings together with their associated document content and metadata, and queries them through Neo4j's vector index. The index is based on the HNSW (Hierarchical Navigable Small World) algorithm, which provides efficient approximate k-nearest-neighbor (k-NN) search.

Features:

  • Automatic schema initialization with configurable index creation
  • Support for multiple distance functions: Cosine and Euclidean
  • Metadata filtering through filter expressions that are translated into Cypher WHERE clauses
  • Configurable similarity thresholds for search results
  • Batch processing support with configurable strategies
  • Observation and metrics support through Micrometer
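
The interaction between the similarity threshold and the top-K limit can be illustrated with a small, self-contained sketch. This is not the store's implementation (the filtering happens inside the Neo4j query); ScoredHit and select are hypothetical names introduced only for illustration, and the sketch assumes that hits scoring at or above the threshold are kept.

```java
import java.util.Comparator;
import java.util.List;

public class TopKThresholdSketch {

    // Hypothetical holder for a search hit; not a Spring AI type.
    record ScoredHit(String id, double score) {}

    // Keep hits whose score is at least the threshold, order them best first,
    // and return at most topK of them.
    static List<ScoredHit> select(List<ScoredHit> hits, int topK, double threshold) {
        return hits.stream()
            .filter(h -> h.score() >= threshold)
            .sorted(Comparator.comparingDouble(ScoredHit::score).reversed())
            .limit(topK)
            .toList();
    }

    public static void main(String[] args) {
        List<ScoredHit> hits = List.of(
            new ScoredHit("a", 0.91),
            new ScoredHit("b", 0.65),
            new ScoredHit("c", 0.78));
        // With topK = 5 and threshold = 0.7, only "a" and "c" remain.
        System.out.println(select(hits, 5, 0.7));
    }
}
```

A high threshold can therefore return fewer than top-K results, while a low threshold leaves the top-K limit as the only bound.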

Basic usage example:


 Neo4jVectorStore vectorStore = Neo4jVectorStore.builder(driver, embeddingModel)
     .initializeSchema(true)
     .build();

 // Add documents
 vectorStore.add(List.of(
     new Document("content1", Map.of("key1", "value1")),
     new Document("content2", Map.of("key2", "value2"))
 ));

 // Search with filters
 List<Document> results = vectorStore.similaritySearch(
     SearchRequest.builder()
         .query("search text")
         .topK(5)
         .similarityThreshold(0.7)
         .filterExpression("key1 == 'value1'")
         .build()
 );
 

Advanced configuration example:


 Neo4jVectorStore vectorStore = Neo4jVectorStore.builder(driver, embeddingModel)
     .databaseName("neo4j")
     .distanceType(Neo4jDistanceType.COSINE)
     .embeddingDimension(1536)
     .label("CustomDocument")
     .embeddingProperty("vector")
     .indexName("custom-vectors")
     .initializeSchema(true)
     .batchingStrategy(new TokenCountBatchingStrategy())
     .build();
 

Requirements:

  • Neo4j 5.15 or later
  • Node schema with id (string), text (string), metadata (object), and embedding (vector) properties
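
For readers who manage the schema themselves instead of using initializeSchema(true), the following sketch shows roughly what a matching vector index definition looks like in Cypher. It only builds the statement text using the CREATE VECTOR INDEX syntax available in Neo4j 5.15 and later; the statement the store actually issues may differ. The class and method names are hypothetical, and the arguments are taken from the advanced configuration example above.

```java
import java.util.Locale;

public class VectorIndexCypherSketch {

    // Builds a CREATE VECTOR INDEX statement. Identifiers are backtick-quoted
    // but not otherwise escaped, so this is only suitable for trusted input.
    static String createIndexStatement(String indexName, String label,
            String embeddingProperty, int dimensions, String similarityFunction) {
        return String.format(Locale.ROOT,
            "CREATE VECTOR INDEX `%s` IF NOT EXISTS FOR (n:`%s`) ON (n.`%s`) "
                + "OPTIONS {indexConfig: {`vector.dimensions`: %d, "
                + "`vector.similarity_function`: '%s'}}",
            indexName, label, embeddingProperty, dimensions, similarityFunction);
    }

    public static void main(String[] args) {
        // Values mirror the advanced configuration example; adjust as needed.
        System.out.println(createIndexStatement(
            "custom-vectors", "CustomDocument", "vector", 1536, "cosine"));
    }
}
```

The configured dimension must match the size of the vectors produced by the embedding model, otherwise the index cannot serve those embeddings.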

Distance Functions:

  • cosine: Default, suitable for most use cases. Measures cosine similarity between vectors.
  • euclidean: Euclidean distance between vectors. Lower values indicate higher similarity.
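
The difference between the two functions can be seen in a minimal, self-contained sketch using the textbook definitions. It is independent of Neo4j and Spring AI; the class and method names are hypothetical, and the values Neo4j reports as similarity scores may be normalized differently from the raw quantities computed here.

```java
public class DistanceFunctionsSketch {

    // Cosine similarity: dot(a, b) / (|a| * |b|). Depends only on direction.
    static double cosineSimilarity(double[] a, double[] b) {
        requireSameLength(a, b);
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Euclidean distance: square root of the summed squared differences.
    static double euclideanDistance(double[] a, double[] b) {
        requireSameLength(a, b);
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    private static void requireSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
    }

    public static void main(String[] args) {
        double[] a = {1, 0};
        double[] sameDirection = {2, 0};
        double[] orthogonal = {0, 1};
        System.out.println(cosineSimilarity(a, sameDirection));
        System.out.println(euclideanDistance(a, sameDirection));
        System.out.println(cosineSimilarity(a, orthogonal));
        System.out.println(euclideanDistance(a, orthogonal));
    }
}
```

Vectors that point in the same direction have a cosine similarity of 1 regardless of their length, whereas their Euclidean distance still reflects the difference in magnitude. This is one reason cosine is a common default for text embeddings.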
Since:
1.0.0
Author:
Gerrit Meier, Michael Simons, Christian Tzolov, Thomas Vitale, Soby Chacko, Jihoon Kim