Class PgVectorStore

All Implemented Interfaces:
Consumer<List<Document>>, DocumentWriter, VectorStore, VectorStoreRetriever, org.springframework.beans.factory.InitializingBean

public class PgVectorStore extends AbstractObservationVectorStore implements org.springframework.beans.factory.InitializingBean
PostgreSQL-based vector store implementation using the pgvector extension.

The store persists vector embeddings, together with the associated document content and metadata, in a database table. By default this is the "vector_store" table in the "public" schema; both names can be changed through the builder (schemaName, vectorTableName).

Features:

  • Automatic schema initialization with configurable table and index creation
  • Support for different distance metrics: Cosine, Euclidean, and Inner Product
  • Flexible indexing options: HNSW (default), IVFFlat, or exact search (no index)
  • Metadata filtering using JSON path expressions
  • Configurable similarity thresholds for search results
  • Batch processing support with configurable batch sizes

Basic usage example:


 PgVectorStore vectorStore = PgVectorStore.builder(jdbcTemplate, embeddingModel)
     .dimensions(1536) // Optional: defaults to model dimensions or 1536
     .distanceType(PgDistanceType.COSINE_DISTANCE)
     .indexType(PgIndexType.HNSW)
     .build();

 // Add documents
 vectorStore.add(List.of(
     new Document("content1", Map.of("key1", "value1")),
     new Document("content2", Map.of("key2", "value2"))
 ));

 // Search with filters
 List<Document> results = vectorStore.similaritySearch(
     SearchRequest.builder()
         .query("search text")
         .topK(5)
         .similarityThreshold(0.7)
         .filterExpression("key1 == 'value1'")
         .build()
 );
 

Advanced configuration example:


 PgVectorStore vectorStore = PgVectorStore.builder(jdbcTemplate, embeddingModel)
     .schemaName("custom_schema")
     .vectorTableName("custom_vectors")
     .distanceType(PgDistanceType.NEGATIVE_INNER_PRODUCT)
     .removeExistingVectorStoreTable(true) // Drops the existing table on startup, including its data
     .initializeSchema(true)
     .maxDocumentBatchSize(1000)
     .build();
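
To make the effect of maxDocumentBatchSize concrete, the following is a minimal sketch of splitting a document list into batches of at most a given size. BatchSketch and partition are hypothetical names used for illustration; how the store batches documents internally is assumed here, not taken from its source.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; not the store's actual batching code. */
final class BatchSketch {

    // Split items into consecutive sublists holding at most maxBatchSize elements each.
    static <T> List<List<T>> partition(List<T> items, int maxBatchSize) {
        if (maxBatchSize <= 0) {
            throw new IllegalArgumentException("maxBatchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += maxBatchSize) {
            int end = Math.min(start + maxBatchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> docs = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            docs.add(i);
        }
        // With a batch size of 1000, 2500 items yield batches of 1000, 1000 and 500.
        for (List<Integer> batch : partition(docs, 1000)) {
            System.out.println(batch.size());
        }
    }
}
```

A smaller batch size means more round trips to the database with smaller statements; a larger one means fewer, heavier round trips.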
 

Database Requirements:

  • PostgreSQL with pgvector extension installed
  • Required extensions: vector, hstore, uuid-ossp
  • Table schema with id (uuid), content (text), metadata (json), and embedding (vector) columns
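
The requirements above can be sketched as the SQL a database administrator might run by hand when schema initialization is left disabled. SchemaSketch and its statements method are hypothetical helpers; the DDL issued by the class itself may differ in detail, and the names are concatenated without escaping, so the sketch assumes trusted input.

```java
import java.util.List;

/** Illustrative only: builds setup SQL matching the listed requirements. */
final class SchemaSketch {

    // Returns the extension and table statements for the given schema, table and dimension count.
    static List<String> statements(String schema, String table, int dimensions) {
        String qualified = schema + "." + table;
        return List.of(
            "CREATE EXTENSION IF NOT EXISTS vector",
            "CREATE EXTENSION IF NOT EXISTS hstore",
            // uuid-ossp provides uuid_generate_v4(), used below as the id default.
            "CREATE EXTENSION IF NOT EXISTS \"uuid-ossp\"",
            "CREATE TABLE IF NOT EXISTS " + qualified + " ("
                + "id uuid DEFAULT uuid_generate_v4() PRIMARY KEY, "
                + "content text, "
                + "metadata json, "
                + "embedding vector(" + dimensions + "))");
    }

    public static void main(String[] args) {
        // Defaults named in this description: public.vector_store with 1536 dimensions.
        for (String sql : statements("public", "vector_store", 1536)) {
            System.out.println(sql + ";");
        }
    }
}
```

The dimension count in the vector column must match the embedding model's output size, since pgvector rejects vectors of a different length.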

Distance Types:

  • COSINE_DISTANCE: Default, suitable for most use cases
  • EUCLIDEAN_DISTANCE: L2 distance between vectors
  • NEGATIVE_INNER_PRODUCT: Best performance for normalized vectors (e.g., OpenAI embeddings)
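
As a rough illustration of what the three distance types measure, the sketch below computes each one in plain Java. DistanceSketch is a hypothetical class, not part of the library; in the database the computation is done by pgvector's operators (<=> for cosine distance, <-> for L2 distance, <#> for negative inner product).

```java
/** Illustrative only: plain-Java versions of the three distance measures. */
final class DistanceSketch {

    // Dot product of two vectors of equal length.
    static double dot(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Euclidean length of a vector.
    static double norm(double[] a) {
        return Math.sqrt(dot(a, a));
    }

    // 1 - cos(angle): 0 for vectors pointing the same way, 2 for opposite directions.
    static double cosineDistance(double[] a, double[] b) {
        return 1.0 - dot(a, b) / (norm(a) * norm(b));
    }

    // Straight-line (L2) distance between the two points.
    static double euclideanDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Negated dot product, so that smaller still means "closer".
    static double negativeInnerProduct(double[] a, double[] b) {
        return -dot(a, b);
    }

    public static void main(String[] args) {
        double[] query = {1.0, 0.0};
        double[] near = {0.6, 0.8};  // unit length
        double[] far = {0.0, 1.0};   // unit length
        System.out.println(cosineDistance(query, near) < cosineDistance(query, far));
        System.out.println(euclideanDistance(query, near) < euclideanDistance(query, far));
        System.out.println(negativeInnerProduct(query, near) < negativeInnerProduct(query, far));
    }
}
```

For unit-length vectors all three measures rank candidates identically, because cosine distance equals 1 plus the negative inner product and the squared Euclidean distance equals twice the cosine distance. The inner product skips the normalization step, which is why it is the cheapest choice when embeddings are already normalized.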

Index Types:

  • HNSW: Default; better query performance than IVFFlat, but slower index builds and higher memory use
  • IVFFLAT: Faster index builds and lower memory use than HNSW, but lower query performance
  • NONE: Exact search without indexing
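
The index options above correspond to pgvector index definitions roughly as sketched below. IndexSketch, IndexKind and Metric are hypothetical names that stand in for the library's own enums; the operator class names (vector_cosine_ops, vector_l2_ops, vector_ip_ops) and index methods (hnsw, ivfflat) are pgvector's. The exact statement the store issues is assumed, not quoted.

```java
import java.util.Optional;

/** Illustrative only: maps an index choice and metric to a pgvector CREATE INDEX statement. */
final class IndexSketch {

    enum IndexKind { HNSW, IVFFLAT, NONE }

    enum Metric {
        COSINE("vector_cosine_ops"),
        EUCLIDEAN("vector_l2_ops"),
        INNER_PRODUCT("vector_ip_ops");

        final String operatorClass;

        Metric(String operatorClass) {
            this.operatorClass = operatorClass;
        }
    }

    // Returns empty for NONE, since exact search needs no index at all.
    static Optional<String> createIndexSql(String table, IndexKind kind, Metric metric) {
        if (kind == IndexKind.NONE) {
            return Optional.empty();
        }
        String method = (kind == IndexKind.HNSW) ? "hnsw" : "ivfflat";
        return Optional.of("CREATE INDEX ON " + table
            + " USING " + method + " (embedding " + metric.operatorClass + ")");
    }

    public static void main(String[] args) {
        for (IndexKind kind : IndexKind.values()) {
            System.out.println(kind + ": "
                + createIndexSql("public.vector_store", kind, Metric.COSINE).orElse("(no index)"));
        }
    }
}
```

The operator class has to match the distance type used at query time; an index built with vector_l2_ops is not used for cosine-distance queries.
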
Since:
1.0.0
Author:
Christian Tzolov, Josh Long, Muthukumaran Navaneethakrishnan, Thomas Vitale, Soby Chacko, Sebastien Deleuze, Jihoon Kim, YeongMin Song, Jonghoon Park