Couchbase

This section will walk you through setting up the CouchbaseSearchVectorStore to store document embeddings and perform similarity searches using Couchbase.

Couchbase is a distributed, JSON document database, with all the desired capabilities of a relational DBMS. Among other features, it allows users to query information using vector-based storage and retrieval.

Prerequisites

A running Couchbase instance. The following options are available:

  • Docker

  • Capella - Couchbase as a Service

  • Install Couchbase locally

  • Couchbase Kubernetes Operator

Auto-configuration

Spring AI provides Spring Boot auto-configuration for the Couchbase Vector Store. To enable it, add the following dependency to your project’s Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-couchbase-store-spring-boot-starter</artifactId>
</dependency>

or add the following to your Gradle build.gradle file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-couchbase-store-spring-boot-starter'
}
Couchbase vector search is only available starting with Couchbase version 7.6 and Java SDK version 3.6.0.
Refer to the Dependency Management section to add the Spring AI BOM to your build file.
Refer to the Repositories section to add Milestone and/or Snapshot Repositories to your build file.

The vector store implementation can initialize the configured bucket, scope, collection and search index for you, with default options, but you must opt in by specifying the initializeSchema boolean in the appropriate constructor or builder, or by setting the spring.ai.vectorstore.couchbase.initialize-schema property to true.

This is a breaking change! In earlier versions of Spring AI, this schema initialization happened by default.

Please have a look at the list of configuration parameters for the vector store to learn about the default values and configuration options.

Additionally, you will need a configured EmbeddingModel bean. Refer to the EmbeddingModel section for more information.

Now you can auto-wire the CouchbaseSearchVectorStore as a vector store in your application.

@Autowired VectorStore vectorStore;

// ...

List<Document> documents = List.of(
    new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
    new Document("The World is Big and Salvation Lurks Around the Corner"),
    new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));

// Add the documents to Couchbase
vectorStore.add(documents);

// Retrieve documents similar to a query
List<Document> results = vectorStore.similaritySearch(SearchRequest.query("Spring").withTopK(5));

Configuration Properties

To connect to Couchbase and use the CouchbaseSearchVectorStore, you need to provide access details for your instance. A simple configuration can either be provided via Spring Boot’s application.properties,

spring.ai.openai.api-key=<key>
spring.couchbase.connection-string=<conn_string>
spring.couchbase.username=<username>
spring.couchbase.password=<password>

environment variables,

export SPRING_COUCHBASE_CONNECTION_STRING=<couchbase connection string like couchbase://localhost>
export SPRING_COUCHBASE_USERNAME=<couchbase username>
export SPRING_COUCHBASE_PASSWORD=<couchbase password>
# API key if needed, e.g. OpenAI
export SPRING_AI_OPENAI_API_KEY=<api-key>

or a mix of both. For example, you can store your password in an environment variable and keep the rest in the plain application.yml file.

If you choose to create a shell script for ease in future work, be sure to run it prior to starting your application by "sourcing" the file, i.e. source <your_script_name>.sh.
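The environment variable names shown above follow a simple pattern relative to the property names: dots and dashes become underscores and the result is upper-cased. The sketch below only illustrates that naming pattern; EnvVarNames and toEnvVar are hypothetical names, not part of Spring Boot or Spring AI, and Spring Boot's relaxed binding may accept other forms as well.

```java
import java.util.Locale;

/** Illustrative helper (hypothetical, not a Spring Boot or Spring AI class). */
final class EnvVarNames {

    private EnvVarNames() {
    }

    // Converts a property name such as "spring.couchbase.username" into the
    // upper-case, underscore-separated form used for the exported variables.
    static String toEnvVar(String propertyName) {
        return propertyName
                .replace('.', '_')   // dots separate property segments
                .replace('-', '_')   // dashes separate words inside a segment
                .toUpperCase(Locale.ROOT);
    }
}
```

For example, EnvVarNames.toEnvVar("spring.couchbase.username") yields SPRING_COUCHBASE_USERNAME, and EnvVarNames.toEnvVar("spring.ai.openai.api-key") yields SPRING_AI_OPENAI_API_KEY.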

Spring Boot’s auto-configuration feature for the Couchbase Cluster will create a bean instance that will be used by the CouchbaseSearchVectorStore.

The Spring Boot properties starting with spring.couchbase.* are used to configure the Couchbase cluster instance:

| Property | Description | Default Value |
| --- | --- | --- |
| spring.couchbase.connection-string | A Couchbase connection string. | couchbase://localhost |
| spring.couchbase.password | Password for authentication with Couchbase. | - |
| spring.couchbase.username | Username for authentication with Couchbase. | - |
| spring.couchbase.env.io.minEndpoints | Minimum number of sockets per node. | 1 |
| spring.couchbase.env.io.maxEndpoints | Maximum number of sockets per node. | 12 |
| spring.couchbase.env.io.idleHttpConnectionTimeout | Length of time an HTTP connection may remain idle before it is closed and removed from the pool. | 1s |
| spring.couchbase.env.ssl.enabled | Whether to enable SSL support. Enabled automatically if a "bundle" is provided unless specified otherwise. | - |
| spring.couchbase.env.ssl.bundle | SSL bundle name. | - |
| spring.couchbase.env.timeouts.connect | Bucket connect timeout. | 10s |
| spring.couchbase.env.timeouts.disconnect | Bucket disconnect timeout. | 10s |
| spring.couchbase.env.timeouts.key-value | Timeout for operations on a specific key-value. | 2500ms |
| spring.couchbase.env.timeouts.key-value-durable | Timeout for operations on a specific key-value with a durability level. | 10s |
| spring.couchbase.env.timeouts.query | SQL++ query operations timeout. | 75s |
| spring.couchbase.env.timeouts.view | Regular and geospatial view operations timeout. | 75s |
| spring.couchbase.env.timeouts.search | Timeout for the search service. | 75s |
| spring.couchbase.env.timeouts.analytics | Timeout for the analytics service. | 75s |
| spring.couchbase.env.timeouts.management | Timeout for the management operations. | 75s |

Properties starting with the spring.ai.vectorstore.couchbase.* prefix are used to configure CouchbaseSearchVectorStore.

| Property | Description | Default Value |
| --- | --- | --- |
| spring.ai.vectorstore.couchbase.index-name | The name of the index to store the vectors. | spring-ai-document-index |
| spring.ai.vectorstore.couchbase.bucket-name | The name of the Couchbase bucket, parent of the scope. | default |
| spring.ai.vectorstore.couchbase.scope-name | The name of the Couchbase scope, parent of the collection. Search queries will be executed in the scope context. | default |
| spring.ai.vectorstore.couchbase.collection-name | The name of the Couchbase collection to store the Documents. | default |
| spring.ai.vectorstore.couchbase.dimensions | The number of dimensions in the vector. | 1536 |
| spring.ai.vectorstore.couchbase.similarity | The similarity function to use. | dot_product |
| spring.ai.vectorstore.couchbase.optimization | The index optimization to use. | recall |
| spring.ai.vectorstore.couchbase.initialize-schema | Whether to initialize the required schema. | false |

The following similarity functions are available:

  • l2_norm

  • dot_product

The following index optimizations are available:

  • recall

  • latency

See the Couchbase documentation on vector search for more details about each option.
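To give a feel for what the two similarity functions measure, here is a minimal sketch in plain Java. It assumes that dot_product is the sum of element-wise products (larger means more similar) and that l2_norm is the Euclidean distance between the vectors (smaller means more similar). Couchbase computes these scores inside the search index; SimilarityFunctions and its methods are hypothetical names used only for illustration.

```java
/** Illustrative only: not part of Spring AI or the Couchbase SDK. */
final class SimilarityFunctions {

    private SimilarityFunctions() {
    }

    // dot_product: sum of element-wise products of the two vectors.
    static double dotProduct(float[] a, float[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    // l2_norm: Euclidean distance, the square root of the summed squared differences.
    static double l2Distance(float[] a, float[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = (double) a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Both vectors must have the dimension configured for the index.
    private static void requireSameLength(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                    "Vector dimensions differ: " + a.length + " vs " + b.length);
        }
    }
}
```

The length check mirrors the dimensions property: every stored embedding and every query embedding has to have the same number of dimensions as the index.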

Metadata Filtering

You can leverage the generic, portable metadata filters with the Couchbase store.

For example, you can use either the text expression language:

vectorStore.similaritySearch(
    SearchRequest.defaults()
    .query("The World")
    .topK(TOP_K)
    .filterExpression("author in ['john', 'jill'] && article_type == 'blog'"));

or programmatically using the Filter.Expression DSL:

FilterExpressionBuilder b = new FilterExpressionBuilder();

vectorStore.similaritySearch(SearchRequest.defaults()
    .query("The World")
    .topK(TOP_K)
    .filterExpression(b.and(
        b.in("author","john", "jill"),
        b.eq("article_type", "blog")).build()));
These filter expressions are converted into the equivalent Couchbase SQL++ filters.
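To make the meaning of the example filter concrete, the following sketch evaluates the same condition, author in ['john', 'jill'] && article_type == 'blog', against a document's metadata map in plain Java. This is not how the store applies filters (the expression is translated and executed inside Couchbase); FilterSemantics and matches are hypothetical names used only to show which documents would pass.

```java
import java.util.Map;
import java.util.Set;

/** Illustrative only: spells out the semantics of the example filter. */
final class FilterSemantics {

    private static final Set<String> ALLOWED_AUTHORS = Set.of("john", "jill");

    private FilterSemantics() {
    }

    // Returns true when the metadata satisfies:
    //   author in ['john', 'jill'] && article_type == 'blog'
    static boolean matches(Map<String, Object> metadata) {
        Object author = metadata.get("author");
        // Set.of(...) rejects null lookups, so check for a missing key first.
        boolean authorMatches = author != null && ALLOWED_AUTHORS.contains(author);
        boolean typeMatches = "blog".equals(metadata.get("article_type"));
        return authorMatches && typeMatches;
    }
}
```

A document with metadata {author=john, article_type=blog} passes, while one written by another author, one of a different article type, or one missing either key is excluded.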

Manual Configuration

Instead of using the Spring Boot auto-configuration, you can manually configure the Couchbase vector store. To do so, add the spring-ai-couchbase-store dependency to your project's Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-couchbase-store</artifactId>
</dependency>

or add the following to your Gradle build.gradle file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-couchbase-store'
}

Create a Couchbase Cluster bean. Read the Couchbase Documentation for more in-depth information about the configuration of a custom Cluster instance.

@Bean
public Cluster cluster() {
    // Return the connected cluster so that Spring can register it as a bean.
    return Cluster.connect("couchbase://localhost",
            "username", "password");
}

and then create the CouchbaseSearchVectorStore bean using the builder pattern:

@Bean
public VectorStore couchbaseSearchVectorStore(Cluster cluster,
                                              EmbeddingModel embeddingModel,
                                              Boolean initializeSchema) {
    return CouchbaseSearchVectorStore
            .builder(cluster, embeddingModel)
            .bucketName("test")
            .scopeName("test")
            .collectionName("test")
            .initializeSchema(initializeSchema)
            .build();
}

// This can be any EmbeddingModel implementation.
@Bean
public EmbeddingModel embeddingModel() {
    return new OpenAiEmbeddingModel(OpenAiApi.builder().apiKey(this.openaiKey).build());
}

Limitations

The following Couchbase services must be enabled: Data, Query, Index, and Search. Data and Search alone could be enough for basic use, but Query and Index are necessary to support the complete metadata filtering mechanism.