OpenSearch
This section guides you through setting up the OpenSearch VectorStore
to store document embeddings and perform similarity searches.
OpenSearch is an open-source search and analytics engine, originally forked from Elasticsearch and distributed under the Apache License 2.0. It supports vector, lexical, and hybrid search, and its vector database functionality enables low-latency queries and similarity searches, as detailed on the vector database page. It simplifies the integration and management of AI-generated assets and offers tools for data management, fault tolerance, and resource access controls, which makes it suitable for building scalable AI-driven applications.
Prerequisites
- A running OpenSearch instance. The following options are available: a self-managed OpenSearch cluster or Amazon OpenSearch Service.
- An EmbeddingModel instance to compute the document embeddings. Several options are available.
- If required, an API key for the EmbeddingModel to generate the embeddings stored by the OpenSearchVectorStore.
Dependencies
Add the OpenSearch Vector Store dependency to your project:
Maven:
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-opensearch-store</artifactId>
    </dependency>
Gradle:
    dependencies {
        implementation 'org.springframework.ai:spring-ai-opensearch-store'
    }
Refer to the Dependency Management section to add the Spring AI BOM to your build file.
Configuration
To connect to OpenSearch and use the OpenSearchVectorStore, you need to provide access details for your instance.
A simple configuration can be provided via Spring Boot’s application.yml:
    spring:
      ai:
        vectorstore:
          opensearch:
            uris: <opensearch instance URIs>
            username: <opensearch username>
            password: <opensearch password>
            indexName: <opensearch index name>
            mappingJson: <JSON mapping for opensearch index>
            aws:
              host: <aws opensearch host>
              serviceName: <aws service name>
              accessKey: <aws access key>
              secretKey: <aws secret key>
              region: <aws region>
        # API key if needed, e.g. OpenAI
        openai:
          apiKey: <api-key>
Check the list of configuration parameters to learn about the default values and configuration options.
Auto-configuration
Self-Managed OpenSearch
Spring AI provides Spring Boot auto-configuration for the OpenSearch Vector Store.
To enable it, add the following dependency to your project’s Maven pom.xml or Gradle build.gradle build file:
Maven:
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-opensearch-store-spring-boot-starter</artifactId>
    </dependency>
Gradle:
    dependencies {
        implementation 'org.springframework.ai:spring-ai-opensearch-store-spring-boot-starter'
    }
Then use the spring.ai.vectorstore.opensearch.*
properties to configure the connection to the self-managed OpenSearch instance.
Amazon OpenSearch Service
To enable auto-configuration for Amazon OpenSearch Service, add the following dependency to your project’s Maven pom.xml or Gradle build.gradle build file:
Maven:
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-aws-opensearch-store-spring-boot-starter</artifactId>
    </dependency>
Gradle:
    dependencies {
        implementation 'org.springframework.ai:spring-ai-aws-opensearch-store-spring-boot-starter'
    }
Then use the spring.ai.vectorstore.opensearch.aws.* properties to configure the connection to the Amazon OpenSearch Service.
Refer to the Dependency Management section to add the Spring AI BOM to your build file.
Here is an example of the needed bean:
    @Bean
    public EmbeddingModel embeddingModel() {
        // Can be any other EmbeddingModel implementation
        return new OpenAiEmbeddingModel(new OpenAiApi(System.getenv("SPRING_AI_OPENAI_API_KEY")));
    }
Now you can auto-wire the OpenSearchVectorStore
as a vector store in your application.
    @Autowired VectorStore vectorStore;
    // ...
    List<Document> documents = List.of(
        new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
        new Document("The World is Big and Salvation Lurks Around the Corner"),
        new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));
    // Add the documents to OpenSearch (each document is embedded before it is stored)
    vectorStore.add(documents);
    // Retrieve the 5 documents most similar to the query
    List<Document> results = vectorStore.similaritySearch(SearchRequest.query("Spring").withTopK(5));
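To make the similarity search above more concrete, the following is a conceptual sketch of what a top-K search with a similarity threshold does. It is an illustration only, not how Spring AI or OpenSearch is implemented: the class `SimilaritySketch`, the record `Scored`, and the tiny two-dimensional vectors are all hypothetical, and it assumes cosine similarity, whereas the actual scoring in OpenSearch depends on how the index is configured.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Conceptual sketch only: hypothetical names, in-memory data, cosine similarity assumed.
final class SimilaritySketch {

    // Hypothetical holder pairing a document's text with its similarity score.
    record Scored(String text, double score) {}

    // Cosine similarity between two vectors of equal length.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Keeps documents scoring at or above the threshold, best first, at most topK.
    static List<Scored> search(double[] query, List<String> texts, List<double[]> embeddings,
                               int topK, double threshold) {
        List<Scored> hits = new ArrayList<>();
        for (int i = 0; i < texts.size(); i++) {
            double score = cosine(query, embeddings.get(i));
            if (score >= threshold) {
                hits.add(new Scored(texts.get(i), score));
            }
        }
        hits.sort(Comparator.comparingDouble(Scored::score).reversed());
        return hits.subList(0, Math.min(topK, hits.size()));
    }

    public static void main(String[] args) {
        List<String> texts = List.of("doc-a", "doc-b", "doc-c");
        List<double[]> embeddings = List.of(
                new double[] {1.0, 0.0},
                new double[] {0.0, 1.0},
                new double[] {1.0, 1.0});
        for (Scored hit : search(new double[] {1.0, 0.0}, texts, embeddings, 2, 0.5)) {
            System.out.println(hit.text() + " " + hit.score());
        }
    }
}
```

In a real application the vector store performs this ranking inside OpenSearch, so the sketch is useful only for reasoning about what `withTopK` and a similarity threshold mean.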
Configuration properties
You can use the following properties in your Spring Boot configuration to customize the OpenSearch vector store.
Property | Description | Default value |
---|---|---|
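spring.ai.vectorstore.opensearch.uris | URIs of the OpenSearch cluster endpoints. | - |
spring.ai.vectorstore.opensearch.username | Username for accessing the OpenSearch cluster. | - |
spring.ai.vectorstore.opensearch.password | Password for the specified username. | - |
spring.ai.vectorstore.opensearch.indexName | Name of the default index to be used within the OpenSearch cluster. | |
spring.ai.vectorstore.opensearch.mappingJson | JSON string defining the mapping for the index; specifies how documents and their fields are stored and indexed. Refer here for some sample configurations. | { "properties":{ "embedding":{ "type":"knn_vector", "dimension":1536 } } } |
spring.ai.vectorstore.opensearch.aws.host | Hostname of the OpenSearch instance. | - |
spring.ai.vectorstore.opensearch.aws.serviceName | AWS service name for the OpenSearch instance. | - |
spring.ai.vectorstore.opensearch.aws.accessKey | AWS access key for the OpenSearch instance. | - |
spring.ai.vectorstore.opensearch.aws.secretKey | AWS secret key for the OpenSearch instance. | - |
spring.ai.vectorstore.opensearch.aws.region | AWS region for the OpenSearch instance. | - |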
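The default index mapping listed above declares a `knn_vector` field with dimension 1536. A `knn_vector` field only accepts vectors of its declared dimension, so if your EmbeddingModel produces vectors of a different length, the mapping needs to declare that length instead. The following is a minimal sketch of building such a mapping string in code; the class `MappingJsonSketch` and the helper `knnMapping` are hypothetical and not part of Spring AI.

```java
// Sketch under assumptions: the class and helper names are illustrative,
// and building the mapping in code is only one possible approach.
final class MappingJsonSketch {

    // Builds a mapping with the same shape as the documented default,
    // but with a caller-supplied embedding dimension.
    static String knnMapping(int dimension) {
        if (dimension <= 0) {
            throw new IllegalArgumentException("dimension must be positive");
        }
        return "{\"properties\":{\"embedding\":{\"type\":\"knn_vector\",\"dimension\":"
                + dimension + "}}}";
    }

    public static void main(String[] args) {
        // Prints a mapping suitable for a model that emits 1536-dimensional vectors.
        System.out.println(knnMapping(1536));
    }
}
```

The resulting string could then be supplied as the mapping JSON property shown in the table.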
Customizing OpenSearch Client Configuration
If the Spring Boot auto-configured OpenSearchClient bean, which uses the Apache HttpClient 5 transport, is not what you want or need, you can define your own bean.
For details, see the OpenSearch Java Client documentation.
Metadata Filtering
You can leverage the generic, portable metadata filters with OpenSearch as well.
For example, you can use either the text expression language (SQL filter syntax):
    vectorStore.similaritySearch(SearchRequest.defaults()
        .withQuery("The World")
        .withTopK(TOP_K)
        .withSimilarityThreshold(SIMILARITY_THRESHOLD)
        .withFilterExpression("author in ['john', 'jill'] && 'article_type' == 'blog'"));
or, programmatically, the Filter.Expression DSL:
    FilterExpressionBuilder b = new FilterExpressionBuilder();
    vectorStore.similaritySearch(SearchRequest.defaults()
        .withQuery("The World")
        .withTopK(TOP_K)
        .withSimilarityThreshold(SIMILARITY_THRESHOLD)
        .withFilterExpression(b.and(
            // The first argument of in(...) is the metadata key, followed by the allowed values
            b.in("author", "john", "jill"),
            b.eq("article_type", "blog")).build()));
These portable filter expressions are automatically converted into the native OpenSearch query string query format.
For example, this portable filter expression:
author in ['john', 'jill'] && 'article_type' == 'blog'
is converted into the native OpenSearch filter format:
(metadata.author:john OR jill) AND metadata.article_type:blog
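The mapping shown above can be mimicked with a few string-building helpers. This is an illustrative sketch only, not the converter that Spring AI uses: the class `FilterConversionSketch` and its methods are hypothetical, and they cover just the `in`, `==`, and `&&` operators from the example.

```java
import java.util.List;

// Illustrative only: a hand-rolled mimic of the conversion shown above,
// covering just the three operators used in the example.
final class FilterConversionSketch {

    // key in [v1, v2, ...]  ->  (metadata.key:v1 OR v2 ...)
    static String in(String key, List<String> values) {
        return "(metadata." + key + ":" + String.join(" OR ", values) + ")";
    }

    // key == value  ->  metadata.key:value
    static String eq(String key, String value) {
        return "metadata." + key + ":" + value;
    }

    // left && right  ->  left AND right
    static String and(String left, String right) {
        return left + " AND " + right;
    }

    public static void main(String[] args) {
        String query = and(in("author", List.of("john", "jill")), eq("article_type", "blog"));
        System.out.println(query);
    }
}
```

Note how every metadata key is prefixed with `metadata.` in the converted form, which reflects where the example output places document metadata.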