This version is still in development and is not considered stable yet. For the latest snapshot version, please use Spring AI 1.0.0-SNAPSHOT!
OpenSearch
This section guides you through setting up the OpenSearch VectorStore
to store document embeddings and perform similarity searches.
OpenSearch is an open-source search and analytics engine, originally forked from Elasticsearch and distributed under the Apache License 2.0. It simplifies the integration and management of AI-generated assets in AI applications. OpenSearch supports vector, lexical, and hybrid search, and its vector database functionality enables low-latency queries and similarity searches. It also offers tools for data management, fault tolerance, and resource access control, which makes it suitable for building scalable AI-driven applications.
Prerequisites
- A running OpenSearch instance.
- An EmbeddingModel instance to compute the document embeddings.
- If required, an API key for the EmbeddingModel to generate the embeddings stored by the OpenSearchVectorStore.
Dependencies
Add the OpenSearch Vector Store dependency to your project’s Maven pom.xml file:
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-opensearch-store</artifactId>
</dependency>
or to your Gradle build.gradle build file:
dependencies {
    implementation 'org.springframework.ai:spring-ai-opensearch-store'
}
Refer to the Dependency Management section to add the Spring AI BOM to your build file.
Configuration
To connect to OpenSearch and use the OpenSearchVectorStore, you need to provide access details for your instance.
A simple configuration can be provided via Spring Boot’s application.yml:
spring:
  ai:
    vectorstore:
      opensearch:
        uris: <opensearch instance URIs>
        username: <opensearch username>
        password: <opensearch password>
        indexName: <opensearch index name>
        mappingJson: <JSON mapping for opensearch index>
    # API key if needed, e.g. OpenAI
    openai:
      api-key: <api-key>
Check the list of configuration parameters to learn about the default values and configuration options.
Auto-configuration
Spring AI provides Spring Boot auto-configuration for the OpenSearch Vector Store.
To enable it, add the following dependency to your project’s Maven pom.xml file:
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-opensearch-store-spring-boot-starter</artifactId>
</dependency>
or to your Gradle build.gradle build file:
dependencies {
    implementation 'org.springframework.ai:spring-ai-opensearch-store-spring-boot-starter'
}
Refer to the Dependency Management section to add the Spring AI BOM to your build file.
The vector store also requires an EmbeddingModel bean. Here is an example of the needed bean:
@Bean
public EmbeddingModel embeddingModel() {
    // Can be any other EmbeddingModel implementation
    return new OpenAiEmbeddingModel(new OpenAiApi(System.getenv("SPRING_AI_OPENAI_API_KEY")));
}
Now you can auto-wire the OpenSearchVectorStore as a vector store in your application.
@Autowired VectorStore vectorStore;
// ...
List<Document> documents = List.of(
    new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
    new Document("The World is Big and Salvation Lurks Around the Corner"),
    new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));
// Add the documents to OpenSearch
vectorStore.add(documents);
// Retrieve documents similar to a query
List<Document> results = vectorStore.similaritySearch(SearchRequest.query("Spring").withTopK(5));
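To make the similaritySearch call above more concrete, here is a minimal, self-contained sketch of what a top-K similarity search computes conceptually. It assumes cosine similarity as the measure and uses a brute-force scan over in-memory vectors. The class SimilaritySketch and its methods are hypothetical names for illustration only; they are not part of Spring AI, and OpenSearch itself answers such queries through a k-NN index rather than a linear scan.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration class; not part of Spring AI or OpenSearch.
class SimilaritySketch {

    // Cosine similarity of two equally sized vectors.
    // Note: a zero vector yields NaN, which this sketch does not handle.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vector dimensions differ");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the indices of the topK stored vectors most similar to the
    // query, best match first, by scoring every stored vector.
    static List<Integer> topK(float[] query, List<float[]> stored, int topK) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < stored.size(); i++) {
            indices.add(i);
        }
        indices.sort(Comparator
                .comparingDouble((Integer i) -> cosine(query, stored.get(i)))
                .reversed());
        return new ArrayList<>(indices.subList(0, Math.min(topK, indices.size())));
    }
}
```

In this sketch, withTopK(5) corresponds to the topK argument: only the five best-scoring documents would be returned.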
Configuration properties
You can use the following properties in your Spring Boot configuration to customize the OpenSearch vector store.
Property | Description | Default value |
---|---|---|
 | URIs of the OpenSearch cluster endpoints. | - |
 | Username for accessing the OpenSearch cluster. | - |
 | Password for the specified username. | - |
 | Name of the default index to be used within the OpenSearch cluster. |  |
 | JSON string defining the mapping for the index; specifies how documents and their fields are stored and indexed. | { "properties":{ "embedding":{ "type":"knn_vector", "dimension":1536 } } } |
 | Hostname of the OpenSearch instance. | - |
 | AWS service name for the OpenSearch instance. | - |
 | AWS access key for the OpenSearch instance. | - |
 | AWS secret key for the OpenSearch instance. | - |
 | AWS region for the OpenSearch instance. | - |
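The default index mapping declares a knn_vector field named embedding with dimension 1536. That dimension has to match the length of the vectors your EmbeddingModel produces, so a model with a different output size needs a different mapping JSON. As a minimal sketch, such a mapping string could be built as follows; KnnMappingSketch and mappingJson are hypothetical helper names, not part of Spring AI.

```java
// Hypothetical helper for illustration; not part of Spring AI.
class KnnMappingSketch {

    // Builds an index mapping with a single knn_vector field named
    // "embedding" whose dimension matches the embedding model's output size.
    static String mappingJson(int dimension) {
        if (dimension <= 0) {
            throw new IllegalArgumentException("dimension must be positive");
        }
        return "{\"properties\":{\"embedding\":{\"type\":\"knn_vector\",\"dimension\":"
                + dimension + "}}}";
    }
}
```

Calling KnnMappingSketch.mappingJson(1536) yields the same structure as the default mapping listed in the table, without the extra whitespace.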
Customizing OpenSearch Client Configuration
If the Spring Boot auto-configured OpenSearchClient bean, which uses the Apache HttpClient 5 transport, is not what you want or need, you can define your own bean.
Please read the OpenSearch Java Client Documentation for more in-depth information about the configuration of Amazon OpenSearch Service.
To enable Amazon OpenSearch Service support, add the following dependency to your project’s Maven pom.xml file:
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-aws-opensearch-store-spring-boot-starter</artifactId>
</dependency>
or to your Gradle build.gradle build file:
dependencies {
    implementation 'org.springframework.ai:spring-ai-aws-opensearch-store-spring-boot-starter'
}
Metadata Filtering
You can leverage the generic, portable metadata filters with OpenSearch as well.
For example, you can use either the text expression language:
vectorStore.similaritySearch(SearchRequest.defaults()
    .withQuery("The World")
    .withTopK(TOP_K)
    .withSimilarityThreshold(SIMILARITY_THRESHOLD)
    .withFilterExpression("author in ['john', 'jill'] && 'article_type' == 'blog'"));
or programmatically using the Filter.Expression DSL:
FilterExpressionBuilder b = new FilterExpressionBuilder();
vectorStore.similaritySearch(SearchRequest.defaults()
    .withQuery("The World")
    .withTopK(TOP_K)
    .withSimilarityThreshold(SIMILARITY_THRESHOLD)
    .withFilterExpression(b.and(
        b.in("author", "john", "jill"),
        b.eq("article_type", "blog")).build()));
These portable filter expressions are automatically converted into the proprietary OpenSearch query string query format.
For example, this portable filter expression:
author in ['john', 'jill'] && 'article_type' == 'blog'
is converted into the proprietary OpenSearch filter format:
(metadata.author:john OR jill) AND metadata.article_type:blog
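The conversion shown above can be illustrated with a small, self-contained sketch. It is not Spring AI's converter: the FilterNode, Eq, In, and And types and the QueryStringSketch class are hypothetical, and the sketch assumes that metadata fields are addressed under a metadata. prefix, as in the example output. It covers only the three node types used here and does not escape special characters.

```java
import java.util.List;

// Hypothetical, simplified expression model; these are not Spring AI's
// Filter.Expression classes.
interface FilterNode {}
record Eq(String key, String value) implements FilterNode {}
record In(String key, List<String> values) implements FilterNode {}
record And(FilterNode left, FilterNode right) implements FilterNode {}

class QueryStringSketch {

    // Assumed prefix under which metadata fields are indexed.
    static final String PREFIX = "metadata.";

    // Recursively renders an expression tree as a query string.
    static String convert(FilterNode node) {
        if (node instanceof Eq eqNode) {
            return PREFIX + eqNode.key() + ":" + eqNode.value();
        }
        if (node instanceof In inNode) {
            // Alternatives are joined with OR inside one parenthesized group.
            return "(" + PREFIX + inNode.key() + ":"
                    + String.join(" OR ", inNode.values()) + ")";
        }
        if (node instanceof And andNode) {
            return convert(andNode.left()) + " AND " + convert(andNode.right());
        }
        throw new IllegalArgumentException("Unsupported node: " + node);
    }
}
```

Passing new And(new In("author", List.of("john", "jill")), new Eq("article_type", "blog")) to QueryStringSketch.convert reproduces the query string from the example above.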