Typesense
This section walks you through setting up TypesenseVectorStore
to store document embeddings and perform similarity searches.
Typesense is an open source, typo-tolerant search engine that is optimized for instant sub-50ms searches, while providing an intuitive developer experience.
Prerequisites
- A Typesense instance:
  - Typesense Cloud (recommended)
  - Docker image typesense/typesense:latest
- An EmbeddingModel instance to compute the document embeddings. Several options are available.
- If required, an API key for the EmbeddingModel to generate the embeddings stored by the TypesenseVectorStore.
Auto-configuration
Spring AI provides Spring Boot auto-configuration for the Typesense Vector Store.
To enable it, add the following dependency to your project’s Maven pom.xml
file:
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-typesense-spring-boot-starter</artifactId>
</dependency>
or to your Gradle build.gradle file:
dependencies {
implementation 'org.springframework.ai:spring-ai-typesense-spring-boot-starter'
}
Refer to the Dependency Management section to add the Spring AI BOM to your build file. |
Refer to the Repositories section to add Milestone and/or Snapshot Repositories to your build file. |
Additionally, you will need a configured EmbeddingModel
bean. Refer to the EmbeddingModel section for more information.
Here is an example of the needed bean:
@Bean
public EmbeddingModel embeddingModel() {
// Can be any other EmbeddingModel implementation.
return new OpenAiEmbeddingModel(new OpenAiApi(System.getenv("SPRING_AI_OPENAI_API_KEY")));
}
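The embedding dimension configured for the vector store must match this model's output size. The following is a minimal sanity-check sketch (only the dimensions() call is Spring AI API; the 1536 value mirrors the embeddingDimension used in the configuration below):
// Fail fast if the model output size does not match the configured embeddingDimension.
int dimensions = embeddingModel.dimensions();
if (dimensions != 1536) {
    throw new IllegalStateException("Embedding model returns " + dimensions
            + " dimensions, but the Typesense collection is configured for 1536");
}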
To connect to Typesense, you need to provide access details for your instance. A simple configuration can be provided via Spring Boot's application.yml file:
spring:
  ai:
    vectorstore:
      typesense:
        collectionName: "vector_store"
        embeddingDimension: 1536
        client:
          protocol: http
          host: localhost
          port: 8108
          apiKey: xyz
Please have a look at the list of configuration parameters for the vector store to learn about the default values and configuration options.
Now you can auto-wire the Typesense Vector Store in your application and use it:
@Autowired VectorStore vectorStore;
// ...
List<Document> documents = List.of(
new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
new Document("The World is Big and Salvation Lurks Around the Corner"),
new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));
// Add the documents to Typesense
vectorStore.add(documents);
// Retrieve documents similar to a query
List<Document> results = vectorStore.similaritySearch(SearchRequest.query("Spring").withTopK(5));
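Each returned Document carries its text and metadata, so you can inspect the matches directly. A minimal sketch, assuming the Document#getContent() and Document#getMetadata() accessors of this Spring AI version:
// Print the text and metadata of each match.
results.forEach(doc -> System.out.println(doc.getContent() + " " + doc.getMetadata()));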
Configuration properties
You can use the following properties in your Spring Boot configuration to customize the Typesense vector store.
Property | Description | Default value |
---|---|---|
spring.ai.vectorstore.typesense.client.protocol | HTTP Protocol | http |
spring.ai.vectorstore.typesense.client.host | Hostname | localhost |
spring.ai.vectorstore.typesense.client.port | Port | 8108 |
spring.ai.vectorstore.typesense.client.apiKey | ApiKey | xyz |
spring.ai.vectorstore.typesense.initialize-schema | Whether to initialize the required schema | false |
spring.ai.vectorstore.typesense.collectionName | Collection Name | vector_store |
spring.ai.vectorstore.typesense.embeddingDimension | Embedding Dimension | 1536 |
Metadata filtering
You can leverage the generic, portable metadata filters with TypesenseVectorStore
as well.
For example, you can use either the text expression language:
vectorStore.similaritySearch(
SearchRequest
.query("The World")
.withTopK(TOP_K)
.withSimilarityThreshold(SIMILARITY_THRESHOLD)
.withFilterExpression("country in ['UK', 'NL'] && year >= 2020"));
or programmatically using the expression DSL:
FilterExpressionBuilder b = new FilterExpressionBuilder();
vectorStore.similaritySearch(
SearchRequest
.query("The World")
.withTopK(TOP_K)
.withSimilarityThreshold(SIMILARITY_THRESHOLD)
.withFilterExpression(b.and(
b.in("country", "UK", "NL"),
b.gte("year", 2020)).build()));
The portable filter expressions get automatically converted into Typesense Search Filters. For example, the following portable filter expression:
country in ['UK', 'NL'] && year >= 2020
is converted into the following Typesense filter:
country: ['UK', 'NL'] && year: >=2020
Manual configuration
If you prefer not to use the auto-configuration, you can manually configure the Typesense Vector Store. Add the Typesense Vector Store dependency to your project:
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-typesense</artifactId>
</dependency>
Refer to the Dependency Management section to add the Spring AI BOM to your build file. |
Then, create a TypesenseVectorStore
bean in your Spring configuration:
@Bean
public VectorStore vectorStore(Client client, EmbeddingModel embeddingModel) {
TypesenseVectorStoreConfig config = TypesenseVectorStoreConfig.builder()
.withCollectionName("test_vector_store")
.withEmbeddingDimension(embeddingModel.dimensions())
.build();
return new TypesenseVectorStore(client, embeddingModel, config);
}
@Bean
public Client typesenseClient() {
    List<Node> nodes = new ArrayList<>();
    // Connection details matching the configuration shown earlier (protocol, host, port as a string).
    nodes.add(new Node("http", "localhost", "8108"));

    Configuration configuration = new Configuration(nodes, Duration.ofSeconds(5), "xyz");
    return new Client(configuration);
}
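Rather than hard-coding the connection details, you may prefer to externalize them. Here is a minimal sketch of the same client bean reading the host, port, and API key from properties; the typesense.* keys are illustrative placeholders, not the auto-configuration properties:
@Bean
public Client typesenseClient(
        @Value("${typesense.host:localhost}") String host,
        @Value("${typesense.port:8108}") String port,
        @Value("${typesense.api-key:xyz}") String apiKey) {
    // Same client setup as above, with the connection details supplied via properties.
    List<Node> nodes = List.of(new Node("http", host, port));
    return new Client(new Configuration(nodes, Duration.ofSeconds(5), apiKey));
}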
It is more convenient and preferred to create the TypesenseVectorStore as a bean, as shown above. |
Then in your main code, create some documents:
List<Document> documents = List.of(
new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("country", "UK", "year", 2020)),
new Document("The World is Big and Salvation Lurks Around the Corner", Map.of()),
new Document("You walk forward facing the past and you turn back toward the future.", Map.of("country", "NL", "year", 2023)));
Now add the documents to your vector store:
vectorStore.add(documents);
And finally, retrieve documents similar to a query:
List<Document> results = vectorStore.similaritySearch(
SearchRequest
.query("Spring")
.withTopK(5));
If all goes well, you should retrieve the document containing the text "Spring AI rocks!!".
If you are not retrieving the documents in the expected order, or the search results are not as expected, check the embedding model you are using. Embedding models can have a significant impact on the search results (for example, if your data is in Spanish, make sure to use a Spanish or multilingual embedding model). |