Typesense
This section walks you through setting up TypesenseVectorStore
to store document embeddings and perform similarity searches.
Typesense is an open source typo tolerant search engine that is optimized for instant sub-50ms searches while providing an intuitive developer experience. It provides vector search capabilities that allow you to store and query high-dimensional vectors alongside your regular search data.
Prerequisites
-
A running Typesense instance. The following options are available:
-
Typesense Cloud (recommended)
-
Docker image typesense/typesense:latest
-
-
If required, an API key for the EmbeddingModel to generate the embeddings stored by the
TypesenseVectorStore
.
Auto-configuration
Spring AI provides Spring Boot auto-configuration for the Typesense Vector Store.
To enable it add the following dependency to your project’s Maven pom.xml
file:
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-typesense-spring-boot-starter</artifactId>
</dependency>
or to your Gradle build.gradle
build file.
dependencies {
implementation 'org.springframework.ai:spring-ai-typesense-spring-boot-starter'
}
Refer to the Dependency Management section to add the Spring AI BOM to your build file. |
Please have a look at the list of configuration parameters for the vector store to learn about the default values and configuration options.
Refer to the Repositories section to add Milestone and/or Snapshot Repositories to your build file. |
The vector store implementation can initialize the requisite schema for you but you must opt-in by setting …initialize-schema=true
in the application.properties
file.
Additionally you will need a configured EmbeddingModel
bean. Refer to the EmbeddingModel section for more information.
Now you can auto-wire the TypesenseVectorStore
as a vector store in your application:
@Autowired VectorStore vectorStore;
// ...
List<Document> documents = List.of(
new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
new Document("The World is Big and Salvation Lurks Around the Corner"),
new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));
// Add the documents to Typesense
vectorStore.add(documents);
// Retrieve documents similar to a query
List<Document> results = vectorStore.similaritySearch(SearchRequest.builder().query("Spring").topK(5).build());
Configuration Properties
To connect to Typesense and use the TypesenseVectorStore
you need to provide access details for your instance.
A simple configuration can be provided via Spring Boot’s application.yml
:
spring:
ai:
vectorstore:
typesense:
initialize-schema: true
collection-name: vector_store
embedding-dimension: 1536
client:
protocol: http
host: localhost
port: 8108
api-key: xyz
Properties starting with spring.ai.vectorstore.typesense.*
are used to configure the TypesenseVectorStore
:
Property | Description | Default Value |
---|---|---|
|
Whether to initialize the required schema |
|
|
The name of the collection to store vectors |
|
|
The number of dimensions in the vector |
|
|
HTTP Protocol |
|
|
Hostname |
|
|
Port |
|
|
API Key |
|
Manual Configuration
Instead of using the Spring Boot auto-configuration you can manually configure the Typesense vector store. For this you need to add the spring-ai-typesense-store
to your project:
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-typesense-store</artifactId>
</dependency>
or to your Gradle build.gradle
build file.
dependencies {
implementation 'org.springframework.ai:spring-ai-typesense-store'
}
Refer to the Dependency Management section to add the Spring AI BOM to your build file. |
Create a Typesense Client
bean:
@Bean
public Client typesenseClient() {
List<Node> nodes = new ArrayList<>();
nodes.add(new Node("http", "localhost", "8108"));
Configuration configuration = new Configuration(nodes, Duration.ofSeconds(5), "xyz");
return new Client(configuration);
}
Then create the TypesenseVectorStore
bean using the builder pattern:
@Bean
public VectorStore vectorStore(Client client, EmbeddingModel embeddingModel) {
return TypesenseVectorStore.builder(client, embeddingModel)
.collectionName("custom_vectors") // Optional: defaults to "vector_store"
.embeddingDimension(1536) // Optional: defaults to 1536
.initializeSchema(true) // Optional: defaults to false
.batchingStrategy(new TokenCountBatchingStrategy()) // Optional: defaults to TokenCountBatchingStrategy
.build();
}
// This can be any EmbeddingModel implementation
@Bean
public EmbeddingModel embeddingModel() {
return new OpenAiEmbeddingModel(new OpenAiApi(System.getenv("OPENAI_API_KEY")));
}
Metadata Filtering
You can leverage the generic portable metadata filters with Typesense store as well.
For example you can use either the text expression language:
vectorStore.similaritySearch(
SearchRequest.builder()
.query("The World")
.topK(TOP_K)
.similarityThreshold(SIMILARITY_THRESHOLD)
.filterExpression("country in ['UK', 'NL'] && year >= 2020").build());
or programmatically using the Filter.Expression
DSL:
FilterExpressionBuilder b = new FilterExpressionBuilder();
vectorStore.similaritySearch(SearchRequest.builder()
.query("The World")
.topK(TOP_K)
.similarityThreshold(SIMILARITY_THRESHOLD)
.filterExpression(b.and(
b.in("country", "UK", "NL"),
b.gte("year", 2020)).build()).build());
Those (portable) filter expressions get automatically converted into Typesense Search Filters. |
For example this portable filter expression:
country in ['UK', 'NL'] && year >= 2020
is converted into the proprietary Typesense filter format:
country: ['UK', 'NL'] && year: >=2020
If you are not retrieving the documents in the expected order or the search results are not as expected, check the embedding model you are using. Embedding models can have a significant impact on the search results (i.e. make sure if your data is in Spanish to use a Spanish or multilingual embedding model). |