Class CassandraVectorStore
java.lang.Object
org.springframework.ai.vectorstore.observation.AbstractObservationVectorStore
org.springframework.ai.vectorstore.cassandra.CassandraVectorStore
- All Implemented Interfaces:
AutoCloseable, Consumer<List<Document>>, DocumentWriter, VectorStore, VectorStoreRetriever
The CassandraVectorStore manages and queries vector data in an Apache Cassandra
database. It supports adding documents, deleting documents, and performing
similarity searches on them.
The store utilizes CQL to index and search vector data. It allows for custom metadata
fields in the documents to be stored alongside the vector and content data.
This class is initialized through a CassandraVectorStore.Builder, which holds
settings such as connection details, index name, and column names. It also requires
an EmbeddingModel to convert documents into embeddings before storing them.
A schema matching the configuration is automatically created if it doesn't exist.
Missing columns and indexes in existing tables are also created automatically.
Disable this with the CassandraVectorStore.Builder.initializeSchema(boolean) method.
Basic usage example:
CassandraVectorStore vectorStore = CassandraVectorStore.builder(embeddingModel)
    .session(cqlSession)
    .keyspace("my_keyspace")
    .table("my_vectors")
    .build();

// Add documents
vectorStore.add(List.of(
    new Document("1", "content1", Map.of("key1", "value1")),
    new Document("2", "content2", Map.of("key2", "value2"))
));

// Search with filters
List<Document> results = vectorStore.similaritySearch(
    SearchRequest.builder()
        .query("search text")
        .topK(5)
        .similarityThreshold(0.7)
        .filterExpression("metadata.key1 == 'value1'")
        .build()
);
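The similarity threshold in the search above filters hits by score, and indexes for this store are created with COSINE by default. As a rough illustration of what a cosine score measures, here is a minimal plain-Java sketch. The class and method names are hypothetical, and how the store scales or compares the score internally is not shown on this page, so treat the threshold check as an assumption.

```java
// Illustrative only: plain-Java cosine similarity, not code from CassandraVectorStore.
final class CosineSimilaritySketch {

    /** Returns the cosine of the angle between a and b, a value in [-1, 1]. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("a zero vector has no direction");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Assumed threshold rule: keep a hit only if its score reaches the threshold. */
    static boolean passes(double score, double threshold) {
        return score >= threshold;
    }
}
```

Vectors pointing the same way score close to 1 regardless of their length, which is why cosine is a common default for text embeddings.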
Advanced configuration example:
CassandraVectorStore vectorStore = CassandraVectorStore.builder(embeddingModel)
    .session(cqlSession)
    .keyspace("my_keyspace")
    .table("my_vectors")
    .partitionKeys(List.of(new SchemaColumn("id", DataTypes.TEXT)))
    .clusteringKeys(List.of(new SchemaColumn("timestamp", DataTypes.TIMESTAMP)))
    .addMetadataColumns(
        new SchemaColumn("category", DataTypes.TEXT, SchemaColumnTags.INDEXED),
        new SchemaColumn("score", DataTypes.DOUBLE)
    )
    .contentColumnName("text")
    .embeddingColumnName("vector")
    .fixedThreadPoolExecutorSize(32)
    .initializeSchema(true)
    .batchingStrategy(new TokenCountBatchingStrategy())
    .build();
This class works with brand new tables that it creates for you, or on top of
existing Cassandra tables. The latter is appropriate when you want to keep data in
place, create embeddings next to it, and perform vector similarity searches
in-situ.
Instances of this class do not track server-side schema changes. If you change
the schema server-side, create a new CassandraVectorStore instance.
When documents are added with AbstractObservationVectorStore.add(List<Document>), the store
first calls the EmbeddingModel to create the embeddings, which is slow. Configure
CassandraVectorStore.Builder.fixedThreadPoolExecutorSize(int) so that embeddings are
created and documents are added concurrently. The default concurrency is 16
(DEFAULT_ADD_CONCURRENCY). Remote embedding models probably want higher concurrency,
and local ones may need lower concurrency. The limit does not need to be higher than
the maximum number of parallel calls made to AbstractObservationVectorStore.add(List<Document>)
multiplied by the list size. This setting can also serve as a protective throttle
for your embedding model.
- Since:
- 1.0.0
- Author:
- Mick Semb Wever, Christian Tzolov, Thomas Vitale, Soby Chacko
-
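The concurrency guidance in the class description can be sketched with a fixed thread pool in plain Java. This is an illustration under stated assumptions, not the store's implementation: BoundedAddSketch, slowEmbed, embedAll, and usefulConcurrency are hypothetical names, and slowEmbed merely simulates a slow embedding call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Illustrative only: bounded concurrent embedding with a fixed-size pool.
final class BoundedAddSketch {

    /** Highest number of embed calls observed running at the same time. */
    static final AtomicInteger peak = new AtomicInteger();
    private static final AtomicInteger running = new AtomicInteger();

    /** Hypothetical stand-in for a slow embedding call. */
    static float[] slowEmbed(String text) {
        int now = running.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            running.decrementAndGet();
        }
        return new float[] { text.length() };
    }

    /** Embeds every text on a fixed pool; at most poolSize calls run at once. Output keeps input order. */
    static List<float[]> embedAll(List<String> texts, int poolSize, Function<String, float[]> embed) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<CompletableFuture<float[]>> futures = new ArrayList<>();
            for (String text : texts) {
                futures.add(CompletableFuture.supplyAsync(() -> embed.apply(text), pool));
            }
            List<float[]> out = new ArrayList<>();
            for (CompletableFuture<float[]> future : futures) {
                out.add(future.join());
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }

    /** Concurrency beyond (parallel add calls x list size) cannot be used, so cap at that product. */
    static int usefulConcurrency(int parallelAddCalls, int listSize, int poolSize) {
        return Math.min(poolSize, parallelAddCalls * listSize);
    }
}
```

For example, two parallel add calls with four documents each can keep at most eight workers busy, so a pool of 16 gives no extra throughput there. A smaller pool also caps the load placed on the embedding model.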
Nested Class Summary
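Nested Classes
Modifier and Type      Class                                       Description
static class           CassandraVectorStore.Builder                Builder for the Cassandra vector store.
static interface       CassandraVectorStore.DocumentIdTranslator   Given a string document id, return the value for each primary key column.
static interface       CassandraVectorStore.PrimaryKeyTranslator   Given a list of primary key column values, return the document id.
static final record    CassandraVectorStore.SchemaColumn
static enum            CassandraVectorStore.SchemaColumnTags
static enum            CassandraVectorStore.Similarity             Indexes are automatically created with COSINE.
-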
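The two translator descriptions above map between a single string document id and the values of the primary key columns. A minimal sketch of such a pair for a two-column key follows. Everything here is hypothetical: the class name, the ":" separator, and the tenant/document column layout are assumptions for illustration, and the real interfaces' method signatures are not shown on this page.

```java
import java.util.List;

// Illustrative only: a round-trippable mapping between a document id and two primary key values.
final class KeyTranslationSketch {

    /** Assumed separator; ids whose first part contains it would need escaping. */
    static final String SEP = ":";

    /** Document id -> one value per primary key column (here: tenant, then document). */
    static List<Object> idToKeys(String id) {
        int cut = id.indexOf(SEP);
        if (cut < 0) {
            throw new IllegalArgumentException("expected <tenant>" + SEP + "<document>, got: " + id);
        }
        return List.<Object>of(id.substring(0, cut), id.substring(cut + 1));
    }

    /** Primary key column values -> document id; the inverse of idToKeys. */
    static String keysToId(List<Object> keys) {
        if (keys.size() != 2) {
            throw new IllegalArgumentException("expected exactly 2 key values, got " + keys.size());
        }
        return keys.get(0) + SEP + keys.get(1);
    }
}
```

Whatever format is chosen, the two directions should be exact inverses so that a document added under an id can be found and deleted by the same id.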
Field Summary
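Fields
Modifier and Type      Field                            Description
static final int       DEFAULT_ADD_CONCURRENCY
static final String    DEFAULT_CONTENT_COLUMN_NAME
static final String    DEFAULT_EMBEDDING_COLUMN_NAME
static final String    DEFAULT_ID_NAME
static final String    DEFAULT_INDEX_SUFFIX
static final String    DEFAULT_KEYSPACE_NAME
static final String    DEFAULT_TABLE_NAME
static final String    DRIVER_PROFILE_SEARCH
static final String    DRIVER_PROFILE_UPDATES
Fields inherited from class org.springframework.ai.vectorstore.observation.AbstractObservationVectorStore
batchingStrategy, embeddingModel
-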
Constructor Summary
Constructors
Modifier      Constructor             Description
protected     CassandraVectorStore
-
Method Summary
Modifier and Type                          Method                                                  Description
static CassandraVectorStore.Builder        builder(EmbeddingModel embeddingModel)
void                                       close()
VectorStoreObservationContext.Builder      createObservationContextBuilder(String operationName)   Create a new VectorStoreObservationContext.Builder instance.
void                                       doAdd(List<Document> documents)                         Perform the actual add operation.
void                                       doDelete(List<String> idList)                           Perform the actual delete operation.
protected void                             doDelete(Filter.Expression filterExpression)            Template method for concrete implementations to provide filter-based deletion logic.
List<Document>                             doSimilaritySearch(SearchRequest request)               Perform the actual similarity search operation.
<T> Optional<T>                            getNativeClient()                                       Returns the native client if available in this vector store implementation.
Methods inherited from class org.springframework.ai.vectorstore.observation.AbstractObservationVectorStore
add, delete, delete, similaritySearch
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.springframework.ai.document.DocumentWriter
write
Methods inherited from interface org.springframework.ai.vectorstore.VectorStore
accept, delete, getName
Methods inherited from interface org.springframework.ai.vectorstore.VectorStoreRetriever
similaritySearch
-
Field Details
-
DEFAULT_KEYSPACE_NAME
-
DEFAULT_TABLE_NAME
-
DEFAULT_ID_NAME
-
DEFAULT_INDEX_SUFFIX
-
DEFAULT_CONTENT_COLUMN_NAME
-
DEFAULT_EMBEDDING_COLUMN_NAME
-
DEFAULT_ADD_CONCURRENCY
public static final int DEFAULT_ADD_CONCURRENCY
-
DRIVER_PROFILE_UPDATES
-
DRIVER_PROFILE_SEARCH
-
-
Constructor Details
-
CassandraVectorStore
-
-
Method Details
-
builder
-
doAdd
Description copied from class: AbstractObservationVectorStore
Perform the actual add operation.
- Specified by:
doAdd in class AbstractObservationVectorStore
- Parameters:
documents - the documents to add
-
doDelete
Description copied from class: AbstractObservationVectorStore
Perform the actual delete operation.
- Specified by:
doDelete in class AbstractObservationVectorStore
- Parameters:
idList - the list of document IDs to delete
-
doDelete
Description copied from class: AbstractObservationVectorStore
Template method for concrete implementations to provide filter-based deletion logic.
- Overrides:
doDelete in class AbstractObservationVectorStore
- Parameters:
filterExpression - Filter expression to identify documents to delete
-
doSimilaritySearch
Description copied from class: AbstractObservationVectorStore
Perform the actual similarity search operation.
- Specified by:
doSimilaritySearch in class AbstractObservationVectorStore
- Parameters:
request - the search request
- Returns:
- the list of documents that match the query request conditions
-
createObservationContextBuilder
Description copied from class: AbstractObservationVectorStore
Create a new VectorStoreObservationContext.Builder instance.
- Specified by:
createObservationContextBuilder in class AbstractObservationVectorStore
- Parameters:
operationName - the operation name
- Returns:
- the observation context builder
-
close
- Specified by:
close in interface AutoCloseable
- Throws:
Exception
-
getNativeClient
Description copied from interface: VectorStore
Returns the native client if available in this vector store implementation.
Note on usage:
1. Returns empty Optional when no native client is available
2. Due to Java type erasure, runtime type checking is not possible
Example usage: when working with an implementation with a known native client:
Optional<NativeClientType> client = vectorStore.getNativeClient();
Note: Using Optional<?> will return the native client if one exists, rather than an empty Optional. For type safety, prefer using the specific client type.
- Specified by:
getNativeClient in interface VectorStore
- Type Parameters:
T - The type of the native client
- Returns:
- Optional containing native client if available, empty Optional otherwise
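The type-erasure note for getNativeClient can be illustrated with a small plain-Java sketch. Holder is a hypothetical class that mimics the shape of a generic accessor returning Optional<T>. It is not the VectorStore implementation, and it only demonstrates the general language behaviour: a wrong type argument is accepted by the accessor and fails later, when the value is used.

```java
import java.util.Optional;

// Illustrative only: why a generic Optional<T> accessor cannot check T at runtime.
final class NativeClientErasureSketch {

    /** Hypothetical holder mimicking an accessor shaped like <T> Optional<T> getNativeClient(). */
    static final class Holder {
        private final Object client;

        Holder(Object client) {
            this.client = client;
        }

        @SuppressWarnings("unchecked")
        <T> Optional<T> getNativeClient() {
            // T is erased at runtime, so this cast cannot be verified here.
            return Optional.ofNullable((T) client);
        }
    }

    /** Returns true if reading the held client as an Integer fails with ClassCastException. */
    static boolean wrongTypeFailsAtUse(Holder holder) {
        // The call itself succeeds whatever the real type is.
        Optional<Integer> asInt = holder.getNativeClient();
        if (!asInt.isPresent()) {
            return false;
        }
        try {
            // The compiler inserts the cast to Integer here, at the point of use.
            Integer value = asInt.get();
            return value == null;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

Asking for the concrete client type you know the store uses keeps the failure, if any, close to the call site instead of deep in later code.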
-