All Known Implementing Classes: DefaultSemanticCache

public interface SemanticCache

Interface defining operations for a semantic cache implementation that stores and retrieves chat responses based on semantic similarity of queries. This cache uses vector embeddings to determine similarity between queries.

The semantic cache provides functionality to:

- Store chat responses with their associated queries
- Retrieve responses for semantically similar queries
- Support time-based expiration of cached entries
- Support context-based isolation (e.g., different system prompts)
- Clear the entire cache

Implementations should ensure thread-safety and proper resource management.

Author: Brian Sam-Bodden, Soby Chacko

Method Summary

| Modifier and Type | Method | Description |
|---|---|---|
| void | clear() | Removes all entries from the cache. |
| Optional<ChatResponse> | get(String query) | Retrieves a cached response for a semantically similar query. |
| default Optional<ChatResponse> | get(String query, @Nullable String contextHash) | Retrieves a cached response for a semantically similar query, filtered by context. |
| VectorStore | getStore() | Returns the underlying vector store used by this cache implementation. |
| void | set(String query, ChatResponse response) | Stores a query and its corresponding chat response in the cache. |
| default void | set(String query, ChatResponse response, @Nullable String contextHash) | Stores a query and its corresponding chat response in the cache with an optional context identifier for isolation. |
| void | set(String query, ChatResponse response, Duration ttl) | Stores a query and response in the cache with a specified time-to-live duration. |

Method Details
- set
  
  void set(String query, ChatResponse response)
  
  Stores a query and its corresponding chat response in the cache. Implementations should handle vector embedding of the query and proper storage of both the query embedding and response.
  
  Parameters:
  
  query - The original query text to be cached
  
  response - The chat response associated with the query
- set
  
  default void set(String query, ChatResponse response, @Nullable String contextHash)
  
  Stores a query and its corresponding chat response in the cache with an optional context identifier for isolation. The context hash ensures that cached responses are only returned for queries with matching context (e.g., same system prompt).
  
  Parameters:
  
  query - The original query text to be cached
  
  response - The chat response associated with the query
  
  contextHash - Optional hash identifier for context isolation (e.g., system prompt hash). If null, behaves the same as set(String, ChatResponse).
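  The interface does not say how a contextHash should be computed. One possible approach, shown here as a minimal sketch, is to hash the system prompt with SHA-256 and use the hex digest as the identifier. The ContextHashes class and its sha256Hex method are hypothetical helpers introduced for illustration; they are not part of this API.

  ```java
  import java.nio.charset.StandardCharsets;
  import java.security.MessageDigest;
  import java.security.NoSuchAlgorithmException;

  /** Hypothetical helper for deriving a contextHash; the interface does not prescribe one. */
  final class ContextHashes {

      private ContextHashes() {
      }

      /** Returns the lowercase hex SHA-256 digest of the given system prompt (UTF-8 bytes). */
      static String sha256Hex(String systemPrompt) {
          try {
              MessageDigest digest = MessageDigest.getInstance("SHA-256");
              byte[] bytes = digest.digest(systemPrompt.getBytes(StandardCharsets.UTF_8));
              StringBuilder hex = new StringBuilder(bytes.length * 2);
              for (byte b : bytes) {
                  // %02x formats each byte as two unsigned hex digits
                  hex.append(String.format("%02x", b));
              }
              return hex.toString();
          } catch (NoSuchAlgorithmException e) {
              // SHA-256 is a required algorithm on every Java platform
              throw new IllegalStateException(e);
          }
      }
  }
  ```

  A caller could then pass the result as the third argument, for example set(query, response, ContextHashes.sha256Hex(systemPrompt)), and use the same value when calling get(String, String), so that entries stored under one system prompt are not returned under another.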
- set
  
  void set(String query, ChatResponse response, Duration ttl)
  
  Stores a query and response in the cache with a specified time-to-live duration. After the TTL expires, the entry should be automatically removed from the cache.
  
  Parameters:
  
  query - The original query text to be cached
  
  response - The chat response associated with the query
  
  ttl - The duration after which the cache entry should expire
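  To illustrate the expiry contract, here is a minimal sketch of TTL handling in a toy exact-match store. It is not how DefaultSemanticCache or any particular vector store expires entries: ToyTtlStore, its String responses, and the injected time source are all assumptions made for the example. The sketch evicts lazily on read and is not thread-safe; a backing store with native expiry would typically remove entries itself.

  ```java
  import java.time.Duration;
  import java.time.Instant;
  import java.util.HashMap;
  import java.util.Map;
  import java.util.Optional;
  import java.util.function.Supplier;

  /** Toy exact-match store illustrating TTL expiry (hypothetical, for illustration only). */
  final class ToyTtlStore {

      /** A stored response together with the instant at which it expires. */
      private static final class Timed {
          final String response;
          final Instant expiresAt;

          Timed(String response, Instant expiresAt) {
              this.response = response;
              this.expiresAt = expiresAt;
          }
      }

      private final Map<String, Timed> entries = new HashMap<>();
      private final Supplier<Instant> now;

      /** The time source is injected so that expiry can be tested without sleeping. */
      ToyTtlStore(Supplier<Instant> now) {
          this.now = now;
      }

      void set(String query, String response, Duration ttl) {
          entries.put(query, new Timed(response, now.get().plus(ttl)));
      }

      Optional<String> get(String query) {
          Timed timed = entries.get(query);
          if (timed == null) {
              return Optional.empty();
          }
          if (!now.get().isBefore(timed.expiresAt)) {
              // The TTL has elapsed: evict on read and report a miss
              entries.remove(query);
              return Optional.empty();
          }
          return Optional.of(timed.response);
      }

      int size() {
          return entries.size();
      }
  }
  ```

  In production code the time source would usually be Instant::now; injecting it keeps the expiry logic deterministic under test.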
- get
  
  Optional<ChatResponse> get(String query)
  
  Retrieves a cached response for a semantically similar query. The implementation should:
  
  1. Convert the input query to a vector embedding
  2. Search for similar query embeddings in the cache
  3. Return the response associated with the most similar query if it meets the similarity threshold
  
  Parameters:
  
  query - The query to find similar responses for
  
  Returns:
  
  Optional containing the most similar cached response if one is found and it meets the similarity threshold; empty Optional otherwise
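  The embed, search, and threshold steps described for get(String) can be sketched with a toy in-memory cache. Everything here is an assumption made for illustration: ToySimilarityCache is hypothetical, word counts stand in for a real embedding model, a linear scan stands in for a vector store search, and responses are plain strings rather than ChatResponse objects.

  ```java
  import java.util.ArrayList;
  import java.util.HashMap;
  import java.util.List;
  import java.util.Map;
  import java.util.Optional;

  /** Toy stand-in for a semantic cache lookup (hypothetical, for illustration only). */
  final class ToySimilarityCache {

      /** One cached entry: the query's embedding plus the stored response text. */
      private static final class Cached {
          final Map<String, Integer> embedding;
          final String response;

          Cached(Map<String, Integer> embedding, String response) {
              this.embedding = embedding;
              this.response = response;
          }
      }

      private final List<Cached> entries = new ArrayList<>();
      private final double threshold;

      ToySimilarityCache(double threshold) {
          this.threshold = threshold;
      }

      /** Hypothetical embedding: lowercase word counts instead of a model-generated vector. */
      static Map<String, Integer> embed(String text) {
          Map<String, Integer> counts = new HashMap<>();
          for (String word : text.toLowerCase().split("\\W+")) {
              if (!word.isEmpty()) {
                  counts.merge(word, 1, Integer::sum);
              }
          }
          return counts;
      }

      /** Cosine similarity between two sparse vectors; 0.0 if either vector is empty. */
      static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
          double dot = 0.0;
          for (Map.Entry<String, Integer> e : a.entrySet()) {
              dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
          }
          double norms = norm(a) * norm(b);
          return norms == 0.0 ? 0.0 : dot / norms;
      }

      private static double norm(Map<String, Integer> v) {
          double sum = 0.0;
          for (int x : v.values()) {
              sum += (double) x * x;
          }
          return Math.sqrt(sum);
      }

      void set(String query, String response) {
          entries.add(new Cached(embed(query), response));
      }

      Optional<String> get(String query) {
          Map<String, Integer> queryEmbedding = embed(query); // step 1: embed the query
          Cached best = null;
          double bestScore = -1.0;
          for (Cached entry : entries) {                      // step 2: find the most similar entry
              double score = cosine(queryEmbedding, entry.embedding);
              if (score > bestScore) {
                  bestScore = score;
                  best = entry;
              }
          }
          // step 3: return the best match only if it meets the threshold
          return (best != null && bestScore >= threshold) ? Optional.of(best.response) : Optional.empty();
      }
  }
  ```

  With a threshold of 0.8, a query that shares most of its words with a cached query is treated as a hit, while an unrelated query yields an empty Optional. The right threshold depends on the embedding model and is a tuning decision the interface leaves to the implementation.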
- get
  
  default Optional<ChatResponse> get(String query, @Nullable String contextHash)
  
  Retrieves a cached response for a semantically similar query, filtered by context. Only returns responses that were stored with the same context hash, ensuring isolation between different contexts (e.g., different system prompts).
  
  Parameters:
  
  query - The query to find similar responses for
  
  contextHash - Optional hash identifier for context filtering. If null, behaves the same as get(String).
  
  Returns:
  
  Optional containing the most similar cached response if one is found, it matches the context, and it meets the similarity threshold; empty Optional otherwise
- clear
  
  void clear()
  
  Removes all entries from the cache. This operation should be atomic and thread-safe.
- getStore
  
  VectorStore getStore()
  
  Returns the underlying vector store used by this cache implementation. This allows access to lower-level vector operations if needed.
  
  Returns:
  
  The VectorStore instance used by this cache
