Interface SemanticCache

All Known Implementing Classes:
DefaultSemanticCache

public interface SemanticCache
Defines the operations of a semantic cache that stores chat responses and retrieves them by the semantic similarity of queries. Similarity between queries is determined with vector embeddings.

The semantic cache provides functionality to:

  • Store chat responses with their associated queries
  • Retrieve responses for semantically similar queries
  • Support time-based expiration of cached entries
  • Support context-based isolation (e.g., different system prompts)
  • Clear the entire cache

Implementations should ensure thread-safety and proper resource management.

Author:
Brian Sam-Bodden, Soby Chacko
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    clear()
    Removes all entries from the cache.
    Optional<ChatResponse>
    get(String query)
    Retrieves a cached response for a semantically similar query.
    default Optional<ChatResponse>
    get(String query, @Nullable String contextHash)
    Retrieves a cached response for a semantically similar query, filtered by context.
    VectorStore
    getStore()
    Returns the underlying vector store used by this cache implementation.
    void
    set(String query, ChatResponse response)
    Stores a query and its corresponding chat response in the cache.
    default void
    set(String query, ChatResponse response, @Nullable String contextHash)
    Stores a query and its corresponding chat response in the cache with an optional context identifier for isolation.
    void
    set(String query, ChatResponse response, Duration ttl)
    Stores a query and response in the cache with a specified time-to-live duration.
  • Method Details

    • set

      void set(String query, ChatResponse response)
      Stores a query and its corresponding chat response in the cache. Implementations should handle vector embedding of the query and proper storage of both the query embedding and response.
      Parameters:
      query - The original query text to be cached
      response - The chat response associated with the query
    • set

      default void set(String query, ChatResponse response, @Nullable String contextHash)
      Stores a query and its corresponding chat response in the cache with an optional context identifier for isolation. The context hash ensures that cached responses are only returned for queries with matching context (e.g., same system prompt).
      Parameters:
      query - The original query text to be cached
      response - The chat response associated with the query
      contextHash - Optional hash identifier for context isolation (e.g., system prompt hash). If null, behaves the same as set(String, ChatResponse).
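      The interface does not say how a context hash is produced. One way to derive it, sketched below, is a SHA-256 digest of the system prompt rendered as hex. The ContextHashes class and its of method are hypothetical helpers for illustration and are not part of this API; any stable, collision-resistant hash would serve the same purpose.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of SemanticCache: derives a context hash from a system prompt.
final class ContextHashes {

    private ContextHashes() {
    }

    // Returns the SHA-256 digest of the prompt as lowercase hex, or null when there is no prompt.
    static String of(String systemPrompt) {
        if (systemPrompt == null) {
            return null; // a null contextHash means no context isolation is requested
        }
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] bytes = digest.digest(systemPrompt.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(bytes.length * 2);
            for (byte b : bytes) {
                hex.append(String.format("%02x", b)); // two hex digits per byte
            }
            return hex.toString();
        }
        catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

      The same value would then be passed as contextHash to both set(String, ChatResponse, String) and get(String, String), so that entries stored under one system prompt are not returned under another.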
    • set

      void set(String query, ChatResponse response, Duration ttl)
      Stores a query and response in the cache with a specified time-to-live duration. After the TTL expires, the entry should be automatically removed from the cache.
      Parameters:
      query - The original query text to be cached
      response - The chat response associated with the query
      ttl - The duration after which the cache entry should expire
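      To make the expiry contract concrete, here is a minimal in-memory sketch of TTL handling. It is not how any particular implementation works: TtlSketch is a hypothetical class, it uses exact-match keys and String responses instead of embeddings and ChatResponse, and it evicts lazily on read. An implementation backed by a store with native key expiry would more likely delegate to that.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch of TTL handling; exact-match keys and String responses keep it self-contained.
final class TtlSketch {

    private static final class Entry {
        final String response;
        final Instant expiresAt;

        Entry(String response, Instant expiresAt) {
            this.response = response;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final Supplier<Instant> now; // injectable time source, e.g. Instant::now

    TtlSketch(Supplier<Instant> now) {
        this.now = now;
    }

    // Stores the response together with the instant at which it expires.
    void set(String query, String response, Duration ttl) {
        entries.put(query, new Entry(response, now.get().plus(ttl)));
    }

    // Returns the response only while the entry is still within its TTL.
    Optional<String> get(String query) {
        Entry entry = entries.get(query);
        if (entry == null) {
            return Optional.empty();
        }
        if (!now.get().isBefore(entry.expiresAt)) {
            entries.remove(query, entry); // lazy eviction of the expired entry
            return Optional.empty();
        }
        return Optional.of(entry.response);
    }
}
```

      With a production clock the sketch would be constructed as new TtlSketch(Instant::now); injecting the time source only makes the expiry behavior easy to exercise.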
    • get

      Optional<ChatResponse> get(String query)
      Retrieves a cached response for a semantically similar query. The implementation should:
      • Convert the input query to a vector embedding
      • Search for similar query embeddings in the cache
      • Return the response associated with the most similar query if it meets the similarity threshold
      Parameters:
      query - The query to find similar responses for
      Returns:
      An Optional containing the most similar cached response if one is found and it meets the similarity threshold; otherwise an empty Optional
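      The three lookup steps above (embed, search, apply the threshold) can be sketched in a few lines. Everything here is illustrative: LookupSketch is a hypothetical class, a bag-of-words count vector stands in for a real embedding model, String stands in for ChatResponse, and a linear scan stands in for a vector store query.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the lookup steps; toy embeddings, not a real embedding model.
final class LookupSketch {

    private static final class CachedItem {
        final Map<String, Integer> embedding;
        final String response;

        CachedItem(Map<String, Integer> embedding, String response) {
            this.embedding = embedding;
            this.response = response;
        }
    }

    private final List<CachedItem> items = new ArrayList<>();
    private final double similarityThreshold;

    LookupSketch(double similarityThreshold) {
        this.similarityThreshold = similarityThreshold;
    }

    void set(String query, String response) {
        items.add(new CachedItem(embed(query), response));
    }

    Optional<String> get(String query) {
        Map<String, Integer> queryEmbedding = embed(query); // step 1: embed the query
        CachedItem best = null;
        double bestScore = -1.0;
        for (CachedItem item : items) {                      // step 2: find the most similar entry
            double score = cosine(queryEmbedding, item.embedding);
            if (score > bestScore) {
                bestScore = score;
                best = item;
            }
        }
        // step 3: return the best match only if it meets the threshold
        if (best != null && bestScore >= similarityThreshold) {
            return Optional.of(best.response);
        }
        return Optional.empty();
    }

    // Toy embedding: lowercase word counts.
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine similarity between two sparse count vectors; 0.0 if either is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

      With a threshold of 0.8, a query that differs from a stored one only in case and punctuation is a hit, while an unrelated question yields an empty Optional. A real embedding model would also match paraphrases, which this toy vector cannot.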
    • get

      default Optional<ChatResponse> get(String query, @Nullable String contextHash)
      Retrieves a cached response for a semantically similar query, filtered by context. Only returns responses that were stored with the same context hash, ensuring isolation between different contexts (e.g., different system prompts).
      Parameters:
      query - The query to find similar responses for
      contextHash - Optional hash identifier for context filtering. If null, behaves the same as get(String).
      Returns:
      An Optional containing the most similar cached response if one is found, it matches the context, and it meets the similarity threshold; otherwise an empty Optional
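      Context isolation can be pictured as a filter applied before the similarity match. The sketch below is a simplified illustration under stated assumptions: ContextIsolationSketch is a hypothetical class, exact query matching stands in for similarity search, String stands in for ChatResponse, and a null hash is assumed to match only entries stored without a context (the interface only says that null behaves like get(String)).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Optional;

// Hypothetical sketch of context isolation; exact matching replaces similarity search.
final class ContextIsolationSketch {

    private static final class CachedItem {
        final String query;
        final String contextHash; // may be null when no context was supplied
        final String response;

        CachedItem(String query, String contextHash, String response) {
            this.query = query;
            this.contextHash = contextHash;
            this.response = response;
        }
    }

    private final List<CachedItem> items = new ArrayList<>();

    void set(String query, String response, String contextHash) {
        items.add(new CachedItem(query, contextHash, response));
    }

    Optional<String> get(String query, String contextHash) {
        return items.stream()
            // Assumption: a null hash only matches entries stored without a context.
            .filter(item -> Objects.equals(item.contextHash, contextHash))
            .filter(item -> item.query.equals(query))
            .map(item -> item.response)
            .findFirst();
    }
}
```

      Storing the same query under two different hashes yields two independent entries, and a lookup under a third hash returns an empty Optional even though the query text matches.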
    • clear

      void clear()
      Removes all entries from the cache. This operation should be atomic and thread-safe.
    • getStore

      VectorStore getStore()
      Returns the underlying vector store used by this cache implementation. This allows access to lower-level vector operations if needed.
      Returns:
      The VectorStore instance used by this cache