Interface SemanticCache

All Known Implementing Classes:
DefaultSemanticCache

public interface SemanticCache
Defines the operations of a semantic cache that stores and retrieves chat responses based on the semantic similarity of queries. Similarity between queries is determined by comparing their vector embeddings.

The semantic cache provides functionality to:

  • Store chat responses with their associated queries
  • Retrieve responses for semantically similar queries
  • Support time-based expiration of cached entries
  • Clear the entire cache

Implementations should ensure thread-safety and proper resource management.
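As an illustration of how a caller might combine the lookup and store operations listed above, here is a minimal cache-aside sketch. It is not taken from the library: `ResponseCacheSketch`, `ExactMatchCacheSketch`, and `CachedChatSketch` are hypothetical names, `String` stands in for `ChatResponse`, the chat model is reduced to a plain function, and the placeholder cache matches queries exactly rather than semantically.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical, simplified mirror of the get/set pair; String stands in for ChatResponse.
interface ResponseCacheSketch {
    Optional<String> get(String query);
    void set(String query, String response);
}

// Placeholder cache that matches queries exactly. A semantic cache would
// compare embeddings instead of map keys.
final class ExactMatchCacheSketch implements ResponseCacheSketch {
    private final Map<String, String> entries = new HashMap<>();

    @Override
    public Optional<String> get(String query) {
        return Optional.ofNullable(entries.get(query));
    }

    @Override
    public void set(String query, String response) {
        entries.put(query, response);
    }
}

// Cache-aside flow: consult the cache first, call the model only on a miss.
final class CachedChatSketch {
    private final ResponseCacheSketch cache;
    private final Function<String, String> model; // stand-in for a chat model call
    int modelCalls = 0;                           // exposed only to make the flow observable

    CachedChatSketch(ResponseCacheSketch cache, Function<String, String> model) {
        this.cache = cache;
        this.model = model;
    }

    String ask(String query) {
        Optional<String> hit = cache.get(query);
        if (hit.isPresent()) {
            return hit.get();          // cache hit: the model is not called
        }
        modelCalls++;
        String answer = model.apply(query);
        cache.set(query, answer);      // store the answer for later similar queries
        return answer;
    }
}
```

With this shape, a repeated question reaches the model only once; the second call is served from the cache.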

Author:
Brian Sam-Bodden
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    clear()
    Removes all entries from the cache.
    Optional<ChatResponse>
    get(String query)
    Retrieves a cached response for a semantically similar query.
    VectorStore
    getStore()
    Returns the underlying vector store used by this cache implementation.
    void
    set(String query, ChatResponse response)
    Stores a query and its corresponding chat response in the cache.
    void
    set(String query, ChatResponse response, Duration ttl)
    Stores a query and response in the cache with a specified time-to-live duration.
  • Method Details

    • set

      void set(String query, ChatResponse response)
      Stores a query and its corresponding chat response in the cache. Implementations should handle vector embedding of the query and proper storage of both the query embedding and response.
      Parameters:
      query - The original query text to be cached
      response - The chat response associated with the query
    • set

      void set(String query, ChatResponse response, Duration ttl)
      Stores a query and response in the cache with a specified time-to-live duration. After the TTL expires, the entry should be automatically removed from the cache.
      Parameters:
      query - The original query text to be cached
      response - The chat response associated with the query
      ttl - The duration after which the cache entry should expire
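One way to picture the time-to-live contract is the in-memory sketch below. It is an assumption-laden illustration, not the library's mechanism: `TtlCacheSketch` is a hypothetical class, `String` stands in for `ChatResponse`, lookups are exact-match to keep the focus on expiry, and expired entries are evicted lazily on read. A backing store with native key expiry could handle removal itself.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical in-memory sketch of TTL handling; not a Spring AI class.
final class TtlCacheSketch {

    // A stored response together with the instant at which it stops being valid.
    record TimedEntry(String response, Instant expiresAt) {}

    private final Map<String, TimedEntry> entries = new ConcurrentHashMap<>();
    private final Supplier<Instant> now; // injectable time source so expiry can be tested

    TtlCacheSketch(Supplier<Instant> now) {
        this.now = now;
    }

    // No TTL: the entry never expires in this sketch.
    void set(String query, String response) {
        entries.put(query, new TimedEntry(response, Instant.MAX));
    }

    // With TTL: the entry expires once the given duration has elapsed.
    void set(String query, String response, Duration ttl) {
        entries.put(query, new TimedEntry(response, now.get().plus(ttl)));
    }

    Optional<String> get(String query) {
        TimedEntry entry = entries.get(query);
        if (entry == null) {
            return Optional.empty();
        }
        if (!now.get().isBefore(entry.expiresAt())) {
            // Lazy eviction: remove only if the same expired entry is still mapped.
            entries.remove(query, entry);
            return Optional.empty();
        }
        return Optional.of(entry.response());
    }
}
```

Injecting the time source keeps the expiry rule deterministic under test; production code could pass `Instant::now`.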
    • get

      Optional<ChatResponse> get(String query)
      Retrieves a cached response for a semantically similar query. The implementation should:
      • Convert the input query to a vector embedding
      • Search for similar query embeddings in the cache
      • Return the response associated with the most similar query if it meets the similarity threshold
      Parameters:
      query - The query to find similar responses for
      Returns:
      An Optional containing the most similar cached response if one is found and it meets the similarity threshold; an empty Optional otherwise
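The three lookup steps can be sketched with a toy in-memory version. Everything here is an assumption made for illustration: `SimilarityLookupSketch` is a hypothetical class, `String` stands in for `ChatResponse`, the "embedding" is a simple letter-count vector rather than a real embedding model, the search is a linear scan rather than a vector store query, and cosine similarity is used as the similarity measure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of embed -> search -> threshold check; not a Spring AI class.
final class SimilarityLookupSketch {

    // One cached entry: the query's embedding plus the stored response text.
    record Entry(double[] embedding, String response) {}

    private final List<Entry> entries = new ArrayList<>();
    private final double threshold; // minimum cosine similarity for a hit

    SimilarityLookupSketch(double threshold) {
        this.threshold = threshold;
    }

    // Toy embedding: counts of the letters a..z. A real cache would call an embedding model.
    static double[] embed(String text) {
        double[] vector = new double[26];
        for (char c : text.toLowerCase().toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                vector[c - 'a']++;
            }
        }
        return vector;
    }

    // Cosine similarity; returns 0 when either vector has zero length.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    void set(String query, String response) {
        entries.add(new Entry(embed(query), response));
    }

    Optional<String> get(String query) {
        double[] queryEmbedding = embed(query);          // step 1: embed the query
        Entry best = null;
        double bestScore = -1;
        for (Entry entry : entries) {                    // step 2: find the most similar entry
            double score = cosine(queryEmbedding, entry.embedding());
            if (score > bestScore) {
                bestScore = score;
                best = entry;
            }
        }
        return (best != null && bestScore >= threshold)  // step 3: apply the threshold
                ? Optional.of(best.response())
                : Optional.empty();
    }
}
```

Returning an empty `Optional` for a below-threshold match, rather than the nearest entry regardless of score, is what keeps unrelated queries from receiving a cached answer.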
    • clear

      void clear()
      Removes all entries from the cache. This operation should be atomic and thread-safe.
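For an in-memory store, one possible way to make clearing atomic with respect to concurrent readers and writers is a read-write lock, sketched below. `AtomicClearSketch` is a hypothetical class and the entries are plain strings; an implementation backed by an external vector store would instead rely on whatever guarantees that store offers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of a thread-safe clear; not a Spring AI class.
final class AtomicClearSketch {
    private final List<String> entries = new ArrayList<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    void add(String entry) {
        lock.writeLock().lock();
        try {
            entries.add(entry);
        } finally {
            lock.writeLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock(); // many readers may hold the read lock at once
        try {
            return entries.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    void clear() {
        lock.writeLock().lock(); // exclusive: no reader sees a half-cleared list
        try {
            entries.clear();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```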
    • getStore

      VectorStore getStore()
      Returns the underlying vector store used by this cache implementation. This allows access to lower-level vector operations if needed.
      Returns:
      The VectorStore instance used by this cache