Enum Class BedrockCacheStrategy
- All Implemented Interfaces:
Serializable, Comparable<BedrockCacheStrategy>, Constable
Prompt caching reduces latency and costs by reusing previously processed prompt content. Cached content has a 5-minute Time To Live (TTL) that resets with each cache hit.
- Since:
- 1.1.0
- Author:
- Soby Chacko
Nested Class Summary
Nested classes/interfaces inherited from class Enum
Enum.EnumDesc<E> -
Enum Constant Summary
Enum Constants
- CONVERSATION_HISTORY: Cache the entire conversation history up to and including the current user question.
- NONE: No caching (default behavior).
- SYSTEM_AND_TOOLS: Cache both tool definitions and system instructions.
- SYSTEM_ONLY: Cache system instructions only.
- TOOLS_ONLY: Cache tool definitions only.
-
Method Summary
- static BedrockCacheStrategy valueOf(String name): Returns the enum constant of this class with the specified name.
- static BedrockCacheStrategy[] values(): Returns an array containing the constants of this enum class, in the order they are declared.
-
Enum Constant Details
-
NONE
No caching (default behavior). All content is processed fresh on each request.
Use this when:
- Requests are one-off or highly variable
- Content doesn't meet minimum token requirements (1024+ tokens for most models)
- You want to avoid caching overhead
-
SYSTEM_ONLY
Cache system instructions only. Places a cache breakpoint on the system message content. Tools are cached implicitly via Bedrock's automatic ~20-block lookback mechanism (content before the cache breakpoint is included in the cache).
Use this when:
- System prompts are large and stable (1024+ tokens)
- Tool definitions are relatively small (<20 tools)
- You want simple, single-breakpoint caching
Note: Changing tools will invalidate the cache since tools are part of the cache prefix (they appear before system in the request hierarchy).
This is the recommended starting point for most use cases as it provides the best balance of simplicity and effectiveness.
-
TOOLS_ONLY
Cache tool definitions only. Places a cache breakpoint after the last tool definition. System messages and conversation history are not cached.
Use this when:
- You have many tool definitions (20+ tools, 1024+ tokens total)
- Tools are stable but system prompts change frequently
- You want to cache tool schemas without caching system instructions
Important Model Compatibility:
- Supported: Claude 3.x and Claude 4.x models (all variants)
- Not Supported: Amazon Nova models (Nova Micro, Lite, Pro, Premier) - these models only support caching for system and messages, not tools
If you use this strategy with an unsupported model, AWS will return a ValidationException. Use SYSTEM_ONLY instead for Amazon Nova models.
Note: If no tools are present in the request, this strategy is equivalent to NONE (no caching occurs).
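The compatibility note above can be turned into a small guard that downgrades tool-caching strategies before a request is sent. The following is a minimal sketch, not part of the library: the enum is a local stand-in that mirrors the documented constants, `NovaSafeStrategy` is a hypothetical helper, and the check assumes that Amazon Nova model IDs contain the substring `amazon.nova`.

```java
import java.util.Locale;

// Local stand-in mirroring the documented constants; a real project would use the library's enum.
enum BedrockCacheStrategy { NONE, SYSTEM_ONLY, TOOLS_ONLY, SYSTEM_AND_TOOLS, CONVERSATION_HISTORY }

// Hypothetical helper, not part of the documented API.
final class NovaSafeStrategy {

    private NovaSafeStrategy() {}

    // Assumption: Amazon Nova model IDs (and inference profile IDs) contain "amazon.nova".
    static boolean isNova(String modelId) {
        return modelId != null && modelId.toLowerCase(Locale.ROOT).contains("amazon.nova");
    }

    // Downgrades strategies that cache tools to SYSTEM_ONLY for Nova models;
    // every other combination is returned unchanged.
    static BedrockCacheStrategy forModel(String modelId, BedrockCacheStrategy requested) {
        boolean cachesTools = requested == BedrockCacheStrategy.TOOLS_ONLY
                || requested == BedrockCacheStrategy.SYSTEM_AND_TOOLS;
        return (isNova(modelId) && cachesTools) ? BedrockCacheStrategy.SYSTEM_ONLY : requested;
    }
}
```

Calling `NovaSafeStrategy.forModel(modelId, requested)` before building the request avoids the ValidationException described above, at the cost of losing the explicit tool breakpoint on Nova models.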
-
SYSTEM_AND_TOOLS
Cache both tool definitions and system instructions. Places two cache breakpoints: one after the last tool definition, and one after the last system message.
Use this when:
- Both tools and system prompts are large and stable (1024+ tokens each)
- You want maximum cache coverage
- You're willing to use 2 of your 4 available cache breakpoints
Important Model Compatibility:
- Supported: Claude 3.x and Claude 4.x models (all variants)
- Not Supported: Amazon Nova models (Nova Micro, Lite, Pro, Premier) - these models only support caching for system and messages, not tools
If you use this strategy with an unsupported model, AWS will return a ValidationException. Use SYSTEM_ONLY instead for Amazon Nova models.
Cache Invalidation:
- Changing tools invalidates both cache breakpoints (tools are the prefix)
- Changing system prompts only invalidates the system cache (tools remain cached)
This provides the most comprehensive caching but uses more cache breakpoints.
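The invalidation rules follow from the fact that each breakpoint caches everything before it, with tools preceding the system prompt. The sketch below models that idea with plain string keys; it is an illustration only and says nothing about how Bedrock actually derives cache keys. `PrefixCacheSketch` and its methods are hypothetical names.

```java
import java.util.List;

// Hypothetical model of prefix-based caching; not how Bedrock computes keys internally.
final class PrefixCacheSketch {

    private PrefixCacheSketch() {}

    // Key for the first breakpoint: covers the tool definitions only.
    static String toolsKey(List<String> toolDefinitions) {
        return String.join("\n", toolDefinitions);
    }

    // Key for the second breakpoint: covers the tools followed by the system prompt,
    // because tools come before system in the request hierarchy.
    static String systemKey(List<String> toolDefinitions, String systemPrompt) {
        return toolsKey(toolDefinitions) + "\n--system--\n" + systemPrompt;
    }
}
```

In this model, editing the system prompt changes only `systemKey`, while editing any tool definition changes both keys, which matches the two invalidation rules listed above.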
-
CONVERSATION_HISTORY
Cache the entire conversation history up to and including the current user question. This is ideal for multi-turn conversations where you want to reuse the conversation context while asking new questions.
A cache breakpoint is placed on the last user message in the conversation. This enables incremental caching where each conversation turn builds on the previous cached prefix, providing significant cost savings and performance improvements.
Use this when:
- Building multi-turn conversational applications (chatbots, assistants)
- Conversation history is substantial (1024+ tokens)
- Users are asking follow-up questions that require context from earlier messages
- You want to reduce latency and costs for ongoing conversations
Model Compatibility:
- Verified: Claude 3.x and Claude 4.x models (all variants)
- Note: Amazon Nova models theoretically support conversation caching, but have not been verified in integration tests
How it works:
- Identifies the last user message in the conversation
- Places cache breakpoint as the last content block on that message
- All messages up to and including the last user message are cached (system, previous user/assistant turns, and current user question)
- On the next turn, the cached context is reused and a new cache is created including the assistant response and new user question
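The steps above can be sketched as a search for the last user message, with everything up to and including it forming the cached prefix. This is a simplified illustration: the `Message` record and the `LastUserMessageSketch` class are hypothetical, and the sketch models only the message list (the system content that the real prefix also covers is left out).

```java
import java.util.List;

// Hypothetical sketch of where the CONVERSATION_HISTORY breakpoint lands.
final class LastUserMessageSketch {

    // Minimal stand-in for a conversation message; the real request model is richer.
    record Message(String role, String text) {}

    private LastUserMessageSketch() {}

    // Returns the index of the last message with role "user", or -1 if there is none.
    static int lastUserIndex(List<Message> messages) {
        for (int i = messages.size() - 1; i >= 0; i--) {
            if ("user".equals(messages.get(i).role())) {
                return i;
            }
        }
        return -1;
    }

    // Messages at indices 0..lastUserIndex (inclusive) make up the cached prefix.
    static List<Message> cachedPrefix(List<Message> messages) {
        int index = lastUserIndex(messages);
        return index < 0 ? List.of() : messages.subList(0, index + 1);
    }
}
```

On the following turn the list grows by one assistant message and one user message, so the new prefix extends the previous one, which is what makes the caching incremental.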
Example conversation flow:
Turn 1: "My name is Alice" → Response cached
Turn 2: "I work as a data scientist" → Response cached
Turn 3: "What career advice would you give me?" ← Cache applies here (Turns 1-2 are read from cache, Turn 3 question is fresh)
Cache behavior:
- First request: Creates cache (cacheWriteInputTokens > 0)
- Subsequent requests: Reads from cache (cacheReadInputTokens > 0)
- Cache TTL: 5 minutes (resets on each cache hit)
- Minimum content: 1024+ tokens required for caching to activate
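The two counters named above are enough to tell whether a given request created, reused, or bypassed the cache. The following sketch classifies a request from those values; `CacheUsageSketch` and `Outcome` are hypothetical names, and how the counters are read from the response is deliberately left out because it depends on the client in use.

```java
// Hypothetical classifier for the cache counters described above.
final class CacheUsageSketch {

    enum Outcome { NOT_CACHED, CACHE_WRITTEN, CACHE_READ }

    private CacheUsageSketch() {}

    // A read takes precedence: with incremental caching a request may read the
    // existing prefix and also write a longer one in the same call.
    static Outcome classify(long cacheReadInputTokens, long cacheWriteInputTokens) {
        if (cacheReadInputTokens > 0) {
            return Outcome.CACHE_READ;
        }
        if (cacheWriteInputTokens > 0) {
            return Outcome.CACHE_WRITTEN;
        }
        return Outcome.NOT_CACHED;
    }
}
```

A run of `NOT_CACHED` results usually points at one of the conditions listed above: content below the minimum token count, or requests spaced further apart than the TTL.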
-
-
Method Details
-
values
Returns an array containing the constants of this enum class, in the order they are declared.
- Returns:
- an array containing the constants of this enum class, in the order they are declared
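A short usage sketch for values(): the enum below is a local stand-in so that the example compiles on its own, and its declaration order is assumed to match the order of the Enum Constant Details section. `ValuesExample` is a hypothetical helper.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Local stand-in; declaration order is assumed to follow the Enum Constant Details section.
enum BedrockCacheStrategy { NONE, SYSTEM_ONLY, TOOLS_ONLY, SYSTEM_AND_TOOLS, CONVERSATION_HISTORY }

final class ValuesExample {

    private ValuesExample() {}

    // Collects the constant names in declaration order, e.g. to populate a configuration drop-down.
    static List<String> names() {
        return Arrays.stream(BedrockCacheStrategy.values())
                .map(Enum::name)
                .collect(Collectors.toList());
    }
}
```

Each call to values() returns a fresh array, so callers can modify the result without affecting other callers.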
-
valueOf
Returns the enum constant of this class with the specified name. The string must match exactly an identifier used to declare an enum constant in this class. (Extraneous whitespace characters are not permitted.)
- Parameters:
name - the name of the enum constant to be returned.
- Returns:
- the enum constant with the specified name
- Throws:
IllegalArgumentException - if this enum class has no constant with the specified name
NullPointerException - if the argument is null
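Because valueOf requires an exact match and throws on bad input, configuration values read from properties or environment variables are often normalized first. The sketch below shows one way to do that; the enum is a local stand-in for the documented one, and `StrategyParser` is a hypothetical helper rather than part of the API.

```java
import java.util.Locale;
import java.util.Optional;

// Local stand-in mirroring the documented constants.
enum BedrockCacheStrategy { NONE, SYSTEM_ONLY, TOOLS_ONLY, SYSTEM_AND_TOOLS, CONVERSATION_HISTORY }

// Hypothetical lenient parser built on top of valueOf.
final class StrategyParser {

    private StrategyParser() {}

    // Trims and upper-cases the input before calling valueOf; returns empty
    // instead of throwing when the input is null or names no constant.
    static Optional<BedrockCacheStrategy> parse(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(BedrockCacheStrategy.valueOf(raw.trim().toUpperCase(Locale.ROOT)));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }
}
```

Whether to fall back to NONE or fail fast on an empty result is an application decision; silently disabling caching hides configuration typos, so logging the rejected value is usually worthwhile.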
-