This version is still in development and is not considered stable yet. For the latest stable version, please use Spring AI 1.1.6!
Migrating the Anthropic Module to the Official Java SDK
In 2.0.0-M3, spring-ai-anthropic was rewritten on top of the official com.anthropic:anthropic-java SDK, replacing the hand-rolled RestClient / WebClient implementation. The previous AnthropicApi class was a 2,300-line file with 47 nested DTO records.
The new module is a thin adapter over the SDK rather than a parallel API. Spring AI’s value is in its own abstractions — ChatModel, ChatClient, advisors, observability, auto-config — and in features that work across providers. Where the SDK already covers a concern (cache-control modeling, streaming, rate-limit handling), the previous module’s wrapper was dropped instead of carried forward, so applications use the SDK’s type directly. This keeps the surface area small and avoids drift as Anthropic ships new SDK releases.
The Maven coordinates, the spring-ai-starter-model-anthropic Boot starter, and the spring.ai.anthropic.* configuration properties are all unchanged.
The ChatClient API is unchanged.
ChatModel.call(Prompt) and ChatModel.stream(Prompt) keep their signatures.
AnthropicChatOptions keeps all existing fields, adding new ones for skills, web search, service tier, inference geo, and structured output.
What Changed
| Area | Change |
|---|---|
| | Public constructors removed. Use |
| | Removed. For direct API access, use the SDK’s |
| | Moved out of |
| | Renamed to |
| | Removed without direct replacement. |
| Default | Changed from |
| Transitive | New, pulled in by |
If You Use Only ChatClient or ChatModel
If your code looks like this, no migration is needed:
```java
@Autowired ChatClient.Builder builder;

String response = builder.build()
    .prompt("Tell me a joke")
    .call()
    .content();
```
The auto-configuration produces an AnthropicChatModel bean wired to the new SDK-based implementation. Calling code is unaffected.
The one behavioral change to watch for is the new default maxTokens value (see Default maxTokens is now 4096).
If You Construct AnthropicChatModel Programmatically
Replace direct constructor usage with the builder:
```java
// Before
AnthropicApi anthropicApi = new AnthropicApi(apiKey);
AnthropicChatModel chatModel = new AnthropicChatModel(anthropicApi,
    AnthropicChatOptions.builder().model("claude-haiku-4-5").maxTokens(2048).build(),
    retryTemplate,
    toolCallingManager);

// After
AnthropicChatModel chatModel = AnthropicChatModel.builder()
    .apiKey(apiKey)
    .defaultOptions(AnthropicChatOptions.builder()
        .model("claude-haiku-4-5")
        .maxTokens(2048)
        .build())
    .toolCallingManager(toolCallingManager)
    .build();
```
The builder also accepts baseUrl, timeout, maxRetries, proxy, customHeaders, observationRegistry, and observationConvention.
There is no retryTemplate builder method; retry is now handled by the SDK (see Retry uses SDK maxRetries, not RetryTemplate).
If You Imported Cache or Citation Types
The cache and citation helper classes moved out of the api (and api.utils) subpackage into the root org.springframework.ai.anthropic package.
Update imports as follows:
| Old import | New import |
|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Enum values of AnthropicCacheStrategy (NONE, TOOLS_ONLY, SYSTEM_ONLY, SYSTEM_AND_TOOLS, CONVERSATION_HISTORY) and AnthropicCacheTtl (FIVE_MINUTES, ONE_HOUR) are unchanged.
The plainText(…), pdf(…), and customContent(…) factory methods still exist on the renamed AnthropicCitationDocument.
If You Used AnthropicApi Directly
AnthropicApi, its nested DTO records, and AnthropicCacheType are gone. Before reaching for the SDK client, consider whether AnthropicChatModel covers what you were doing. It usually does, and you keep the framework integration.
AnthropicChatModel adds the following on top of a raw AnthropicClient:
- Provider-neutral request and response types (Prompt, ChatResponse, Generation, Usage), so application code doesn’t depend on com.anthropic.*.
- Tool calling integrated with ToolCallback and the ToolCallingManager loop, including automatic multi-turn execution.
- Streaming as a Reactor Flux<ChatResponse> rather than the SDK’s callback-based AsyncStreamResponse.
- Prompt caching with strategy modeling, TTL controls, and 4-breakpoint enforcement (AnthropicCacheOptions).
- Citation with four location variants surfaced under ChatResponseMetadata.
- Skills, built-in web search, service tier, inference geo, and structured output as AnthropicChatOptions fields, all bindable from spring.ai.anthropic.chat.*.
- Spring Boot auto-config and Micrometer observation.
- The ChatClient pipeline above the model (advisors, message templates, RAG, conversation memory, structured-output converter).
- Provider portability: the same ChatClient code runs against OpenAI, Bedrock, Google GenAI, and others.
For typical chat usage, switch to AnthropicChatModel.builder() (see If You Construct AnthropicChatModel Programmatically):
```java
// Before
AnthropicApi api = new AnthropicApi(apiKey);
AnthropicApi.ChatCompletionRequest req = new AnthropicApi.ChatCompletionRequest(
    AnthropicApi.ChatModel.CLAUDE_HAIKU_4_5.getValue(),
    List.of(new AnthropicApi.AnthropicMessage(List.of(new AnthropicApi.ContentBlock("hello")), AnthropicApi.Role.USER)),
    null, 1024, null, 0.7, null, null, null, null, false);
ResponseEntity<AnthropicApi.ChatCompletionResponse> resp = api.chatCompletionEntity(req);

// After
AnthropicChatModel chatModel = AnthropicChatModel.builder()
    .apiKey(apiKey)
    .defaultOptions(AnthropicChatOptions.builder()
        .model("claude-haiku-4-5")
        .maxTokens(1024)
        .temperature(0.7)
        .build())
    .build();
ChatResponse response = chatModel.call(new Prompt("hello"));
```
If you genuinely need an Anthropic API surface that AnthropicChatModel doesn’t expose (a beta endpoint, files API, custom skills CRUD), drop to the SDK client. You’re outside the framework at that point — no observability, no provider neutrality, no ChatClient pipeline:
```java
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.messages.MessageCreateParams;
import com.anthropic.models.messages.Model;

AnthropicClient client = AnthropicOkHttpClient.builder().apiKey(apiKey).build();
MessageCreateParams params = MessageCreateParams.builder()
    .model(Model.CLAUDE_HAIKU_4_5)
    .maxTokens(1024)
    .temperature(0.7)
    .addUserMessage("hello")
    .build();
// Fully qualified to avoid a clash with Spring AI's own Message type.
com.anthropic.models.messages.Message message = client.messages().create(params);
```
The hand-rolled record types have direct analogues in the SDK under com.anthropic.models.messages.*:
| Removed type (old AnthropicApi.*) | SDK replacement (com.anthropic.models.messages.*) |
|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
ToolUseBlock.input() returns the SDK’s JsonValue, not a JSON string. Calling .toString() on a JsonValue produces Java map syntax ({key=value}) that looks like JSON but is not. Walk it with JsonValue.Visitor<T> or serialize via Jackson to get a real JSON string.
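The difference between map syntax and JSON is easy to miss in logs. The following is a minimal sketch that uses a plain java.util.Map as a stand-in for the tool input (it does not touch the SDK’s JsonValue), with a hypothetical toJson helper written only for illustration. In real code, a JSON library such as Jackson would do the serialization.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Illustration only: a java.util.Map stands in for the tool input. */
final class MapToStringVsJson {

    /** Hypothetical minimal serializer; does not escape control characters. */
    static String toJson(Object value) {
        if (value instanceof Map<?, ?> map) {
            return map.entrySet().stream()
                    .map(e -> quote(String.valueOf(e.getKey())) + ":" + toJson(e.getValue()))
                    .collect(Collectors.joining(",", "{", "}"));
        }
        if (value instanceof List<?> list) {
            return list.stream()
                    .map(MapToStringVsJson::toJson)
                    .collect(Collectors.joining(",", "[", "]"));
        }
        if (value instanceof String s) {
            return quote(s);
        }
        // Numbers, booleans, and null print as their JSON literals.
        return String.valueOf(value);
    }

    /** Wraps a string in double quotes, escaping backslashes and quotes. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("city", "Paris");
        input.put("days", 3);
        System.out.println(input);         // map syntax: keys and strings are unquoted
        System.out.println(toJson(input)); // JSON: keys and strings are quoted
    }
}
```

A JSON parser rejects the first form because neither the keys nor the string values are quoted, which is why the string has to come from a serializer rather than from toString().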
Removed in Favor of SDK Equivalents
These types were dropped because the SDK already exposes the same concept. Use the SDK type directly.
| Removed | Use instead |
|---|---|
|
|
|
|
|
|
Behavior Changes
Prompt-level options no longer merge with model defaults
The previous module merged prompt-level AnthropicChatOptions into the model-level defaults via ModelOptionsUtils.copyToTarget(…) and ModelOptionsUtils.merge(…), so model, temperature, and any other unset fields were filled in from the model’s defaults:
```java
// Before: only maxTokens set; model + temperature inherited from defaults.
Prompt prompt = new Prompt(
    "Tell me a joke",
    AnthropicChatOptions.builder().maxTokens(2048).build());
chatModel.call(prompt);
```
The new module does not merge. A prompt-level options instance is used as-is; if the prompt has no options, the model defaults are used.
```java
// After: prompt-level options must be "full", or null.
Prompt prompt = new Prompt(
    "Tell me a joke",
    chatModel.getDefaultOptions().mutate().maxTokens(2048).build());
chatModel.call(prompt);
```
This is a forward port of the broader change in Upgrading to 2.0.0-M5 — ChatOptions Handling, landed early for Anthropic.
ChatClient performs its own merging before reaching the model, so callers using ChatClient are not affected.
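To make the difference between the two resolution rules concrete, here is a minimal sketch using a hypothetical SketchOptions record in place of AnthropicChatOptions. The method names mergeWithDefaults and resolveWithoutMerge are invented for this illustration and are not part of Spring AI.

```java
/** Hypothetical stand-in for chat options; not the Spring AI class. */
record SketchOptions(String model, Integer maxTokens, Double temperature) {

    /** Old rule: unset (null) prompt fields are filled in from the defaults. */
    static SketchOptions mergeWithDefaults(SketchOptions prompt, SketchOptions defaults) {
        if (prompt == null) {
            return defaults;
        }
        return new SketchOptions(
                prompt.model() != null ? prompt.model() : defaults.model(),
                prompt.maxTokens() != null ? prompt.maxTokens() : defaults.maxTokens(),
                prompt.temperature() != null ? prompt.temperature() : defaults.temperature());
    }

    /** New rule: prompt options are used as-is; defaults apply only when there are none. */
    static SketchOptions resolveWithoutMerge(SketchOptions prompt, SketchOptions defaults) {
        return prompt != null ? prompt : defaults;
    }
}

final class OptionsResolutionDemo {
    public static void main(String[] args) {
        SketchOptions defaults = new SketchOptions("claude-haiku-4-5", 4096, 0.7);
        SketchOptions partial = new SketchOptions(null, 2048, null);

        // Old rule keeps the default model and temperature.
        System.out.println(SketchOptions.mergeWithDefaults(partial, defaults));
        // New rule leaves model and temperature unset.
        System.out.println(SketchOptions.resolveWithoutMerge(partial, defaults));
    }
}
```

Under the new rule the partial options reach the request with model and temperature unset, which is why the migrated code starts from the model’s defaults and changes only the fields it needs.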
Default maxTokens is now 4096
AnthropicChatOptions defaults maxTokens to 4096 instead of 500.
The previous default routinely truncated responses; the new value matches the other Spring AI chat modules.
If you relied on 500 to bound costs, set it explicitly:
```properties
spring.ai.anthropic.chat.max-tokens=500
```
Retry uses SDK maxRetries, not RetryTemplate
The previous module took a Spring Retry RetryTemplate in its constructor.
Retries are now handled by the SDK and configured through maxRetries (default 2):
```properties
spring.ai.anthropic.max-retries=5
```
Any RetryTemplate bean wired specifically for the Anthropic module can be removed.
For finer control, configure the SDK client through a custom AnthropicSetup.
Streaming thinking events
Two changes affect anyone subscribed to the raw Flux<ChatResponse> stream (rather than letting ChatClient or MessageAggregator collapse it).
The previous module bundled the text and signature of a thinking block into a single Generation. The SDK delivers them as separate events, and the new module forwards them that way: a Generation with properties.signature arrives after the thinking-text deltas, before any subsequent text deltas. MessageAggregator and ChatClient’s built-in aggregation absorb the extra chunk transparently.
Thinking-text deltas now also include properties.thinking = Boolean.TRUE. The previous module emitted them as plain content, leaving callers no reliable way to distinguish thinking text from response text mid-stream.
The full set of streaming metadata keys:
| Block | Generation carries |
|---|---|
| Thinking text delta | |
| Thinking signature delta | empty |
| Redacted thinking block | empty |
| Sync thinking block (non-streaming) | |
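A subscriber to the raw stream can use these keys to keep thinking text apart from response text. The sketch below assumes a hypothetical Chunk record standing in for one streamed Generation (its text plus a metadata map); only the key names signature and thinking come from the description above, everything else is invented for illustration.

```java
import java.util.List;
import java.util.Map;

final class ThinkingStreamSketch {

    /** Hypothetical stand-in for one streamed Generation: text plus metadata properties. */
    record Chunk(String text, Map<String, Object> properties) {}

    /** Collected result: thinking text, response text, and the signature if one arrived. */
    record Split(String thinking, String response, String signature) {}

    static Split split(List<Chunk> chunks) {
        StringBuilder thinking = new StringBuilder();
        StringBuilder response = new StringBuilder();
        String signature = null;
        for (Chunk chunk : chunks) {
            Map<String, Object> props = chunk.properties();
            if (props.containsKey("signature")) {
                // Signature chunk: carries no text, only the signature property.
                signature = String.valueOf(props.get("signature"));
            } else if (Boolean.TRUE.equals(props.get("thinking"))) {
                thinking.append(chunk.text());
            } else {
                response.append(chunk.text());
            }
        }
        return new Split(thinking.toString(), response.toString(), signature);
    }

    public static void main(String[] args) {
        List<Chunk> chunks = List.of(
                new Chunk("Let me ", Map.of("thinking", Boolean.TRUE)),
                new Chunk("think.", Map.of("thinking", Boolean.TRUE)),
                new Chunk("", Map.of("signature", "sig-123")),
                new Chunk("Hello", Map.of()));
        System.out.println(split(chunks));
    }
}
```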
Prompt caching changes
Three behaviors to know about if you configured AnthropicCacheOptions:
- The 4-breakpoint limit is enforced in Spring AI, not at the API. Anthropic permits at most 4 cache breakpoints per request. The previous module passed markers through and let the API reject excess. CacheBreakpointTracker now keeps count and silently skips additions past 4, logging a one-time WARN. The most likely way to hit the cap is SYSTEM_AND_TOOLS combined with multi-block system caching and citation documents. Requests that previously failed with an API error can now succeed with reduced caching, so check your cache hit rate after the upgrade.
- AnthropicCacheStrategy.NONE means "unset", not "disabled". Setting NONE at the prompt level falls back to the model default. To turn caching off when the model default has it on, build a second AnthropicChatModel with AnthropicCacheOptions.disabled() as its default.
- Tool cache TTL is taken from MessageType.SYSTEM. resolveToolCacheControl looks up messageTypeTtl(MessageType.SYSTEM) regardless of strategy. Setting messageTypeTtl(MessageType.USER, ONE_HOUR) has no effect on tool caching.
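The capping behavior in the first item can be pictured with a small counter. This is a sketch under stated assumptions, not the CacheBreakpointTracker source: the class name BreakpointCapSketch and the method tryAcquire are hypothetical, and java.util.logging stands in for whatever logger the module actually uses.

```java
import java.util.logging.Logger;

/** Hypothetical sketch of a per-request breakpoint cap; not the Spring AI implementation. */
final class BreakpointCapSketch {

    private static final Logger LOG = Logger.getLogger(BreakpointCapSketch.class.getName());

    /** Anthropic allows at most 4 cache breakpoints per request. */
    static final int MAX_BREAKPOINTS = 4;

    private int used;
    private boolean warned;

    /** Returns true if a breakpoint may be placed, false once the cap has been reached. */
    boolean tryAcquire() {
        if (used < MAX_BREAKPOINTS) {
            used++;
            return true;
        }
        if (!warned) {
            // Warn only once per tracker so a long request does not flood the log.
            LOG.warning("Cache breakpoint limit reached; further breakpoints are skipped");
            warned = true;
        }
        return false;
    }

    int used() {
        return used;
    }

    public static void main(String[] args) {
        BreakpointCapSketch tracker = new BreakpointCapSketch();
        for (int i = 1; i <= 6; i++) {
            System.out.println("breakpoint " + i + " placed: " + tracker.tryAcquire());
        }
    }
}
```

Because the fifth and later requests return false instead of throwing, the request still goes out, only with fewer cached segments than configured.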
Citation document consistency is validated client-side
The Anthropic API requires every DocumentBlockParam with citation configuration in a single request to share the same citations.enabled value.
AnthropicChatOptions.validateCitationConsistency() now enforces this and throws IllegalArgumentException before the request is sent. The previous module let the API return an HTTP 400.
Tests or call sites that mixed enabled and disabled citation documents will now fail with an IllegalArgumentException when the request is built, before any network call is made.
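The shape of such a client-side check is roughly the following. This is a minimal sketch, assuming a hypothetical Doc record with a citationsEnabled flag; it is not the body of AnthropicChatOptions.validateCitationConsistency().

```java
import java.util.List;

final class CitationConsistencySketch {

    /** Hypothetical stand-in for a citation document and its citations.enabled flag. */
    record Doc(String title, boolean citationsEnabled) {}

    /** Throws if the documents do not all share the same citationsEnabled value. */
    static void validate(List<Doc> docs) {
        if (docs.isEmpty()) {
            return;
        }
        boolean expected = docs.get(0).citationsEnabled();
        for (Doc doc : docs) {
            if (doc.citationsEnabled() != expected) {
                throw new IllegalArgumentException(
                        "All citation documents in a request must share the same enabled value; '"
                                + doc.title() + "' differs");
            }
        }
    }

    public static void main(String[] args) {
        validate(List.of(new Doc("a.pdf", true), new Doc("b.pdf", true))); // passes
        try {
            validate(List.of(new Doc("a.pdf", true), new Doc("b.pdf", false)));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected before sending: " + e.getMessage());
        }
    }
}
```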
New OkHttp Transitive Dependency
com.anthropic:anthropic-java pulls in com.squareup.okhttp3:okhttp.
Most applications will not notice; if you have strict dependency-convergence rules or an existing OkHttp pin, you may need a <dependencyManagement> entry.
New Capabilities
The migration also enables several Anthropic features. See Anthropic Chat for the full reference.
- Native skills (AnthropicSkill, AnthropicSkillContainer).
- Built-in web search tool (AnthropicWebSearchTool).
- Service tier selection (AnthropicServiceTier).
- Inference geo for data residency (us, eu).
- Native structured output through JsonOutputFormat and Effort (requires claude-sonnet-4-6 or newer).
- Extended-thinking display modes (summarized / omitted).
- Citation type with four location variants (CHAR_LOCATION, PAGE_LOCATION, CONTENT_BLOCK_LOCATION, WEB_SEARCH_RESULT_LOCATION).
- Per-request HTTP headers on AnthropicChatOptions#httpHeaders, distinct from client-level customHeaders on AbstractAnthropicOptions. customHeaders is set once on the client and applies to every request; httpHeaders is set per Prompt and merged in at request-build time. Useful for request tracing, beta-API toggles, and routing.
Things That Fail Silently
The compile errors are easy. These don’t throw; they produce different output instead.
- Partial prompt-level options. new Prompt(text, AnthropicChatOptions.builder().maxTokens(2048).build()) no longer inherits model, temperature, etc. from the model’s defaults (see Prompt-level options no longer merge with model defaults). No compile error, no exception; the request runs with different values. If output drifts after the upgrade and you call ChatModel directly, look here first.
- If costs climb after the upgrade, check whether you were relying on the old 500-token maxTokens default to cap responses.
- Cache breakpoints dropped past four. Stacking multi-block system caching with tools and citation documents can push past Anthropic’s 4-breakpoint cap. CacheBreakpointTracker skips the extras and logs WARN once. Cache hit rate drops without an error.
- Stale org.springframework.ai.anthropic.api.AnthropicCache* imports are the source of most compile errors after the upgrade. A project-wide find-and-replace fixes them.