Class AnthropicChatModel

java.lang.Object
org.springframework.ai.anthropic.AnthropicChatModel
All Implemented Interfaces:
ChatModel, StreamingChatModel, Model<Prompt,ChatResponse>, StreamingModel<Prompt,ChatResponse>

public final class AnthropicChatModel extends Object implements ChatModel, StreamingChatModel
ChatModel and StreamingChatModel implementation using the official Anthropic Java SDK.

Supports synchronous and streaming completions, tool calling, and Micrometer-based observability. If no API key is configured explicitly, credentials are auto-detected from ANTHROPIC_API_KEY.
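
The fallback order described above can be pictured with a small standalone sketch. It is not the resolution code used by Spring AI or the Anthropic SDK: ApiKeyFallbackSketch and resolveApiKey are hypothetical names, and the sketch assumes ANTHROPIC_API_KEY is an environment variable, passed in here as a map so the logic can be exercised without touching real process state.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration of "explicit configuration wins, otherwise fall back
// to ANTHROPIC_API_KEY". Not the library's actual implementation.
final class ApiKeyFallbackSketch {

    static final String ENV_VAR = "ANTHROPIC_API_KEY";

    // Returns the explicitly configured key when it is present and non-blank,
    // otherwise the value of ANTHROPIC_API_KEY from the supplied environment.
    static Optional<String> resolveApiKey(String configuredKey, Map<String, String> env) {
        if (configuredKey != null && !configuredKey.isBlank()) {
            return Optional.of(configuredKey);
        }
        String fromEnv = env.get(ENV_VAR);
        return (fromEnv == null || fromEnv.isBlank()) ? Optional.empty() : Optional.of(fromEnv);
    }

    public static void main(String[] args) {
        // Prints only whether a key could be resolved, never the key itself.
        System.out.println(resolveApiKey(null, System.getenv()).isPresent());
    }
}
```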

Observability. Two layers of Micrometer observations are emitted: a gen_ai.client.operation span per chat-model call (with token usage, model metadata, and request parameters), and an okhttp.requests span per outbound HTTP attempt (with HTTP method, URI, status code, and traceparent propagation). When a MeterRegistry is supplied, optional OkHttp connection-pool gauges are bound to it. For synchronous calls the HTTP span nests under the chat-model span. For streaming calls the HTTP span is still emitted but is not parented under the chat-model span, because of an SDK-internal thread boundary; see stream(org.springframework.ai.chat.prompt.Prompt).

Since:
1.0.0
Author:
Christian Tzolov, luocongqiu, Mariusz Bernacki, Thomas Vitale, Claudio Silva Junior, Alexandros Pappas, Jonghoon Park, Soby Chacko, Austin Dase, Sebastien Deleuze, Ilayaperumal Gopinathan
  • Method Details

    • builder

      public static AnthropicChatModel.Builder builder()
      Creates a new builder for AnthropicChatModel.
      Returns:
      a new builder instance
    • getOptions

      public AnthropicChatOptions getOptions()
      Gets the chat options for this model.
      Specified by:
      getOptions in interface ChatModel
      Returns:
      the chat options
      Since:
      2.0.0
    • getAnthropicClient

      public com.anthropic.client.AnthropicClient getAnthropicClient()
      Returns the underlying synchronous Anthropic SDK client. Useful for accessing SDK features directly, such as the Files API (client.beta().files()).
      Returns:
      the sync client
    • getAnthropicClientAsync

      public com.anthropic.client.AnthropicClientAsync getAnthropicClientAsync()
      Returns the underlying asynchronous Anthropic SDK client. Useful for non-blocking access to SDK features directly, such as the Files API.
      Returns:
      the async client
    • call

      public ChatResponse call(Prompt prompt)
      Description copied from interface: Model
      Executes a method call to the AI model.
      Specified by:
      call in interface ChatModel
      Specified by:
      call in interface Model<Prompt,ChatResponse>
      Parameters:
      prompt - the request object to be sent to the AI model
      Returns:
      the response from the AI model
    • stream

      public reactor.core.publisher.Flux<ChatResponse> stream(Prompt prompt)
      Streams the chat completion as a Flux of ChatResponse events.

      Observability note. The outbound HTTP attempt is observed as okhttp.requests, with a timer and traceparent propagation, but for streaming calls the HTTP span is not parented under the chat model's gen_ai.client.operation span. The SDK's async path schedules the HTTP call on ForkJoinPool.commonPool() before Spring AI's HTTP client runs, which drops the calling thread's observation context. To join the spans in your tracing UI, filter on okhttp.requests with host api.anthropic.com and correlate by trace ID or timestamp.
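
      The thread boundary described in this note can be reproduced in isolation. The sketch below is an illustration only: it uses a plain ThreadLocal as a stand-in for a thread-bound observation scope, and ObservationContextLossSketch, CURRENT_SPAN, and both method names are hypothetical. It does not use Micrometer or the Anthropic SDK.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ForkJoinPool;

// Shows why thread-bound context is lost once work hops onto the common pool.
final class ObservationContextLossSketch {

    // Stand-in for a thread-bound observation scope (hypothetical, not Micrometer's API).
    static final ThreadLocal<String> CURRENT_SPAN = new ThreadLocal<>();

    // Reads the "current span" on the calling thread, as in the synchronous path.
    static String parentSeenBySyncCall() {
        return CURRENT_SPAN.get();
    }

    // Reads the "current span" from a task scheduled on ForkJoinPool.commonPool(),
    // as in the streaming path. The pool thread never had the value set.
    static String parentSeenByAsyncCall() {
        return CompletableFuture.supplyAsync(CURRENT_SPAN::get, ForkJoinPool.commonPool()).join();
    }

    public static void main(String[] args) {
        CURRENT_SPAN.set("gen_ai.client.operation");
        try {
            System.out.println("sync sees:  " + parentSeenBySyncCall());  // the span name
            System.out.println("async sees: " + parentSeenByAsyncCall()); // null
        } finally {
            CURRENT_SPAN.remove();
        }
    }
}
```

      The synchronous read finds the parent because it runs on the thread that opened the scope. The asynchronous read runs on a pool thread and finds nothing, which is why the HTTP span for a streaming call starts without a parent.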

      Specified by:
      stream in interface ChatModel
      Specified by:
      stream in interface StreamingChatModel
      Specified by:
      stream in interface StreamingModel<Prompt,ChatResponse>
      Parameters:
      prompt - the prompt
      Returns:
      a Flux of streamed ChatResponse events
    • internalStream

      public reactor.core.publisher.Flux<ChatResponse> internalStream(Prompt prompt, @Nullable ChatResponse previousChatResponse)
      Internal method to handle streaming chat completion calls with tool execution support. This method is called recursively to support multi-turn tool calling.

      Rate-limit headers are read from the streaming response via withRawResponse().createStreaming(...) and attached to the aggregated ChatResponse. Because that SDK call exposes the stream as a blocking StreamResponse, the events are pulled on Schedulers.boundedElastic(); a streaming call therefore holds a worker thread for the duration of the stream.
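
      The cost of pulling a blocking stream from a worker thread can be sketched without Reactor or the SDK. In the hypothetical example below, an Iterator stands in for the blocking StreamResponse and a plain ExecutorService stands in for Schedulers.boundedElastic(); BlockingStreamPullSketch and pullOnWorker are illustrative names, not part of Spring AI.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Drains a blocking event source on a dedicated worker thread.
final class BlockingStreamPullSketch {

    // Pulls every event from the blocking iterator on one worker and forwards it to the sink.
    // The worker stays occupied until the iterator is exhausted.
    // Completes with the name of the worker thread that was held.
    static <T> CompletableFuture<String> pullOnWorker(Iterator<T> blockingEvents,
                                                      Consumer<T> sink,
                                                      ExecutorService workers) {
        return CompletableFuture.supplyAsync(() -> {
            while (blockingEvents.hasNext()) { // may block while waiting for the next event
                sink.accept(blockingEvents.next());
            }
            return Thread.currentThread().getName();
        }, workers);
    }

    public static void main(String[] args) {
        ExecutorService workers = Executors.newCachedThreadPool();
        try {
            List<String> received = new CopyOnWriteArrayList<>();
            Iterator<String> events = List.of("Hel", "lo", "!").iterator();
            String worker = pullOnWorker(events, received::add, workers).join();
            System.out.println(received + " pulled on " + worker);
        } finally {
            workers.shutdown();
        }
    }
}
```

      Because one worker is held per open stream, the number of concurrent streams is bounded by the size of whatever pool plays the role of the worker pool.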

      Parameters:
      prompt - The prompt for the chat completion. In a recursive tool-call scenario, this prompt will contain the full conversation history including the tool results.
      previousChatResponse - The chat response from the preceding API call. This is used to accumulate token usage correctly across multiple API calls in a single user turn.
      Returns:
      A Flux of ChatResponse events, which can include text chunks and the final response with tool call information or the model's final answer.
    • internalCall

      public ChatResponse internalCall(Prompt prompt, @Nullable ChatResponse previousChatResponse)
      Internal method to handle synchronous chat completion calls with tool execution support. This method is called recursively to support multi-turn tool calling.
      Parameters:
      prompt - The prompt for the chat completion. In a recursive tool-call scenario, this prompt will contain the full conversation history including the tool results.
      previousChatResponse - The chat response from the preceding API call. This is used to accumulate token usage correctly across multiple API calls in a single user turn.
      Returns:
      The final ChatResponse after all tool calls (if any) are resolved.
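
      The recursion and usage accumulation described above can be sketched with plain Java types. This is a simplified model under stated assumptions, not the code in this class: Usage, Reply, and ToolLoopUsageSketch are hypothetical stand-ins for Spring AI's types, the model and the tool runner are passed in as functions, and the conversation history is a list of strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Models a recursive tool-calling loop that sums token usage across API calls.
final class ToolLoopUsageSketch {

    // Hypothetical token counts for a single API call.
    record Usage(int inputTokens, int outputTokens) {
        Usage plus(Usage other) {
            return new Usage(inputTokens + other.inputTokens, outputTokens + other.outputTokens);
        }
    }

    // Hypothetical model reply: a tool request when toolName is non-null, else a final answer.
    record Reply(String text, String toolName, Usage usage) {
        boolean requestsTool() {
            return toolName != null;
        }
    }

    // Calls the model once. If a tool is requested, runs it, appends the request and the
    // result to the history, and recurses with the usage accumulated so far.
    static Reply call(List<String> history, Usage previousUsage,
                      Function<List<String>, Reply> model, Function<String, String> toolRunner) {
        Reply reply = model.apply(history);
        Usage total = previousUsage == null ? reply.usage() : previousUsage.plus(reply.usage());
        if (!reply.requestsTool()) {
            return new Reply(reply.text(), null, total);
        }
        List<String> next = new ArrayList<>(history);
        next.add("tool_use:" + reply.toolName());
        next.add("tool_result:" + toolRunner.apply(reply.toolName()));
        return call(next, total, model, toolRunner);
    }

    public static void main(String[] args) {
        // Fake model: requests a tool until a tool result appears in the history, then answers.
        Function<List<String>, Reply> model = h -> h.stream().anyMatch(m -> m.startsWith("tool_result:"))
                ? new Reply("It is sunny.", null, new Usage(30, 5))
                : new Reply("", "get_weather", new Usage(10, 2));
        Reply result = call(List.of("user:weather?"), null, model, name -> "sunny");
        System.out.println(result.text() + " " + result.usage());
    }
}
```

      In this run the model is called twice, so the final reply reports 40 input tokens and 7 output tokens rather than the usage of the last call alone.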
    • setObservationConvention

      public void setObservationConvention(ChatModelObservationConvention observationConvention)
      Sets the convention used for reporting observation data.
      Parameters:
      observationConvention - the provided convention