Using Chat/Embedding Response Usage
Overview
Spring AI has enhanced its model usage handling by introducing a getNativeUsage()
method on the Usage interface and providing a DefaultUsage implementation.
This change simplifies how different AI models track and report their usage metrics while maintaining consistency across the framework.
Key Changes
Usage Interface Enhancement
The Usage interface now includes a new method:
Object getNativeUsage();
This method allows access to the model-specific native usage data, enabling more detailed usage tracking when needed.
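To see the shape of this pattern in isolation, here is a minimal, self-contained sketch. The SimpleUsage interface and SimpleDefaultUsage record below are illustrative stand-ins for the actual Spring AI Usage and DefaultUsage types, not their real definitions:

```java
// Illustrative stand-in for Spring AI's Usage interface: portable token
// metrics plus an escape hatch to the provider's raw usage object.
interface SimpleUsage {
    Integer getPromptTokens();
    Integer getCompletionTokens();
    default Integer getTotalTokens() {
        return getPromptTokens() + getCompletionTokens();
    }
    // Model-specific payload; callers downcast when they need the details.
    Object getNativeUsage();
}

// Illustrative stand-in for a DefaultUsage-style implementation that
// carries the provider's native usage object alongside the standard counts.
record SimpleDefaultUsage(Integer promptTokens, Integer completionTokens,
                          Object nativeUsage) implements SimpleUsage {
    public Integer getPromptTokens() { return promptTokens; }
    public Integer getCompletionTokens() { return completionTokens; }
    public Object getNativeUsage() { return nativeUsage; }
}
```

The point of the design is that portable code only touches the three standard counters, while provider-aware code can reach the raw payload through getNativeUsage().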
Using with ChatClient
Here’s a complete example showing how to track usage with OpenAI’s ChatClient:
@SpringBootConfiguration
public class Configuration {

    @Bean
    public OpenAiApi chatCompletionApi() {
        return new OpenAiApi(System.getenv("OPENAI_API_KEY"));
    }

    @Bean
    public OpenAiChatModel openAiClient(OpenAiApi openAiApi) {
        return new OpenAiChatModel(openAiApi);
    }
}
@Service
public class ChatService {

    private final OpenAiChatModel chatModel;

    public ChatService(OpenAiChatModel chatModel) {
        this.chatModel = chatModel;
    }

    public void demonstrateUsage() {
        // Create a chat prompt
        Prompt prompt = new Prompt("What is the weather like today?");
        ChatClient chatClient = ChatClient.builder(this.chatModel).build();

        // Get the chat response via the fluent API
        ChatResponse response = chatClient.prompt(prompt).call().chatResponse();

        // Access the usage information
        Usage usage = response.getMetadata().getUsage();

        // Get standard usage metrics
        System.out.println("Prompt Tokens: " + usage.getPromptTokens());
        System.out.println("Completion Tokens: " + usage.getCompletionTokens());
        System.out.println("Total Tokens: " + usage.getTotalTokens());

        // Access native OpenAI usage data with detailed token information
        if (usage.getNativeUsage() instanceof org.springframework.ai.openai.api.OpenAiApi.Usage nativeUsage) {
            // Detailed prompt token information
            System.out.println("Prompt Tokens Details:");
            System.out.println("- Audio Tokens: " + nativeUsage.promptTokensDetails().audioTokens());
            System.out.println("- Cached Tokens: " + nativeUsage.promptTokensDetails().cachedTokens());

            // Detailed completion token information
            System.out.println("Completion Tokens Details:");
            System.out.println("- Reasoning Tokens: " + nativeUsage.completionTokenDetails().reasoningTokens());
            System.out.println("- Accepted Prediction Tokens: " + nativeUsage.completionTokenDetails().acceptedPredictionTokens());
            System.out.println("- Audio Tokens: " + nativeUsage.completionTokenDetails().audioTokens());
            System.out.println("- Rejected Prediction Tokens: " + nativeUsage.completionTokenDetails().rejectedPredictionTokens());
        }
    }
}
Benefits
- Standardization: Provides a consistent way to handle usage across different AI models
- Flexibility: Supports model-specific usage data through the native usage feature
- Simplification: Reduces boilerplate code with the default implementation
- Extensibility: Easy to extend for specific model requirements while maintaining compatibility
Type Safety Considerations
Because getNativeUsage() returns Object, guard every downcast with an instanceof check; pattern matching (Java 16+) keeps the check and the cast in one step:

// Safe way to access native usage
if (usage.getNativeUsage() instanceof org.springframework.ai.openai.api.OpenAiApi.Usage nativeUsage) {
    // Work with native usage data
}
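When the same code path may run against different model providers, it is worth having an explicit fallback for payloads of an unexpected type. The sketch below is illustrative: DetailedUsage is a hypothetical provider-specific record, not a Spring AI class, and the guard-plus-fallback shape is the point, not the types:

```java
class NativeUsageGuard {

    // Hypothetical provider-specific payload, standing in for something
    // like OpenAiApi.Usage; not an actual Spring AI type.
    record DetailedUsage(int cachedTokens, int reasoningTokens) { }

    // Use the detailed breakdown when the native payload has the expected
    // type, otherwise fall back to the portable total-token count.
    static String describe(Object nativeUsage, int totalTokens) {
        if (nativeUsage instanceof DetailedUsage details) {
            return "cached=" + details.cachedTokens()
                    + ", reasoning=" + details.reasoningTokens();
        }
        // Different provider (or a test stub): report only what is portable.
        return "total=" + totalTokens;
    }
}
```

This keeps provider-specific reporting optional: code that only needs the standard metrics never has to know which concrete native type is behind the Object.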