Azure OpenAI Chat

Azure’s OpenAI offering, powered by the same models that underpin ChatGPT, extends beyond traditional OpenAI capabilities, delivering AI-driven text generation with enhanced functionality. Azure also provides additional AI safety and responsible AI features, as highlighted in its responsible AI documentation.

Azure offers Java developers the opportunity to leverage AI’s full potential by integrating it with an array of Azure services, including AI-related resources such as Vector Stores on Azure.

Prerequisites

Obtain your Azure OpenAI endpoint and api-key from the Azure OpenAI Service section on the Azure Portal. Spring AI defines a configuration property named spring.ai.azure.openai.api-key that you should set to the value of the API Key obtained from Azure. There is also a configuration property named spring.ai.azure.openai.endpoint that you should set to the endpoint URL obtained when provisioning your model in Azure. Exporting environment variables is one way to set these configuration properties:

export SPRING_AI_AZURE_OPENAI_API_KEY=<INSERT KEY HERE>
export SPRING_AI_AZURE_OPENAI_ENDPOINT=<INSERT ENDPOINT URL HERE>
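
Alternatively, you can set the same values as regular Spring configuration properties, for example in an application.properties file (placeholder values shown):

spring.ai.azure.openai.api-key=<INSERT KEY HERE>
spring.ai.azure.openai.endpoint=<INSERT ENDPOINT URL HERE>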

Deployment Name

To run Azure AI applications, create an Azure AI Deployment through the [Azure AI Portal](https://oai.azure.com/portal).

In Azure, each client must specify a Deployment Name to connect to the Azure OpenAI service.

It’s essential to understand that the Deployment Name is different from the model you choose to deploy.

For instance, a deployment named 'MyAiDeployment' could be configured to use either the GPT-3.5 Turbo model or the GPT-4 model.

For now, to keep things simple, you can create a deployment using the following settings:

Deployment Name: gpt-35-turbo
Model Name: gpt-35-turbo

This Azure configuration aligns with the default configuration of the Spring Boot Azure AI Starter and its auto-configuration feature.

If you use a different Deployment Name, update the configuration property accordingly:

spring.ai.azure.openai.chat.options.model=<my deployment name>

The different deployment structures of Azure OpenAI and OpenAI lead to a property in the Azure OpenAI client library named deploymentOrModelName. This is because in OpenAI there is no Deployment Name, only a Model Name.

In a subsequent release, Spring AI will rename the property spring.ai.azure.openai.chat.options.model to spring.ai.azure.openai.chat.options.deployment-name to avoid confusion.

Add Repositories and BOM

Spring AI artifacts are published in Spring Milestone and Snapshot repositories. Refer to the Repositories section to add these repositories to your build system.

To help with dependency management, Spring AI provides a BOM (bill of materials) to ensure that a consistent version of Spring AI is used throughout the entire project. Refer to the Dependency Management section to add the Spring AI BOM to your build system.
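
For Maven, the BOM import typically looks like the following sketch (the spring-ai.version property is a placeholder for the Spring AI version you are using):

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>${spring-ai.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>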

Auto-configuration

Spring AI provides Spring Boot auto-configuration for the Azure OpenAI Chat Client. To enable it, add the following dependency to your project’s Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-azure-openai-spring-boot-starter</artifactId>
</dependency>

or to your Gradle build.gradle file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-azure-openai-spring-boot-starter'
}
Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Chat Properties

The prefix spring.ai.azure.openai is the property prefix to configure the connection to Azure OpenAI.

| Property | Description | Default |
|----------|-------------|---------|
| spring.ai.azure.openai.api-key | The key from the Azure OpenAI "Keys and Endpoint" section, under Resource Management | - |
| spring.ai.azure.openai.endpoint | The endpoint from the Azure OpenAI "Keys and Endpoint" section, under Resource Management | - |

The prefix spring.ai.azure.openai.chat is the property prefix that configures the ChatClient implementation for Azure OpenAI.

| Property | Description | Default |
|----------|-------------|---------|
| spring.ai.azure.openai.chat.enabled | Enable the Azure OpenAI chat client. | true |
| spring.ai.azure.openai.chat.options.deployment-name | In Azure, this refers to the "Deployment Name" of your model, which you can find at oai.azure.com/portal. Note that within an Azure OpenAI deployment, the "Deployment Name" is distinct from the model itself; the property's naming stems from the Azure OpenAI client library's aim to stay compatible with the original OpenAI endpoint, where there is no Deployment Name, only a Model Name. | gpt-35-turbo |
| spring.ai.azure.openai.chat.options.maxTokens | The maximum number of tokens to generate. | - |
| spring.ai.azure.openai.chat.options.temperature | The sampling temperature that controls the apparent creativity of generated completions. Higher values make output more random, while lower values make results more focused and deterministic. It is not recommended to modify temperature and top_p in the same request, as the interaction of these two settings is difficult to predict. | 0.7 |
| spring.ai.azure.openai.chat.options.topP | An alternative to sampling with temperature called nucleus sampling. This value causes the model to consider the results of tokens with the provided probability mass. | - |
| spring.ai.azure.openai.chat.options.logitBias | A map between GPT token IDs and bias scores that influences the probability of specific tokens appearing in a completions response. Token IDs are computed via external tokenizer tools, while bias scores range from -100 to 100, with the minimum and maximum values corresponding to a full ban or exclusive selection of a token, respectively. The exact behavior of a given bias score varies by model. | - |
| spring.ai.azure.openai.chat.options.user | An identifier for the caller or end user of the operation. This may be used for tracking or rate-limiting purposes. | - |
| spring.ai.azure.openai.chat.options.n | The number of chat completion choices to generate for a chat completions response. | - |
| spring.ai.azure.openai.chat.options.stop | A collection of textual sequences that will end completions generation. | - |
| spring.ai.azure.openai.chat.options.presencePenalty | A value that influences the probability of generated tokens appearing based on their existing presence in the generated text. Positive values make tokens less likely to appear when they already exist and increase the model’s likelihood of outputting new topics. | - |
| spring.ai.azure.openai.chat.options.frequencyPenalty | A value that influences the probability of generated tokens appearing based on their cumulative frequency in the generated text. Positive values make tokens less likely to appear as their frequency increases and decrease the likelihood of the model repeating the same statements verbatim. | - |

All properties prefixed with spring.ai.azure.openai.chat.options can be overridden at runtime by adding request-specific Chat Options to the Prompt call.

Chat Options

The AzureOpenAiChatOptions.java provides model configurations, such as the model to use, the temperature, the frequency penalty, etc.

On start-up, the default options can be configured with the AzureOpenAiChatClient(api, options) constructor or the spring.ai.azure.openai.chat.options.* properties.

At runtime you can override the default options by adding new, request-specific options to the Prompt call. For example, to override the default model and temperature for a specific request:

ChatResponse response = chatClient.call(
    new Prompt(
        "Generate the names of 5 famous pirates.",
        AzureOpenAiChatOptions.builder()
            .withModel("gpt-4-32k")
            .withTemperature(0.4)
        .build()
    ));
In addition to the model-specific AzureOpenAiChatOptions.java, you can use a portable ChatOptions instance, created with ChatOptionsBuilder#builder().
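
As a minimal sketch of the portable approach (assuming the ChatOptionsBuilder API, whose setters take Float values; exact signatures may vary by Spring AI version):

ChatOptions portableOptions = ChatOptionsBuilder.builder()
    .withTemperature(0.4f)
    .build();

ChatResponse response = chatClient.call(
    new Prompt("Generate the names of 5 famous pirates.", portableOptions));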

Function Calling

You can register custom Java functions with the AzureOpenAiChatClient and have the model intelligently choose to output a JSON object containing arguments to call one or many of the registered functions. This is a powerful technique to connect the LLM capabilities with external tools and APIs. Read more about Azure OpenAI Function Calling.
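
As an illustrative sketch only (the currentWeather bean name and the request/response records are hypothetical; refer to the Azure OpenAI Function Calling documentation for the exact API), you register a plain java.util.function.Function as a bean and then reference it by name in the request options:

@Configuration
public class WeatherFunctionConfig {

    // Hypothetical request/response types for this example.
    public record WeatherRequest(String location) {}
    public record WeatherResponse(double temperature, String unit) {}

    @Bean
    @Description("Get the current weather for a given location")
    public Function<WeatherRequest, WeatherResponse> currentWeather() {
        // Placeholder implementation; a real function would call a weather API.
        return request -> new WeatherResponse(30.0, "C");
    }
}

// At call time, enable the registered function by its bean name for this request:
ChatResponse response = chatClient.call(
    new Prompt("What is the weather in Paris?",
        AzureOpenAiChatOptions.builder()
            .withFunction("currentWeather")
            .build()));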

Sample Controller (Auto-configuration)

Create a new Spring Boot project and add the spring-ai-azure-openai-spring-boot-starter to your pom (or gradle) dependencies.

Add an application.properties file under the src/main/resources directory to enable and configure the Azure OpenAI chat client:

spring.ai.azure.openai.api-key=YOUR_API_KEY
spring.ai.azure.openai.endpoint=YOUR_ENDPOINT
spring.ai.azure.openai.chat.options.model=gpt-35-turbo
spring.ai.azure.openai.chat.options.temperature=0.7
Replace the api-key and endpoint values with your Azure OpenAI credentials.

This will create an AzureOpenAiChatClient implementation that you can inject into your classes. Here is an example of a simple @RestController class that uses the chat client for text generation.

@RestController
public class ChatController {

    private final AzureOpenAiChatClient chatClient;

    @Autowired
    public ChatController(AzureOpenAiChatClient chatClient) {
        this.chatClient = chatClient;
    }

    @GetMapping("/ai/generate")
    public Map generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        return Map.of("generation", chatClient.call(message));
    }

    @GetMapping("/ai/generateStream")
	public Flux<ChatResponse> generateStream(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        Prompt prompt = new Prompt(new UserMessage(message));
        return chatClient.stream(prompt);
    }
}
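
Assuming the application runs on the default port 8080, you can then exercise the blocking endpoint, for example with curl:

curl "http://localhost:8080/ai/generate?message=Tell%20me%20a%20joke"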

Manual Configuration

The AzureOpenAiChatClient implements the ChatClient and StreamingChatClient interfaces and uses the Azure OpenAI Java client.

To enable it, add the spring-ai-azure-openai dependency to your project’s Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-azure-openai</artifactId>
</dependency>

or to your Gradle build.gradle file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-azure-openai'
}
Refer to the Dependency Management section to add the Spring AI BOM to your build file.
The spring-ai-azure-openai dependency also provides access to the AzureOpenAiChatClient. For more information about the AzureOpenAiChatClient, refer to the Azure OpenAI Chat section.

Next, create an AzureOpenAiChatClient instance and use it to generate text responses:

var openAIClient = new OpenAIClientBuilder()
        .credential(new AzureKeyCredential(System.getenv("AZURE_OPENAI_API_KEY")))
        .endpoint(System.getenv("AZURE_OPENAI_ENDPOINT"))
        .buildClient();

var chatClient = new AzureOpenAiChatClient(openAIClient).withDefaultOptions(
        AzureOpenAiChatOptions.builder()
            .withModel("gpt-35-turbo")
            .withTemperature(0.4)
            .withMaxTokens(200)
            .build());

ChatResponse response = chatClient.call(
    new Prompt("Generate the names of 5 famous pirates."));

// Or with streaming responses
Flux<ChatResponse> streamingResponses = chatClient.stream(
    new Prompt("Generate the names of 5 famous pirates."));
Note that gpt-35-turbo here is actually the Deployment Name, as presented in the Azure AI Portal.