watsonx.ai Chat

With watsonx.ai you can run various Large Language Models (LLMs) and generate text from them. Spring AI supports watsonx.ai text generation through WatsonxAiChatModel.

Prerequisites

You first need to have a SaaS instance of watsonx.ai (as well as an IBM Cloud account).

Refer to free-trial to try watsonx.ai for free.

More info can be found here

Auto-configuration

Spring AI provides Spring Boot auto-configuration for the watsonx.ai Chat Client. To enable it, add the following dependency to your project’s Maven pom.xml file:

<dependency>
   <groupId>org.springframework.ai</groupId>
   <artifactId>spring-ai-watsonx-ai-spring-boot-starter</artifactId>
</dependency>

Or add it to your Gradle build.gradle build file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-watsonx-ai-spring-boot-starter'
}

Chat Properties

Connection Properties

The prefix spring.ai.watsonx.ai is the property prefix for the connection to watsonx.ai.

| Property | Description | Default |
|---|---|---|
| spring.ai.watsonx.ai.base-url | The URL to connect to | us-south.ml.cloud.ibm.com |
| spring.ai.watsonx.ai.stream-endpoint | The streaming endpoint | generation/stream?version=2023-05-29 |
| spring.ai.watsonx.ai.text-endpoint | The text endpoint | generation/text?version=2023-05-29 |
| spring.ai.watsonx.ai.project-id | The project ID | - |
| spring.ai.watsonx.ai.iam-token | The IBM Cloud account IAM token | - |
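
As a rough sketch, the two connection properties that have no default could be set in application.properties as shown here. The environment variable names WATSONX_PROJECT_ID and WATSONX_IAM_TOKEN are assumptions chosen for illustration; any Spring property source can supply the values.

```properties
# Assumed environment variable names; replace with whatever your deployment provides.
spring.ai.watsonx.ai.project-id=${WATSONX_PROJECT_ID}
spring.ai.watsonx.ai.iam-token=${WATSONX_IAM_TOKEN}
```

Reading the IAM token from the environment keeps the credential out of the checked-in configuration file.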

Configuration Properties

The prefix spring.ai.watsonx.ai.chat is the property prefix that lets you configure the chat model implementation for watsonx.ai.

| Property | Description | Default |
|---|---|---|
| spring.ai.watsonx.ai.chat.enabled | Enable the watsonx.ai chat model. | true |
| spring.ai.watsonx.ai.chat.options.temperature | The temperature of the model. Increasing the temperature makes the model answer more creatively. | 0.7 |
| spring.ai.watsonx.ai.chat.options.top-p | Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.2) will generate more focused and conservative text. | 1.0 |
| spring.ai.watsonx.ai.chat.options.top-k | Reduces the probability of generating nonsense. A higher value (e.g., 100) will give more diverse answers, while a lower value (e.g., 10) will be more conservative. | 50 |
| spring.ai.watsonx.ai.chat.options.decoding-method | Decoding is the process that a model uses to choose the tokens in the generated output. | greedy |
| spring.ai.watsonx.ai.chat.options.max-new-tokens | Sets the maximum number of tokens the LLM may generate. | 20 |
| spring.ai.watsonx.ai.chat.options.min-new-tokens | Sets the minimum number of tokens the LLM must generate. | 0 |
| spring.ai.watsonx.ai.chat.options.stop-sequences | Sets when the LLM should stop. For example, with ["\n\n\n"] the LLM terminates when it generates three consecutive line breaks. Stop sequences are ignored until the number of tokens specified by min-new-tokens has been generated. | - |
| spring.ai.watsonx.ai.chat.options.repetition-penalty | Sets how strongly to penalize repetitions. A higher value (e.g., 1.8) will penalize repetitions more strongly, while a lower value (e.g., 1.1) will be more lenient. | 1.0 |
| spring.ai.watsonx.ai.chat.options.random-seed | To produce repeatable results, set the same random seed value every time. | randomly generated |
| spring.ai.watsonx.ai.chat.options.model | The identifier of the LLM model to use. | google/flan-ul2 |
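
To make the top-k and top-p descriptions more concrete, the following self-contained sketch shows one common way such filters narrow the set of candidate tokens before one is sampled. It is an illustration only and not the watsonx.ai implementation: the class SamplingFilterSketch, the Candidate record, the example probabilities, and the order in which the two filters are applied are all assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: a hypothetical sketch, not how watsonx.ai is implemented.
final class SamplingFilterSketch {

    /** A candidate next token and its probability. */
    record Candidate(String token, double probability) {}

    /**
     * Walks the candidates from most to least probable and keeps them until
     * either topK candidates are kept or their cumulative probability reaches topP.
     */
    static List<Candidate> filter(List<Candidate> candidates, int topK, double topP) {
        List<Candidate> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble(Candidate::probability).reversed());

        List<Candidate> kept = new ArrayList<>();
        double cumulative = 0.0;
        for (Candidate candidate : sorted) {
            if (kept.size() >= topK || cumulative >= topP) {
                break;
            }
            kept.add(candidate);
            cumulative += candidate.probability();
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up probabilities for four candidate tokens.
        List<Candidate> candidates = List.of(
            new Candidate("ship", 0.50),
            new Candidate("sea", 0.25),
            new Candidate("gold", 0.15),
            new Candidate("rum", 0.10));

        // A small top-k limits the choice to the two most likely tokens.
        System.out.println(filter(candidates, 2, 1.0));
        // A low top-p stops as soon as enough probability mass is covered.
        System.out.println(filter(candidates, 50, 0.5));
    }
}
```

In this sketch, lower values of either parameter shrink the candidate set, which matches the "more focused and conservative" behaviour described in the table; higher values leave more candidates available for sampling.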

Runtime Options

The WatsonxAiChatOptions.java provides model configurations, such as the model to use, the temperature, the repetition penalty, etc.

On start-up, the default options can be configured with the WatsonxAiChatModel(api, options) constructor or the spring.ai.watsonx.ai.chat.options.* properties.

At run-time you can override the default options by adding new, request-specific options to the Prompt call. For example, to override the default temperature for a specific request:

ChatResponse response = chatModel.call(
    new Prompt(
        "Generate the names of 5 famous pirates.",
        WatsonxAiChatOptions.builder()
            .withTemperature(0.4)
            .build()
    ));

In addition to the model-specific WatsonxAiChatOptions.java, you can use a portable ChatOptions instance, created with ChatOptionsBuilder#builder().

For more information, go to watsonx-parameters-info.
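
The idea of request-specific options overriding start-up defaults can be sketched without any Spring AI classes. The example below is a simplified stand-in, not the library's merge logic: OptionsOverrideSketch, the Options record, and its three fields are hypothetical names, and the rule "a null field means not set for this request" is an assumption made for the illustration.

```java
// Illustrative only: a simplified stand-in, not the Spring AI implementation.
final class OptionsOverrideSketch {

    /** Hypothetical options holder; a null field means "not set". */
    record Options(String model, Double temperature, Integer maxNewTokens) {}

    /** Returns the defaults with every non-null request value applied on top. */
    static Options merge(Options defaults, Options request) {
        if (request == null) {
            return defaults; // nothing to override
        }
        return new Options(
            request.model() != null ? request.model() : defaults.model(),
            request.temperature() != null ? request.temperature() : defaults.temperature(),
            request.maxNewTokens() != null ? request.maxNewTokens() : defaults.maxNewTokens());
    }

    public static void main(String[] args) {
        // Defaults as they might be configured at start-up.
        Options defaults = new Options("google/flan-ul2", 0.7, 20);
        // A request that overrides only the temperature.
        Options request = new Options(null, 0.4, null);

        System.out.println(merge(defaults, request));
    }
}
```

Here the merged result keeps the default model and token limit and takes only the temperature from the request, which mirrors the override example above.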

Usage example

import java.util.stream.Collectors;

import org.springframework.ai.chat.messages.SystemMessage;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.watsonx.WatsonxAiChatModel;
import org.springframework.ai.watsonx.WatsonxAiChatOptions;
import org.springframework.beans.factory.annotation.Autowired;

public class MyClass {

    private static final String MODEL = "google/flan-ul2";
    private final WatsonxAiChatModel chatModel;

    @Autowired
    MyClass(WatsonxAiChatModel chatModel) {
        this.chatModel = chatModel;
    }

    // Blocking call: returns the complete generated text in a single response.
    public String generate(String userInput) {

        WatsonxAiChatOptions options = WatsonxAiChatOptions.builder()
            .withModel(MODEL)
            .withDecodingMethod("sample")
            .withRandomSeed(1)
            .build();

        Prompt prompt = new Prompt(new SystemMessage(userInput), options);

        var results = chatModel.call(prompt);

        return results.getResult().getOutput().getContent();
    }

    // Streaming call: collects every streamed chunk, then joins them into one string.
    public String generateStream(String userInput) {

        WatsonxAiChatOptions options = WatsonxAiChatOptions.builder()
            .withModel(MODEL)
            .withDecodingMethod("greedy")
            .withRandomSeed(2)
            .build();

        Prompt prompt = new Prompt(new SystemMessage(userInput), options);

        var results = chatModel.stream(prompt).collectList().block(); // wait until the stream completes

        return results.stream()
            .map(chunk -> chunk.getResult().getOutput().getContent())
            .collect(Collectors.joining());
    }

}