watsonx.ai Chat
With watsonx.ai you can run various Large Language Models (LLMs) and generate text from them.
Spring AI supports watsonx.ai text generation with WatsonxAiChatModel.
Prerequisites
You first need to have a SaaS instance of watsonx.ai (as well as an IBM Cloud account).
Refer to free-trial to try watsonx.ai for free.
More info can be found here.
Auto-configuration
Spring AI provides Spring Boot auto-configuration for the watsonx.ai Chat Client.
To enable it, add the following dependency to your project’s Maven pom.xml file:
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-watsonx-ai-spring-boot-starter</artifactId>
    </dependency>
or to your Gradle build.gradle build file:
    dependencies {
        implementation 'org.springframework.ai:spring-ai-watsonx-ai-spring-boot-starter'
    }
Chat Properties
Connection Properties
The prefix spring.ai.watsonx.ai is used as the property prefix that lets you connect to watsonx.ai.
Property | Description | Default |
---|---|---|
spring.ai.watsonx.ai.base-url | The URL to connect to | |
spring.ai.watsonx.ai.stream-endpoint | The streaming endpoint | ml/v1/text/generation_stream?version=2023-05-29 |
spring.ai.watsonx.ai.text-endpoint | The text endpoint | ml/v1/text/generation?version=2023-05-29 |
spring.ai.watsonx.ai.project-id | The project ID | - |
spring.ai.watsonx.ai.iam-token | The IBM Cloud account IAM token | - |
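As a rough illustration of how these connection properties fit together, the plain-Java sketch below reads them from a properties snippet and combines the base URL with the text endpoint. It assumes the endpoint is resolved relative to the base URL, which is an assumption rather than documented behaviour. The class name `ConnectionPropertiesSketch`, the host, the project ID and the token are all placeholders.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Properties;

public class ConnectionPropertiesSketch {

    // Example application.properties content. Host, project ID and token are placeholders.
    static final String EXAMPLE = """
            spring.ai.watsonx.ai.base-url=https://watsonx.example.invalid/
            spring.ai.watsonx.ai.text-endpoint=ml/v1/text/generation?version=2023-05-29
            spring.ai.watsonx.ai.project-id=my-project-id
            spring.ai.watsonx.ai.iam-token=my-iam-token
            """;

    /** Parses properties-file syntax from a string. */
    static Properties load(String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Assumption: the text endpoint is a path relative to the base URL. */
    static URI textGenerationUri(Properties props) {
        URI base = URI.create(props.getProperty("spring.ai.watsonx.ai.base-url"));
        return base.resolve(props.getProperty("spring.ai.watsonx.ai.text-endpoint"));
    }

    public static void main(String[] args) {
        Properties props = load(EXAMPLE);
        System.out.println(textGenerationUri(props));
    }
}
```

The placeholder base URL ends with a slash so that `URI.resolve` appends the endpoint path to the host instead of merging it into the host name.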
Configuration Properties
The prefix spring.ai.watsonx.ai.chat is the property prefix that lets you configure the chat model implementation for watsonx.ai.
Property | Description | Default |
---|---|---|
spring.ai.watsonx.ai.chat.enabled | Enable the watsonx.ai chat model. | true |
spring.ai.watsonx.ai.chat.options.temperature | The temperature of the model. Increasing the temperature will make the model answer more creatively. | 0.7 |
spring.ai.watsonx.ai.chat.options.top-p | Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.2) will generate more focused and conservative text. | 1.0 |
spring.ai.watsonx.ai.chat.options.top-k | Reduces the probability of generating nonsense. A higher value (e.g., 100) will give more diverse answers, while a lower value (e.g., 10) will be more conservative. | 50 |
spring.ai.watsonx.ai.chat.options.decoding-method | Decoding is the process that a model uses to choose the tokens in the generated output. | greedy |
spring.ai.watsonx.ai.chat.options.max-new-tokens | Sets the maximum number of tokens the LLM may generate. | 20 |
spring.ai.watsonx.ai.chat.options.min-new-tokens | Sets the minimum number of tokens the LLM must generate. | 0 |
spring.ai.watsonx.ai.chat.options.stop-sequences | Sets when the LLM should stop. For example, with ["\n\n\n"] the LLM terminates once it generates three consecutive line breaks. Stop sequences are ignored until the number of tokens specified by min-new-tokens has been generated. | - |
spring.ai.watsonx.ai.chat.options.repetition-penalty | Sets how strongly to penalize repetitions. A higher value (e.g., 1.8) will penalize repetitions more strongly, while a lower value (e.g., 1.1) will be more lenient. | 1.0 |
spring.ai.watsonx.ai.chat.options.random-seed | To produce repeatable results, set the same random seed value every time. | randomly generated |
spring.ai.watsonx.ai.chat.options.model | The identifier of the LLM model to be used. | google/flan-ul2 |
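To make the interplay of temperature, top-k and top-p in the table above more concrete, here is a toy sketch in plain Java. It is not how watsonx.ai implements sampling. The class name `SamplingSketch`, the token scores, and the order of the filters (temperature first, then top-k, then top-p) are assumptions chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SamplingSketch {

    /** Converts raw scores to probabilities with a temperature-scaled softmax. */
    static Map<String, Double> softmax(Map<String, Double> logits, double temperature) {
        double max = logits.values().stream().mapToDouble(Double::doubleValue).max().orElse(0.0);
        Map<String, Double> probs = new LinkedHashMap<>();
        double sum = 0.0;
        for (Map.Entry<String, Double> e : logits.entrySet()) {
            double weight = Math.exp((e.getValue() - max) / temperature);
            probs.put(e.getKey(), weight);
            sum += weight;
        }
        final double total = sum;
        probs.replaceAll((token, weight) -> weight / total);
        return probs;
    }

    /**
     * Keeps at most topK of the most likely tokens, stopping early once their
     * cumulative probability reaches topP.
     */
    static List<String> candidates(Map<String, Double> probs, int topK, double topP) {
        List<Map.Entry<String, Double>> sorted = new ArrayList<>(probs.entrySet());
        sorted.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> kept = new ArrayList<>();
        double cumulative = 0.0;
        for (Map.Entry<String, Double> e : sorted) {
            if (kept.size() >= topK || cumulative >= topP) {
                break;
            }
            kept.add(e.getKey());
            cumulative += e.getValue();
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical next-token scores.
        Map<String, Double> logits = new LinkedHashMap<>();
        logits.put("pirate", 4.0);
        logits.put("sailor", 3.0);
        logits.put("captain", 2.0);
        logits.put("banana", 0.0);

        Map<String, Double> probs = softmax(logits, 1.0);
        System.out.println("defaults (top-k 50, top-p 1.0): " + candidates(probs, 50, 1.0));
        System.out.println("top-k 2:                        " + candidates(probs, 2, 1.0));
        System.out.println("top-p 0.5:                      " + candidates(probs, 50, 0.5));
    }
}
```

In this sketch a lower temperature concentrates probability on the highest-scoring token, while lower top-k or top-p values shrink the set of tokens the sampler may choose from. With the `greedy` decoding method only the single most likely token would be picked, so these settings matter mainly when sampling.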
Runtime Options
The WatsonxAiChatOptions.java class provides model configurations, such as the model to use, the temperature, the repetition penalty, etc.
On start-up, the default options can be configured with the WatsonxAiChatModel(api, options) constructor or the spring.ai.watsonx.ai.chat.options.* properties.
At run-time you can override the default options by adding new, request-specific options to the Prompt call.
For example, to override the default temperature for a specific request:
    ChatResponse response = chatModel.call(
        new Prompt(
            "Generate the names of 5 famous pirates.",
            WatsonxAiChatOptions.builder()
                .withTemperature(0.4)
                .build()
        ));
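The override behaviour can be pictured as a merge of two option sets, where any value set on the request wins over the start-up default. The plain-Java sketch below illustrates only that idea. The `Options` record and the `merge` helper are hypothetical and are not Spring AI classes, and the actual merge logic inside WatsonxAiChatModel may differ.

```java
public class OptionMergeSketch {

    /** Hypothetical stand-in for chat options; null means "not set". */
    record Options(String model, Double temperature, Integer maxNewTokens) {}

    /** Returns options in which each request-level value, if present, replaces the default. */
    static Options merge(Options defaults, Options request) {
        if (request == null) {
            return defaults;
        }
        return new Options(
                request.model() != null ? request.model() : defaults.model(),
                request.temperature() != null ? request.temperature() : defaults.temperature(),
                request.maxNewTokens() != null ? request.maxNewTokens() : defaults.maxNewTokens());
    }

    public static void main(String[] args) {
        // Defaults taken from the configuration property table.
        Options defaults = new Options("google/flan-ul2", 0.7, 20);
        // The request sets only the temperature, as in the example above.
        Options request = new Options(null, 0.4, null);
        System.out.println(merge(defaults, request));
    }
}
```

Here the merged result keeps the default model and token limit and takes only the temperature from the request.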
In addition to the model-specific WatsonxAiChatOptions.java you can use a portable ChatOptions instance, created with the ChatOptionsBuilder#builder().
For more information go to watsonx-parameters-info.
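The stop-sequences, min-new-tokens and max-new-tokens options listed under Configuration Properties interact with each other, which a small simulation can make easier to follow. The plain-Java sketch below is a toy model of that rule, not the watsonx.ai implementation. The class `StopSequenceSketch` is hypothetical, and it assumes that the stop sequence itself is kept in the output and that the check starts once exactly min-new-tokens tokens have been produced.

```java
import java.util.List;

public class StopSequenceSketch {

    /**
     * Returns the text emitted for a sequence of generated tokens. Generation ends when
     * maxNewTokens is reached, or when the output ends with a stop sequence after at
     * least minNewTokens tokens have been produced.
     */
    static String generate(List<String> tokens, int minNewTokens, int maxNewTokens,
            List<String> stopSequences) {
        StringBuilder out = new StringBuilder();
        int produced = 0;
        for (String token : tokens) {
            if (produced >= maxNewTokens) {
                break;
            }
            out.append(token);
            produced++;
            // Stop sequences are ignored until the minimum token count is reached.
            if (produced >= minNewTokens) {
                for (String stop : stopSequences) {
                    if (out.toString().endsWith(stop)) {
                        return out.toString();
                    }
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical token stream containing two blank-line separators.
        List<String> tokens = List.of("Blackbeard", "\n\n", "Anne Bonny", "\n\n", "Calico Jack");
        List<String> stop = List.of("\n\n");
        System.out.println(generate(tokens, 0, 20, stop).length());
        System.out.println(generate(tokens, 3, 20, stop).length());
    }
}
```

With a minimum of 0 the first separator ends generation. With a minimum of 3 the first separator is ignored and the second one ends it.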
Usage example
    import java.util.stream.Collectors;

    import org.springframework.ai.chat.messages.SystemMessage;
    import org.springframework.ai.chat.prompt.Prompt;
    import org.springframework.ai.watsonx.WatsonxAiChatModel;
    import org.springframework.ai.watsonx.WatsonxAiChatOptions;
    import org.springframework.beans.factory.annotation.Autowired;

    public class MyClass {

        private static final String MODEL = "google/flan-ul2";

        private final WatsonxAiChatModel chatModel;

        @Autowired
        MyClass(WatsonxAiChatModel chatModel) {
            this.chatModel = chatModel;
        }

        // Blocking call: returns the complete generated text.
        public String generate(String userInput) {
            WatsonxAiChatOptions options = WatsonxAiChatOptions.builder()
                .withModel(MODEL)
                .withDecodingMethod("sample")
                .withRandomSeed(1)
                .build();
            Prompt prompt = new Prompt(new SystemMessage(userInput), options);
            var results = this.chatModel.call(prompt);
            return results.getResult().getOutput().getContent();
        }

        // Streaming call: collects all chunks, then joins them into one string.
        public String generateStream(String userInput) {
            WatsonxAiChatOptions options = WatsonxAiChatOptions.builder()
                .withModel(MODEL)
                .withDecodingMethod("greedy")
                .withRandomSeed(2)
                .build();
            Prompt prompt = new Prompt(new SystemMessage(userInput), options);
            // Wait until the stream is resolved (completed).
            var results = this.chatModel.stream(prompt).collectList().block();
            return results.stream()
                .map(generation -> generation.getResult().getOutput().getContent())
                .collect(Collectors.joining());
        }
    }