PostgresML Embeddings

Spring AI supports the PostgresML text embedding models.

Embeddings are numeric representations of text: words and sentences are represented as vectors, that is, arrays of numbers. You can find similar pieces of text by comparing their vectors with a distance measure, or use the vectors as input features for other machine learning models, since most algorithms can’t use text directly.
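
As an illustration of comparing vectors with a distance measure, here is a minimal sketch of cosine similarity in plain Java. The class name, method name, and the three-dimensional sample vectors are made up for this example; real embeddings have hundreds of dimensions and would come from the embedding model rather than being written by hand.

```java
public class CosineSimilarityExample {

    // Cosine similarity: dot(a, b) / (|a| * |b|).
    // Values close to 1.0 mean the vectors point in nearly the same direction.
    public static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hand-written toy vectors (assumption: stand-ins for real embeddings).
        float[] cat = {0.9f, 0.1f, 0.0f};
        float[] kitten = {0.8f, 0.2f, 0.1f};
        float[] invoice = {0.0f, 0.1f, 0.9f};

        System.out.println("cat vs kitten:  " + cosineSimilarity(cat, kitten));
        System.out.println("cat vs invoice: " + cosineSimilarity(cat, invoice));
    }
}
```

In this toy data the "cat" and "kitten" vectors score much higher than "cat" and "invoice", which is the property that similarity search relies on.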

Many pre-trained LLMs can be used to generate embeddings from text within PostgresML. You can browse all the available models on Hugging Face to find the best solution.

Add Repositories and BOM

Spring AI artifacts are published in Spring Milestone and Snapshot repositories. Refer to the Repositories section to add these repositories to your build system.

To help with dependency management, Spring AI provides a BOM (bill of materials) to ensure that a consistent version of Spring AI is used throughout the entire project. Refer to the Dependency Management section to add the Spring AI BOM to your build system.

Auto-configuration

Spring AI provides Spring Boot auto-configuration for the PostgresML Embedding Model. To enable it, add the following dependency to your project’s Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-postgresml-spring-boot-starter</artifactId>
</dependency>

or to your Gradle build.gradle file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-postgresml-spring-boot-starter'
}

Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Use the spring.ai.postgresml.embedding.options.* properties to configure your PostgresMlEmbeddingModel.

Embedding Properties

spring.ai.postgresml.embedding is the property prefix that configures the EmbeddingModel implementation for PostgresML embeddings.

Property | Description | Default
spring.ai.postgresml.embedding.enabled | Enable the PostgresML embedding model. | true
spring.ai.postgresml.embedding.create-extension | Execute the SQL 'CREATE EXTENSION IF NOT EXISTS pgml' to enable the extension. | false
spring.ai.postgresml.embedding.options.transformer | The Hugging Face transformer model to use for the embedding. | distilbert-base-uncased
spring.ai.postgresml.embedding.options.kwargs | Additional transformer-specific options. | empty map
spring.ai.postgresml.embedding.options.vectorType | PostgresML vector type to use for the embedding. Two options are supported: PG_ARRAY and PG_VECTOR. | PG_ARRAY
spring.ai.postgresml.embedding.options.metadataMode | Document metadata aggregation mode. | EMBED

All properties prefixed with spring.ai.postgresml.embedding.options can be overridden at runtime by adding request-specific Runtime Options to the EmbeddingRequest call.

Runtime Options

Use PostgresMlEmbeddingOptions.java to configure the PostgresMlEmbeddingModel with options such as the transformer model to use.

At startup, you can pass a PostgresMlEmbeddingOptions instance to the PostgresMlEmbeddingModel constructor to configure the default options used for all embedding requests.

At runtime, you can override the default options by passing a PostgresMlEmbeddingOptions instance in your EmbeddingRequest.

For example, to override the default transformer model for a specific request:

EmbeddingResponse embeddingResponse = embeddingModel.call(
    new EmbeddingRequest(List.of("Hello World", "World is big and salvation is near"),
            PostgresMlEmbeddingOptions.builder()
                .withTransformer("intfloat/e5-small")
                .withVectorType(VectorType.PG_ARRAY)
                .withKwargs(Map.of("device", "gpu"))
                .build()));
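
To picture how request options relate to the defaults, the following is a minimal sketch of a field-by-field merge in plain Java. It is not Spring AI's implementation: the Options record and the merge method are hypothetical stand-ins, and the rule that an unset (null) request field falls back to the default, with the kwargs map replaced as a whole, is an assumption made for illustration.

```java
import java.util.Map;

public class OptionsMergeSketch {

    // Hypothetical stand-in for PostgresMlEmbeddingOptions; a null field means "not set".
    public record Options(String transformer, String vectorType, Map<String, Object> kwargs) {}

    // Request values win when present; otherwise the default value is kept.
    public static Options merge(Options defaults, Options request) {
        if (request == null) {
            return defaults;
        }
        return new Options(
                request.transformer() != null ? request.transformer() : defaults.transformer(),
                request.vectorType() != null ? request.vectorType() : defaults.vectorType(),
                request.kwargs() != null ? request.kwargs() : defaults.kwargs());
    }

    public static void main(String[] args) {
        Options defaults = new Options("distilbert-base-uncased", "PG_ARRAY", Map.of("device", "cpu"));
        // Only the transformer is overridden for this request.
        Options perRequest = new Options("intfloat/e5-small", null, null);

        Options effective = merge(defaults, perRequest);
        System.out.println(effective);
    }
}
```

Under these assumptions, the effective options for the request use the overridden transformer while keeping the default vector type and kwargs.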

Sample Controller

The auto-configuration creates an EmbeddingModel implementation that you can inject into your classes. Here is an example of the application properties and a simple @RestController class that uses the EmbeddingModel implementation.

spring.ai.postgresml.embedding.options.transformer=distilbert-base-uncased
spring.ai.postgresml.embedding.options.vectorType=PG_ARRAY
spring.ai.postgresml.embedding.options.metadataMode=EMBED
spring.ai.postgresml.embedding.options.kwargs.device=cpu

@RestController
public class EmbeddingController {

    private final EmbeddingModel embeddingModel;

    @Autowired
    public EmbeddingController(EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }

    @GetMapping("/ai/embedding")
    public Map<String, EmbeddingResponse> embed(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        EmbeddingResponse embeddingResponse = this.embeddingModel.embedForResponse(List.of(message));
        return Map.of("embedding", embeddingResponse);
    }
}

Manual configuration

Instead of using the Spring Boot auto-configuration, you can create the PostgresMlEmbeddingModel manually. To do so, add the spring-ai-postgresml dependency to your project’s Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-postgresml</artifactId>
</dependency>

or to your Gradle build.gradle file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-postgresml'
}

Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Next, create a PostgresMlEmbeddingModel instance and use it to compute the embeddings of two input texts:

var jdbcTemplate = new JdbcTemplate(dataSource); // your PostgresML data source

PostgresMlEmbeddingModel embeddingModel = new PostgresMlEmbeddingModel(jdbcTemplate,
        PostgresMlEmbeddingOptions.builder()
            .withTransformer("distilbert-base-uncased") // Hugging Face transformer model name.
            .withVectorType(VectorType.PG_VECTOR) // vector type in PostgreSQL.
            .withKwargs(Map.of("device", "cpu")) // optional arguments.
            .withMetadataMode(MetadataMode.EMBED) // Document metadata mode.
            .build());

embeddingModel.afterPropertiesSet(); // initialize the jdbc template and database.

EmbeddingResponse embeddingResponse = embeddingModel
        .embedForResponse(List.of("Hello World", "World is big and salvation is near"));

When you create the model manually, you must call afterPropertiesSet() after setting the properties and before using the model. It is more convenient (and preferred) to create the PostgresMlEmbeddingModel as a @Bean; then you don’t have to call afterPropertiesSet() manually:

@Bean
public EmbeddingModel embeddingModel(JdbcTemplate jdbcTemplate) {
    return new PostgresMlEmbeddingModel(jdbcTemplate,
        PostgresMlEmbeddingOptions.builder()
             ....
            .build());
}