© 2013-2021 The original author(s).

Copies of this document may be made for your own use and for distribution to others, provided that you do not charge any fee for such copies and further provided that each copy contains this Copyright Notice, whether distributed in print or electronically.

Preface

The Spring Data Elasticsearch project applies core Spring concepts to the development of solutions using the Elasticsearch Search Engine. It provides:

  • Templates as a high-level abstraction for storing, searching, sorting documents and building aggregations.

  • Repositories which for example enable the user to express queries by defining interfaces having customized method names (for basic information about repositories see Working with Spring Data Repositories).

You will notice similarities to the Spring data solr and mongodb support in the Spring Framework.

1. What’s new

1.1. New in Spring Data Elasticsearch 4.2

  • Upgrade to Elasticsearch 7.10.0.

  • Support for custom routing values

1.2. New in Spring Data Elasticsearch 4.1

  • Uses Spring 5.3.

  • Upgrade to Elasticsearch 7.9.3.

  • Improved API for alias management.

  • Introduction of ReactiveIndexOperations for index management.

  • Index templates support.

  • Support for Geo-shape data with GeoJson.

1.3. New in Spring Data Elasticsearch 4.0

  • Uses Spring 5.2.

  • Upgrade to Elasticsearch 7.6.2.

  • Deprecation of TransportClient usage.

  • Implements most of the mapping-types available for the index mappings.

  • Removal of the Jackson ObjectMapper, now using the MappingElasticsearchConverter

  • Cleanup of the API in the *Operations interfaces, grouping and renaming methods so that they match the Elasticsearch API, deprecating the old methods, aligning with other Spring Data modules.

  • Introduction of SearchHit<T> class to represent a found document together with the relevant result metadata for this document (i.e. sortValues).

  • Introduction of the SearchHits<T> class to represent a whole search result together with the metadata for the complete search result (i.e. max_score).

  • Introduction of SearchPage<T> class to represent a paged result containing a SearchHits<T> instance.

  • Introduction of the GeoDistanceOrder class to be able to create sorting by geographical distance

  • Implementation of Auditing Support

  • Implementation of lifecycle entity callbacks

1.4. New in Spring Data Elasticsearch 3.2

2. Project Metadata

3. Requirements

Requires an installation of Elasticsearch.

3.1. Versions

The following table shows the Elasticsearch versions that are used by Spring Data release trains and version of Spring Data Elasticsearch included in that, as well as the Spring Boot versions referring to that particular Spring Data release train:

Spring Data Release Train Spring Data Elasticsearch Elasticsearch Spring Framework Spring Boot

2021.0 (Pascal)

4.2.1

7.12.1

5.3.7

2.5.x

2020.0 (Ockham)

4.1.x

7.9.3

5.3.2

2.4.x

Neumann

4.0.x

7.6.2

5.2.12

2.3.x

Moore

3.2.x

6.8.12

5.2.12

2.2.x

Lovelace[1]

3.1.x[1]

6.2.2

5.1.19

2.1.x

Kay[1]

3.0.x[1]

5.5.0

5.0.13

2.0.x

Ingalls[1]

2.1.x[1]

2.4.0

4.3.25

1.5.x

Support for upcoming versions of Elasticsearch is being tracked and general compatibility should be given assuming the usage of the high-level REST client.

4. Working with Spring Data Repositories

The goal of the Spring Data repository abstraction is to significantly reduce the amount of boilerplate code required to implement data access layers for various persistence stores.

Spring Data repository documentation and your module

This chapter explains the core concepts and interfaces of Spring Data repositories. The information in this chapter is pulled from the Spring Data Commons module. It uses the configuration and code samples for the Java Persistence API (JPA) module. You should adapt the XML namespace declaration and the types to be extended to the equivalents of the particular module that you use. “Namespace reference” covers XML configuration, which is supported across all Spring Data modules that support the repository API. “Repository query keywords” covers the query method keywords supported by the repository abstraction in general. For detailed information on the specific features of your module, see the chapter on that module of this document.

4.1. Core concepts

The central interface in the Spring Data repository abstraction is Repository. It takes the domain class to manage as well as the ID type of the domain class as type arguments. This interface acts primarily as a marker interface to capture the types to work with and to help you to discover interfaces that extend this one. The CrudRepository interface provides sophisticated CRUD functionality for the entity class that is being managed.

Example 1. CrudRepository Interface
public interface CrudRepository<T, ID> extends Repository<T, ID> {

  <S extends T> S save(S entity);      (1)

  Optional<T> findById(ID primaryKey); (2)

  Iterable<T> findAll();               (3)

  long count();                        (4)

  void delete(T entity);               (5)

  boolean existsById(ID primaryKey);   (6)

  // … more functionality omitted.
}
1 Saves the given entity.
2 Returns the entity identified by the given ID.
3 Returns all entities.
4 Returns the number of entities.
5 Deletes the given entity.
6 Indicates whether an entity with the given ID exists.
We also provide persistence technology-specific abstractions, such as JpaRepository or MongoRepository. Those interfaces extend CrudRepository and expose the capabilities of the underlying persistence technology in addition to the rather generic persistence technology-agnostic interfaces such as CrudRepository.

On top of the CrudRepository, there is a PagingAndSortingRepository abstraction that adds additional methods to ease paginated access to entities:

Example 2. PagingAndSortingRepository interface
public interface PagingAndSortingRepository<T, ID> extends CrudRepository<T, ID> {

  Iterable<T> findAll(Sort sort);

  Page<T> findAll(Pageable pageable);
}

To access the second page of User by a page size of 20, you could do something like the following:

PagingAndSortingRepository<User, Long> repository = // … get access to a bean
Page<User> users = repository.findAll(PageRequest.of(1, 20));

In addition to query methods, query derivation for both count and delete queries is available. The following list shows the interface definition for a derived count query:

Example 3. Derived Count Query
interface UserRepository extends CrudRepository<User, Long> {

  long countByLastname(String lastname);
}

The following listing shows the interface definition for a derived delete query:

Example 4. Derived Delete Query
interface UserRepository extends CrudRepository<User, Long> {

  long deleteByLastname(String lastname);

  List<User> removeByLastname(String lastname);
}

4.2. Query Methods

Standard CRUD functionality repositories usually have queries on the underlying datastore. With Spring Data, declaring those queries becomes a four-step process:

  1. Declare an interface extending Repository or one of its subinterfaces and type it to the domain class and ID type that it should handle, as shown in the following example:

    interface PersonRepository extends Repository<Person, Long> { … }
  2. Declare query methods on the interface.

    interface PersonRepository extends Repository<Person, Long> {
      List<Person> findByLastname(String lastname);
    }
  3. Set up Spring to create proxy instances for those interfaces, either with JavaConfig or with XML configuration.

    1. To use Java configuration, create a class similar to the following:

      import org.springframework.data.jpa.repository.config.EnableJpaRepositories;
      
      @EnableJpaRepositories
      class Config { … }
    2. To use XML configuration, define a bean similar to the following:

      <?xml version="1.0" encoding="UTF-8"?>
      <beans xmlns="http://www.springframework.org/schema/beans"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:jpa="http://www.springframework.org/schema/data/jpa"
         xsi:schemaLocation="http://www.springframework.org/schema/beans
           https://www.springframework.org/schema/beans/spring-beans.xsd
           http://www.springframework.org/schema/data/jpa
           https://www.springframework.org/schema/data/jpa/spring-jpa.xsd">
      
         <jpa:repositories base-package="com.acme.repositories"/>
      
      </beans>

      The JPA namespace is used in this example. If you use the repository abstraction for any other store, you need to change this to the appropriate namespace declaration of your store module. In other words, you should exchange jpa in favor of, for example, mongodb.

      Also, note that the JavaConfig variant does not configure a package explicitly, because the package of the annotated class is used by default. To customize the package to scan, use one of the basePackage… attributes of the data-store-specific repository’s @Enable${store}Repositories-annotation.

  4. Inject the repository instance and use it, as shown in the following example:

    class SomeClient {
    
      private final PersonRepository repository;
    
      SomeClient(PersonRepository repository) {
        this.repository = repository;
      }
    
      void doSomething() {
        List<Person> persons = repository.findByLastname("Matthews");
      }
    }

The sections that follow explain each step in detail:

4.3. Defining Repository Interfaces

To define a repository interface, you first need to define a domain class-specific repository interface. The interface must extend Repository and be typed to the domain class and an ID type. If you want to expose CRUD methods for that domain type, extend CrudRepository instead of Repository.

4.3.1. Fine-tuning Repository Definition

Typically, your repository interface extends Repository, CrudRepository, or PagingAndSortingRepository. Alternatively, if you do not want to extend Spring Data interfaces, you can also annotate your repository interface with @RepositoryDefinition. Extending CrudRepository exposes a complete set of methods to manipulate your entities. If you prefer to be selective about the methods being exposed, copy the methods you want to expose from CrudRepository into your domain repository.

Doing so lets you define your own abstractions on top of the provided Spring Data Repositories functionality.

The following example shows how to selectively expose CRUD methods (findById and save, in this case):

Example 5. Selectively exposing CRUD methods
@NoRepositoryBean
interface MyBaseRepository<T, ID> extends Repository<T, ID> {

  Optional<T> findById(ID id);

  <S extends T> S save(S entity);
}

interface UserRepository extends MyBaseRepository<User, Long> {
  User findByEmailAddress(EmailAddress emailAddress);
}

In the prior example, you defined a common base interface for all your domain repositories and exposed findById(…) as well as save(…).These methods are routed into the base repository implementation of the store of your choice provided by Spring Data (for example, if you use JPA, the implementation is SimpleJpaRepository), because they match the method signatures in CrudRepository. So the UserRepository can now save users, find individual users by ID, and trigger a query to find Users by email address.

The intermediate repository interface is annotated with @NoRepositoryBean. Make sure you add that annotation to all repository interfaces for which Spring Data should not create instances at runtime.

4.3.2. Using Repositories with Multiple Spring Data Modules

Using a unique Spring Data module in your application makes things simple, because all repository interfaces in the defined scope are bound to the Spring Data module. Sometimes, applications require using more than one Spring Data module. In such cases, a repository definition must distinguish between persistence technologies. When it detects multiple repository factories on the class path, Spring Data enters strict repository configuration mode. Strict configuration uses details on the repository or the domain class to decide about Spring Data module binding for a repository definition:

  1. If the repository definition extends the module-specific repository, it is a valid candidate for the particular Spring Data module.

  2. If the domain class is annotated with the module-specific type annotation, it is a valid candidate for the particular Spring Data module. Spring Data modules accept either third-party annotations (such as JPA’s @Entity) or provide their own annotations (such as @Document for Spring Data MongoDB and Spring Data Elasticsearch).

The following example shows a repository that uses module-specific interfaces (JPA in this case):

Example 6. Repository definitions using module-specific interfaces
interface MyRepository extends JpaRepository<User, Long> { }

@NoRepositoryBean
interface MyBaseRepository<T, ID> extends JpaRepository<T, ID> { … }

interface UserRepository extends MyBaseRepository<User, Long> { … }

MyRepository and UserRepository extend JpaRepository in their type hierarchy. They are valid candidates for the Spring Data JPA module.

The following example shows a repository that uses generic interfaces:

Example 7. Repository definitions using generic interfaces
interface AmbiguousRepository extends Repository<User, Long> { … }

@NoRepositoryBean
interface MyBaseRepository<T, ID> extends CrudRepository<T, ID> { … }

interface AmbiguousUserRepository extends MyBaseRepository<User, Long> { … }

AmbiguousRepository and AmbiguousUserRepository extend only Repository and CrudRepository in their type hierarchy. While this is fine when using a unique Spring Data module, multiple modules cannot distinguish to which particular Spring Data these repositories should be bound.

The following example shows a repository that uses domain classes with annotations:

Example 8. Repository definitions using domain classes with annotations
interface PersonRepository extends Repository<Person, Long> { … }

@Entity
class Person { … }

interface UserRepository extends Repository<User, Long> { … }

@Document
class User { … }

PersonRepository references Person, which is annotated with the JPA @Entity annotation, so this repository clearly belongs to Spring Data JPA. UserRepository references User, which is annotated with Spring Data MongoDB’s @Document annotation.

The following bad example shows a repository that uses domain classes with mixed annotations:

Example 9. Repository definitions using domain classes with mixed annotations
interface JpaPersonRepository extends Repository<Person, Long> { … }

interface MongoDBPersonRepository extends Repository<Person, Long> { … }

@Entity
@Document
class Person { … }

This example shows a domain class using both JPA and Spring Data MongoDB annotations. It defines two repositories, JpaPersonRepository and MongoDBPersonRepository. One is intended for JPA and the other for MongoDB usage. Spring Data is no longer able to tell the repositories apart, which leads to undefined behavior.

Repository type details and distinguishing domain class annotations are used for strict repository configuration to identify repository candidates for a particular Spring Data module. Using multiple persistence technology-specific annotations on the same domain type is possible and enables reuse of domain types across multiple persistence technologies. However, Spring Data can then no longer determine a unique module with which to bind the repository.

The last way to distinguish repositories is by scoping repository base packages. Base packages define the starting points for scanning for repository interface definitions, which implies having repository definitions located in the appropriate packages. By default, annotation-driven configuration uses the package of the configuration class. The base package in XML-based configuration is mandatory.

The following example shows annotation-driven configuration of base packages:

Example 10. Annotation-driven configuration of base packages
@EnableJpaRepositories(basePackages = "com.acme.repositories.jpa")
@EnableMongoRepositories(basePackages = "com.acme.repositories.mongo")
class Configuration { … }

4.4. Defining Query Methods

The repository proxy has two ways to derive a store-specific query from the method name:

  • By deriving the query from the method name directly.

  • By using a manually defined query.

Available options depend on the actual store. However, there must be a strategy that decides what actual query is created. The next section describes the available options.

4.4.1. Query Lookup Strategies

The following strategies are available for the repository infrastructure to resolve the query. With XML configuration, you can configure the strategy at the namespace through the query-lookup-strategy attribute. For Java configuration, you can use the queryLookupStrategy attribute of the Enable${store}Repositories annotation. Some strategies may not be supported for particular datastores.

  • CREATE attempts to construct a store-specific query from the query method name. The general approach is to remove a given set of well known prefixes from the method name and parse the rest of the method. You can read more about query construction in “Query Creation”.

  • USE_DECLARED_QUERY tries to find a declared query and throws an exception if it cannot find one. The query can be defined by an annotation somewhere or declared by other means. See the documentation of the specific store to find available options for that store. If the repository infrastructure does not find a declared query for the method at bootstrap time, it fails.

  • CREATE_IF_NOT_FOUND (the default) combines CREATE and USE_DECLARED_QUERY. It looks up a declared query first, and, if no declared query is found, it creates a custom method name-based query. This is the default lookup strategy and, thus, is used if you do not configure anything explicitly. It allows quick query definition by method names but also custom-tuning of these queries by introducing declared queries as needed.

4.4.2. Query Creation

The query builder mechanism built into the Spring Data repository infrastructure is useful for building constraining queries over entities of the repository.

The following example shows how to create a number of queries:

Example 11. Query creation from method names
interface PersonRepository extends Repository<Person, Long> {

  List<Person> findByEmailAddressAndLastname(EmailAddress emailAddress, String lastname);

  // Enables the distinct flag for the query
  List<Person> findDistinctPeopleByLastnameOrFirstname(String lastname, String firstname);
  List<Person> findPeopleDistinctByLastnameOrFirstname(String lastname, String firstname);

  // Enabling ignoring case for an individual property
  List<Person> findByLastnameIgnoreCase(String lastname);
  // Enabling ignoring case for all suitable properties
  List<Person> findByLastnameAndFirstnameAllIgnoreCase(String lastname, String firstname);

  // Enabling static ORDER BY for a query
  List<Person> findByLastnameOrderByFirstnameAsc(String lastname);
  List<Person> findByLastnameOrderByFirstnameDesc(String lastname);
}

Parsing query method names is divided into subject and predicate. The first part (find…By, exists…By) defines the subject of the query, the second part forms the predicate. The introducing clause (subject) can contain further expressions. Any text between find (or other introducing keywords) and By is considered to be descriptive unless using one of the result-limiting keywords such as a Distinct to set a distinct flag on the query to be created or Top/First to limit query results.

The appendix contains the full list of query method subject keywords and query method predicate keywords including sorting and letter-casing modifiers. However, the first By acts as a delimiter to indicate the start of the actual criteria predicate. At a very basic level, you can define conditions on entity properties and concatenate them with And and Or.

The actual result of parsing the method depends on the persistence store for which you create the query. However, there are some general things to notice:

  • The expressions are usually property traversals combined with operators that can be concatenated. You can combine property expressions with AND and OR. You also get support for operators such as Between, LessThan, GreaterThan, and Like for the property expressions. The supported operators can vary by datastore, so consult the appropriate part of your reference documentation.

  • The method parser supports setting an IgnoreCase flag for individual properties (for example, findByLastnameIgnoreCase(…)) or for all properties of a type that supports ignoring case (usually String instances — for example, findByLastnameAndFirstnameAllIgnoreCase(…)). Whether ignoring cases is supported may vary by store, so consult the relevant sections in the reference documentation for the store-specific query method.

  • You can apply static ordering by appending an OrderBy clause to the query method that references a property and by providing a sorting direction (Asc or Desc). To create a query method that supports dynamic sorting, see “Special parameter handling”.

4.4.3. Property Expressions

Property expressions can refer only to a direct property of the managed entity, as shown in the preceding example. At query creation time, you already make sure that the parsed property is a property of the managed domain class. However, you can also define constraints by traversing nested properties. Consider the following method signature:

List<Person> findByAddressZipCode(ZipCode zipCode);

Assume a Person has an Address with a ZipCode. In that case, the method creates the x.address.zipCode property traversal. The resolution algorithm starts by interpreting the entire part (AddressZipCode) as the property and checks the domain class for a property with that name (uncapitalized). If the algorithm succeeds, it uses that property. If not, the algorithm splits up the source at the camel-case parts from the right side into a head and a tail and tries to find the corresponding property — in our example, AddressZip and Code. If the algorithm finds a property with that head, it takes the tail and continues building the tree down from there, splitting the tail up in the way just described. If the first split does not match, the algorithm moves the split point to the left (Address, ZipCode) and continues.

Although this should work for most cases, it is possible for the algorithm to select the wrong property. Suppose the Person class has an addressZip property as well. The algorithm would match in the first split round already, choose the wrong property, and fail (as the type of addressZip probably has no code property).

To resolve this ambiguity you can use _ inside your method name to manually define traversal points. So our method name would be as follows:

List<Person> findByAddress_ZipCode(ZipCode zipCode);

Because we treat the underscore character as a reserved character, we strongly advise following standard Java naming conventions (that is, not using underscores in property names but using camel case instead).

4.4.4. Special parameter handling

To handle parameters in your query, define method parameters as already seen in the preceding examples. Besides that, the infrastructure recognizes certain specific types like Pageable and Sort, to apply pagination and sorting to your queries dynamically. The following example demonstrates these features:

Example 12. Using Pageable, Slice, and Sort in query methods
Page<User> findByLastname(String lastname, Pageable pageable);

Slice<User> findByLastname(String lastname, Pageable pageable);

List<User> findByLastname(String lastname, Sort sort);

List<User> findByLastname(String lastname, Pageable pageable);
APIs taking Sort and Pageable expect non-null values to be handed into methods. If you do not want to apply any sorting or pagination, use Sort.unsorted() and Pageable.unpaged().

The first method lets you pass an org.springframework.data.domain.Pageable instance to the query method to dynamically add paging to your statically defined query. A Page knows about the total number of elements and pages available. It does so by the infrastructure triggering a count query to calculate the overall number. As this might be expensive (depending on the store used), you can instead return a Slice. A Slice knows only about whether a next Slice is available, which might be sufficient when walking through a larger result set.

Sorting options are handled through the Pageable instance, too. If you need only sorting, add an org.springframework.data.domain.Sort parameter to your method. As you can see, returning a List is also possible. In this case, the additional metadata required to build the actual Page instance is not created (which, in turn, means that the additional count query that would have been necessary is not issued). Rather, it restricts the query to look up only the given range of entities.

To find out how many pages you get for an entire query, you have to trigger an additional count query. By default, this query is derived from the query you actually trigger.
Paging and Sorting

You can define simple sorting expressions by using property names. You can concatenate expressions to collect multiple criteria into one expression.

Example 13. Defining sort expressions
Sort sort = Sort.by("firstname").ascending()
  .and(Sort.by("lastname").descending());

For a more type-safe way to define sort expressions, start with the type for which to define the sort expression and use method references to define the properties on which to sort.

Example 14. Defining sort expressions by using the type-safe API
TypedSort<Person> person = Sort.sort(Person.class);

Sort sort = person.by(Person::getFirstname).ascending()
  .and(person.by(Person::getLastname).descending());
TypedSort.by(…) makes use of runtime proxies by (typically) using CGlib, which may interfere with native image compilation when using tools such as Graal VM Native.

If your store implementation supports Querydsl, you can also use the generated metamodel types to define sort expressions:

Example 15. Defining sort expressions by using the Querydsl API
QSort sort = QSort.by(QPerson.firstname.asc())
  .and(QSort.by(QPerson.lastname.desc()));

4.4.5. Limiting Query Results

You can limit the results of query methods by using the first or top keywords, which you can use interchangeably. You can append an optional numeric value to top or first to specify the maximum result size to be returned. If the number is left out, a result size of 1 is assumed. The following example shows how to limit the query size:

Example 16. Limiting the result size of a query with Top and First
User findFirstByOrderByLastnameAsc();

User findTopByOrderByAgeDesc();

Page<User> queryFirst10ByLastname(String lastname, Pageable pageable);

Slice<User> findTop3ByLastname(String lastname, Pageable pageable);

List<User> findFirst10ByLastname(String lastname, Sort sort);

List<User> findTop10ByLastname(String lastname, Pageable pageable);

The limiting expressions also support the Distinct keyword for datastores that support distinct queries. Also, for the queries that limit the result set to one instance, wrapping the result into with the Optional keyword is supported.

If pagination or slicing is applied to a limiting query pagination (and the calculation of the number of available pages), it is applied within the limited result.

Limiting the results in combination with dynamic sorting by using a Sort parameter lets you express query methods for the 'K' smallest as well as for the 'K' biggest elements.

4.4.6. Repository Methods Returning Collections or Iterables

Query methods that return multiple results can use standard Java Iterable, List, and Set. Beyond that, we support returning Spring Data’s Streamable, a custom extension of Iterable, as well as collection types provided by Vavr. Refer to the appendix explaining all possible query method return types.

Using Streamable as Query Method Return Type

You can use Streamable as alternative to Iterable or any collection type. It provides convenience methods to access a non-parallel Stream (missing from Iterable) and the ability to directly ….filter(…) and ….map(…) over the elements and concatenate the Streamable to others:

Example 17. Using Streamable to combine query method results
interface PersonRepository extends Repository<Person, Long> {
  Streamable<Person> findByFirstnameContaining(String firstname);
  Streamable<Person> findByLastnameContaining(String lastname);
}

Streamable<Person> result = repository.findByFirstnameContaining("av")
  .and(repository.findByLastnameContaining("ea"));
Returning Custom Streamable Wrapper Types

Providing dedicated wrapper types for collections is a commonly used pattern to provide an API for a query result that returns multiple elements. Usually, these types are used by invoking a repository method returning a collection-like type and creating an instance of the wrapper type manually. You can avoid that additional step as Spring Data lets you use these wrapper types as query method return types if they meet the following criteria:

  1. The type implements Streamable.

  2. The type exposes either a constructor or a static factory method named of(…) or valueOf(…) that takes Streamable as an argument.

The following listing shows an example:

class Product {                                         (1)
  MonetaryAmount getPrice() { … }
}

@RequiredArgsConstructor(staticName = "of")
class Products implements Streamable<Product> {         (2)

  private final Streamable<Product> streamable;

  public MonetaryAmount getTotal() {                    (3)
    return streamable.stream()
      .map(Priced::getPrice)
      .reduce(Money.of(0), MonetaryAmount::add);
  }


  @Override
  public Iterator<Product> iterator() {                 (4)
    return streamable.iterator();
  }
}

interface ProductRepository implements Repository<Product, Long> {
  Products findAllByDescriptionContaining(String text); (5)
}
1 A Product entity that exposes API to access the product’s price.
2 A wrapper type for a Streamable<Product> that can be constructed by using Products.of(…) (factory method created with the Lombok annotation). A standard constructor taking the Streamable<Product> will do as well.
3 The wrapper type exposes an additional API, calculating new values on the Streamable<Product>.
4 Implement the Streamable interface and delegate to the actual result.
5 That wrapper type Products can be used directly as a query method return type. You do not need to return Streamable<Product> and manually wrap it after the query in the repository client.
Support for Vavr Collections

Vavr is a library that embraces functional programming concepts in Java. It ships with a custom set of collection types that you can use as query method return types, as the following table shows:

Vavr collection type Used Vavr implementation type Valid Java source types

io.vavr.collection.Seq

io.vavr.collection.List

java.util.Iterable

io.vavr.collection.Set

io.vavr.collection.LinkedHashSet

java.util.Iterable

io.vavr.collection.Map

io.vavr.collection.LinkedHashMap

java.util.Map

You can use the types in the first column (or subtypes thereof) as query method return types and get the types in the second column used as implementation type, depending on the Java type of the actual query result (third column). Alternatively, you can declare Traversable (the Vavr Iterable equivalent), and we then derive the implementation class from the actual return value. That is, a java.util.List is turned into a Vavr List or Seq, a java.util.Set becomes a Vavr LinkedHashSet Set, and so on.

4.4.7. Null Handling of Repository Methods

As of Spring Data 2.0, repository CRUD methods that return an individual aggregate instance use Java 8’s Optional to indicate the potential absence of a value. Besides that, Spring Data supports returning the following wrapper types on query methods:

  • com.google.common.base.Optional

  • scala.Option

  • io.vavr.control.Option

Alternatively, query methods can choose not to use a wrapper type at all. The absence of a query result is then indicated by returning null. Repository methods returning collections, collection alternatives, wrappers, and streams are guaranteed never to return null but rather the corresponding empty representation. See “Repository query return types” for details.

Nullability Annotations

You can express nullability constraints for repository methods by using Spring Framework’s nullability annotations. They provide a tooling-friendly approach and opt-in null checks during runtime, as follows:

  • @NonNullApi: Used on the package level to declare that the default behavior for parameters and return values is, respectively, neither to accept nor to produce null values.

  • @NonNull: Used on a parameter or return value that must not be null (not needed on a parameter and return value where @NonNullApi applies).

  • @Nullable: Used on a parameter or return value that can be null.

Spring annotations are meta-annotated with JSR 305 annotations (a dormant but widely used JSR). JSR 305 meta-annotations let tooling vendors (such as IDEA, Eclipse, and Kotlin) provide null-safety support in a generic way, without having to hard-code support for Spring annotations. To enable runtime checking of nullability constraints for query methods, you need to activate non-nullability on the package level by using Spring’s @NonNullApi in package-info.java, as shown in the following example:

Example 18. Declaring Non-nullability in package-info.java
@org.springframework.lang.NonNullApi
package com.acme;

Once non-null defaulting is in place, repository query method invocations get validated at runtime for nullability constraints. If a query result violates the defined constraint, an exception is thrown. This happens when the method would return null but is declared as non-nullable (the default with the annotation defined on the package in which the repository resides). If you want to opt-in to nullable results again, selectively use @Nullable on individual methods. Using the result wrapper types mentioned at the start of this section continues to work as expected: an empty result is translated into the value that represents absence.

The following example shows a number of the techniques just described:

Example 19. Using different nullability constraints
package com.acme;                                                       (1)

import org.springframework.lang.Nullable;

interface UserRepository extends Repository<User, Long> {

  User getByEmailAddress(EmailAddress emailAddress);                    (2)

  @Nullable
  User findByEmailAddress(@Nullable EmailAddress emailAdress);          (3)

  Optional<User> findOptionalByEmailAddress(EmailAddress emailAddress); (4)
}
1 The repository resides in a package (or sub-package) for which we have defined non-null behavior.
2 Throws an EmptyResultDataAccessException when the query does not produce a result. Throws an IllegalArgumentException when the emailAddress handed to the method is null.
3 Returns null when the query does not produce a result. Also accepts null as the value for emailAddress.
4 Returns Optional.empty() when the query does not produce a result. Throws an IllegalArgumentException when the emailAddress handed to the method is null.
Nullability in Kotlin-based Repositories

Kotlin has the definition of nullability constraints baked into the language. Kotlin code compiles to bytecode, which does not express nullability constraints through method signatures but rather through compiled-in metadata. Make sure to include the kotlin-reflect JAR in your project to enable introspection of Kotlin’s nullability constraints. Spring Data repositories use the language mechanism to define those constraints to apply the same runtime checks, as follows:

Example 20. Using nullability constraints on Kotlin repositories
interface UserRepository : Repository<User, String> {

  fun findByUsername(username: String): User     (1)

  fun findByFirstname(firstname: String?): User? (2)
}
1 The method defines both the parameter and the result as non-nullable (the Kotlin default). The Kotlin compiler rejects method invocations that pass null to the method. If the query yields an empty result, an EmptyResultDataAccessException is thrown.
2 This method accepts null for the firstname parameter and returns null if the query does not produce a result.

4.4.8. Streaming Query Results

You can process the results of query methods incrementally by using a Java 8 Stream<T> as the return type. Instead of wrapping the query results in a Stream, data store-specific methods are used to perform the streaming, as shown in the following example:

Example 21. Stream the result of a query with Java 8 Stream<T>
@Query("select u from User u")
Stream<User> findAllByCustomQueryAndStream();

Stream<User> readAllByFirstnameNotNull();

@Query("select u from User u")
Stream<User> streamAllPaged(Pageable pageable);
A Stream potentially wraps underlying data store-specific resources and must, therefore, be closed after usage. You can either manually close the Stream by using the close() method or by using a Java 7 try-with-resources block, as shown in the following example:
Example 22. Working with a Stream<T> result in a try-with-resources block
try (Stream<User> stream = repository.findAllByCustomQueryAndStream()) {
  stream.forEach(…);
}
Not all Spring Data modules currently support Stream<T> as a return type.

4.4.9. Asynchronous Query Results

You can run repository queries asynchronously by using Spring’s asynchronous method running capability. This means the method returns immediately upon invocation while the actual query occurs in a task that has been submitted to a Spring TaskExecutor. Asynchronous queries differ from reactive queries and should not be mixed. See the store-specific documentation for more details on reactive support. The following example shows a number of asynchronous queries:

@Async
Future<User> findByFirstname(String firstname);               (1)

@Async
CompletableFuture<User> findOneByFirstname(String firstname); (2)

@Async
ListenableFuture<User> findOneByLastname(String lastname);    (3)
1 Use java.util.concurrent.Future as the return type.
2 Use a Java 8 java.util.concurrent.CompletableFuture as the return type.
3 Use a org.springframework.util.concurrent.ListenableFuture as the return type.

4.5. Creating Repository Instances

This section covers how to create instances and bean definitions for the defined repository interfaces. One way to do so is by using the Spring namespace that is shipped with each Spring Data module that supports the repository mechanism, although we generally recommend using Java configuration.

4.5.1. XML Configuration

Each Spring Data module includes a repositories element that lets you define a base package that Spring scans for you, as shown in the following example:

Example 23. Enabling Spring Data repositories via XML
<?xml version="1.0" encoding="UTF-8"?>
<beans:beans xmlns:beans="http://www.springframework.org/schema/beans"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns="http://www.springframework.org/schema/data/jpa"
  xsi:schemaLocation="http://www.springframework.org/schema/beans
    https://www.springframework.org/schema/beans/spring-beans.xsd
    http://www.springframework.org/schema/data/jpa
    https://www.springframework.org/schema/data/jpa/spring-jpa.xsd">

  <repositories base-package="com.acme.repositories" />

</beans:beans>

In the preceding example, Spring is instructed to scan com.acme.repositories and all its sub-packages for interfaces extending Repository or one of its sub-interfaces. For each interface found, the infrastructure registers the persistence technology-specific FactoryBean to create the appropriate proxies that handle invocations of the query methods. Each bean is registered under a bean name that is derived from the interface name, so an interface of UserRepository would be registered under userRepository. Bean names for nested repository interfaces are prefixed with their enclosing type name. The base-package attribute allows wildcards so that you can define a pattern of scanned packages.

Using Filters

By default, the infrastructure picks up every interface that extends the persistence technology-specific Repository sub-interface located under the configured base package and creates a bean instance for it. However, you might want more fine-grained control over which interfaces have bean instances created for them. To do so, use <include-filter /> and <exclude-filter /> elements inside the <repositories /> element. The semantics are exactly equivalent to the elements in Spring’s context namespace. For details, see the Spring reference documentation for these elements.

For example, to exclude certain interfaces from instantiation as repository beans, you could use the following configuration:

Example 24. Using exclude-filter element
<repositories base-package="com.acme.repositories">
  <context:exclude-filter type="regex" expression=".*SomeRepository" />
</repositories>

The preceding example excludes all interfaces ending in SomeRepository from being instantiated.

4.5.2. Java Configuration

You can also trigger the repository infrastructure by using a store-specific @Enable${store}Repositories annotation on a Java configuration class. For an introduction to Java-based configuration of the Spring container, see JavaConfig in the Spring reference documentation.

A sample configuration to enable Spring Data repositories resembles the following:

Example 25. Sample annotation-based repository configuration
@Configuration
@EnableJpaRepositories("com.acme.repositories")
class ApplicationConfiguration {

  @Bean
  EntityManagerFactory entityManagerFactory() {
    // …
  }
}
The preceding example uses the JPA-specific annotation, which you would change according to the store module you actually use. The same applies to the definition of the EntityManagerFactory bean. See the sections covering the store-specific configuration.

4.5.3. Standalone Usage

You can also use the repository infrastructure outside of a Spring container — for example, in CDI environments. You still need some Spring libraries in your classpath, but, generally, you can set up repositories programmatically as well. The Spring Data modules that provide repository support ship with a persistence technology-specific RepositoryFactory that you can use, as follows:

Example 26. Standalone usage of the repository factory
RepositoryFactorySupport factory = … // Instantiate factory here
UserRepository repository = factory.getRepository(UserRepository.class);

4.6. Custom Implementations for Spring Data Repositories

Spring Data provides various options to create query methods with little coding. But when those options don’t fit your needs you can also provide your own custom implementation for repository methods. This section describes how to do that.

4.6.1. Customizing Individual Repositories

To enrich a repository with custom functionality, you must first define a fragment interface and an implementation for the custom functionality, as follows:

Example 27. Interface for custom repository functionality
interface CustomizedUserRepository {
  void someCustomMethod(User user);
}
Example 28. Implementation of custom repository functionality
class CustomizedUserRepositoryImpl implements CustomizedUserRepository {

  public void someCustomMethod(User user) {
    // Your custom implementation
  }
}
The most important part of the class name that corresponds to the fragment interface is the Impl postfix.

The implementation itself does not depend on Spring Data and can be a regular Spring bean.Consequently, you can use standard dependency injection behavior to inject references to other beans (such as a JdbcTemplate), take part in aspects, and so on.

Then you can let your repository interface extend the fragment interface, as follows:

Example 29. Changes to your repository interface
interface UserRepository extends CrudRepository<User, Long>, CustomizedUserRepository {

  // Declare query methods here
}

Extending the fragment interface with your repository interface combines the CRUD and custom functionality and makes it available to clients.

Spring Data repositories are implemented by using fragments that form a repository composition. Fragments are the base repository, functional aspects (such as QueryDsl), and custom interfaces along with their implementations. Each time you add an interface to your repository interface, you enhance the composition by adding a fragment. The base repository and repository aspect implementations are provided by each Spring Data module.

The following example shows custom interfaces and their implementations:

Example 30. Fragments with their implementations
interface HumanRepository {
  void someHumanMethod(User user);
}

class HumanRepositoryImpl implements HumanRepository {

  public void someHumanMethod(User user) {
    // Your custom implementation
  }
}

interface ContactRepository {

  void someContactMethod(User user);

  User anotherContactMethod(User user);
}

class ContactRepositoryImpl implements ContactRepository {

  public void someContactMethod(User user) {
    // Your custom implementation
  }

  public User anotherContactMethod(User user) {
    // Your custom implementation
  }
}

The following example shows the interface for a custom repository that extends CrudRepository:

Example 31. Changes to your repository interface
interface UserRepository extends CrudRepository<User, Long>, HumanRepository, ContactRepository {

  // Declare query methods here
}

Repositories may be composed of multiple custom implementations that are imported in the order of their declaration. Custom implementations have a higher priority than the base implementation and repository aspects. This ordering lets you override base repository and aspect methods and resolves ambiguity if two fragments contribute the same method signature. Repository fragments are not limited to use in a single repository interface. Multiple repositories may use a fragment interface, letting you reuse customizations across different repositories.

The following example shows a repository fragment and its implementation:

Example 32. Fragments overriding save(…)
interface CustomizedSave<T> {
  <S extends T> S save(S entity);
}

class CustomizedSaveImpl<T> implements CustomizedSave<T> {

  public <S extends T> S save(S entity) {
    // Your custom implementation
  }
}

The following example shows a repository that uses the preceding repository fragment:

Example 33. Customized repository interfaces
interface UserRepository extends CrudRepository<User, Long>, CustomizedSave<User> {
}

interface PersonRepository extends CrudRepository<Person, Long>, CustomizedSave<Person> {
}
Configuration

If you use namespace configuration, the repository infrastructure tries to autodetect custom implementation fragments by scanning for classes below the package in which it found a repository. These classes need to follow the naming convention of appending the namespace element’s repository-impl-postfix attribute to the fragment interface name. This postfix defaults to Impl. The following example shows a repository that uses the default postfix and a repository that sets a custom value for the postfix:

Example 34. Configuration example
<repositories base-package="com.acme.repository" />

<repositories base-package="com.acme.repository" repository-impl-postfix="MyPostfix" />

The first configuration in the preceding example tries to look up a class called com.acme.repository.CustomizedUserRepositoryImpl to act as a custom repository implementation. The second example tries to look up com.acme.repository.CustomizedUserRepositoryMyPostfix.

Resolution of Ambiguity

If multiple implementations with matching class names are found in different packages, Spring Data uses the bean names to identify which one to use.

Given the following two custom implementations for the CustomizedUserRepository shown earlier, the first implementation is used. Its bean name is customizedUserRepositoryImpl, which matches that of the fragment interface (CustomizedUserRepository) plus the postfix Impl.

Example 35. Resolution of ambiguous implementations
package com.acme.impl.one;

class CustomizedUserRepositoryImpl implements CustomizedUserRepository {

  // Your custom implementation
}
package com.acme.impl.two;

@Component("specialCustomImpl")
class CustomizedUserRepositoryImpl implements CustomizedUserRepository {

  // Your custom implementation
}

If you annotate the UserRepository interface with @Component("specialCustom"), the bean name plus Impl then matches the one defined for the repository implementation in com.acme.impl.two, and it is used instead of the first one.

Manual Wiring

If your custom implementation uses annotation-based configuration and autowiring only, the preceding approach shown works well, because it is treated as any other Spring bean. If your implementation fragment bean needs special wiring, you can declare the bean and name it according to the conventions described in the preceding section. The infrastructure then refers to the manually defined bean definition by name instead of creating one itself. The following example shows how to manually wire a custom implementation:

Example 36. Manual wiring of custom implementations
<repositories base-package="com.acme.repository" />

<beans:bean id="userRepositoryImpl" class="…">
  <!-- further configuration -->
</beans:bean>

4.6.2. Customize the Base Repository

The approach described in the preceding section requires customization of each repository interfaces when you want to customize the base repository behavior so that all repositories are affected. To instead change behavior for all repositories, you can create an implementation that extends the persistence technology-specific repository base class. This class then acts as a custom base class for the repository proxies, as shown in the following example:

Example 37. Custom repository base class
class MyRepositoryImpl<T, ID>
  extends SimpleJpaRepository<T, ID> {

  private final EntityManager entityManager;

  MyRepositoryImpl(JpaEntityInformation entityInformation,
                          EntityManager entityManager) {
    super(entityInformation, entityManager);

    // Keep the EntityManager around to used from the newly introduced methods.
    this.entityManager = entityManager;
  }

  @Transactional
  public <S extends T> S save(S entity) {
    // implementation goes here
  }
}
The class needs to have a constructor of the super class which the store-specific repository factory implementation uses. If the repository base class has multiple constructors, override the one taking an EntityInformation plus a store specific infrastructure object (such as an EntityManager or a template class).

The final step is to make the Spring Data infrastructure aware of the customized repository base class. In Java configuration, you can do so by using the repositoryBaseClass attribute of the @Enable${store}Repositories annotation, as shown in the following example:

Example 38. Configuring a custom repository base class using JavaConfig
@Configuration
@EnableJpaRepositories(repositoryBaseClass = MyRepositoryImpl.class)
class ApplicationConfiguration { … }

A corresponding attribute is available in the XML namespace, as shown in the following example:

Example 39. Configuring a custom repository base class using XML
<repositories base-package="com.acme.repository"
     base-class="….MyRepositoryImpl" />

4.7. Publishing Events from Aggregate Roots

Entities managed by repositories are aggregate roots. In a Domain-Driven Design application, these aggregate roots usually publish domain events. Spring Data provides an annotation called @DomainEvents that you can use on a method of your aggregate root to make that publication as easy as possible, as shown in the following example:

Example 40. Exposing domain events from an aggregate root
class AnAggregateRoot {

    @DomainEvents (1)
    Collection<Object> domainEvents() {
        // … return events you want to get published here
    }

    @AfterDomainEventPublication (2)
    void callbackMethod() {
       // … potentially clean up domain events list
    }
}
1 The method that uses @DomainEvents can return either a single event instance or a collection of events. It must not take any arguments.
2 After all events have been published, we have a method annotated with @AfterDomainEventPublication. You can use it to potentially clean the list of events to be published (among other uses).

The methods are called every time one of a Spring Data repository’s save(…), saveAll(…), delete(…) or deleteAll(…) methods are called.

4.8. Spring Data Extensions

This section documents a set of Spring Data extensions that enable Spring Data usage in a variety of contexts. Currently, most of the integration is targeted towards Spring MVC.

4.8.1. Querydsl Extension

Querydsl is a framework that enables the construction of statically typed SQL-like queries through its fluent API.

Several Spring Data modules offer integration with Querydsl through QuerydslPredicateExecutor, as the following example shows:

Example 41. QuerydslPredicateExecutor interface
public interface QuerydslPredicateExecutor<T> {

  Optional<T> findById(Predicate predicate);  (1)

  Iterable<T> findAll(Predicate predicate);   (2)

  long count(Predicate predicate);            (3)

  boolean exists(Predicate predicate);        (4)

  // … more functionality omitted.
}
1 Finds and returns a single entity matching the Predicate.
2 Finds and returns all entities matching the Predicate.
3 Returns the number of entities matching the Predicate.
4 Returns whether an entity that matches the Predicate exists.

To use the Querydsl support, extend QuerydslPredicateExecutor on your repository interface, as the following example shows:

Example 42. Querydsl integration on repositories
interface UserRepository extends CrudRepository<User, Long>, QuerydslPredicateExecutor<User> {
}

The preceding example lets you write type-safe queries by using Querydsl Predicate instances, as the following example shows:

Predicate predicate = user.firstname.equalsIgnoreCase("dave")
	.and(user.lastname.startsWithIgnoreCase("mathews"));

userRepository.findAll(predicate);

4.8.2. Web support

Spring Data modules that support the repository programming model ship with a variety of web support. The web related components require Spring MVC JARs to be on the classpath. Some of them even provide integration with Spring HATEOAS. In general, the integration support is enabled by using the @EnableSpringDataWebSupport annotation in your JavaConfig configuration class, as the following example shows:

Example 43. Enabling Spring Data web support
@Configuration
@EnableWebMvc
@EnableSpringDataWebSupport
class WebConfiguration {}

The @EnableSpringDataWebSupport annotation registers a few components. We discuss those later in this section. It also detects Spring HATEOAS on the classpath and registers integration components (if present) for it as well.

Alternatively, if you use XML configuration, register either SpringDataWebConfiguration or HateoasAwareSpringDataWebConfiguration as Spring beans, as the following example shows (for SpringDataWebConfiguration):

Example 44. Enabling Spring Data web support in XML
<bean class="org.springframework.data.web.config.SpringDataWebConfiguration" />

<!-- If you use Spring HATEOAS, register this one *instead* of the former -->
<bean class="org.springframework.data.web.config.HateoasAwareSpringDataWebConfiguration" />
Basic Web Support

The configuration shown in the previous section registers a few basic components:

  • A Using the DomainClassConverter Class to let Spring MVC resolve instances of repository-managed domain classes from request parameters or path variables.

  • HandlerMethodArgumentResolver implementations to let Spring MVC resolve Pageable and Sort instances from request parameters.

  • Jackson Modules to de-/serialize types like Point and Distance, or store specific ones, depending on the Spring Data Module used.

Using the DomainClassConverter Class

The DomainClassConverter class lets you use domain types in your Spring MVC controller method signatures directly so that you need not manually lookup the instances through the repository, as the following example shows:

Example 45. A Spring MVC controller using domain types in method signatures
@Controller
@RequestMapping("/users")
class UserController {

  @RequestMapping("/{id}")
  String showUserForm(@PathVariable("id") User user, Model model) {

    model.addAttribute("user", user);
    return "userForm";
  }
}

The method receives a User instance directly, and no further lookup is necessary. The instance can be resolved by letting Spring MVC convert the path variable into the id type of the domain class first and eventually access the instance through calling findById(…) on the repository instance registered for the domain type.

Currently, the repository has to implement CrudRepository to be eligible to be discovered for conversion.
HandlerMethodArgumentResolvers for Pageable and Sort

The configuration snippet shown in the previous section also registers a PageableHandlerMethodArgumentResolver as well as an instance of SortHandlerMethodArgumentResolver. The registration enables Pageable and Sort as valid controller method arguments, as the following example shows:

Example 46. Using Pageable as a controller method argument
@Controller
@RequestMapping("/users")
class UserController {

  private final UserRepository repository;

  UserController(UserRepository repository) {
    this.repository = repository;
  }

  @RequestMapping
  String showUsers(Model model, Pageable pageable) {

    model.addAttribute("users", repository.findAll(pageable));
    return "users";
  }
}

The preceding method signature causes Spring MVC try to derive a Pageable instance from the request parameters by using the following default configuration:

Table 1. Request parameters evaluated for Pageable instances

page

Page you want to retrieve. 0-indexed and defaults to 0.

size

Size of the page you want to retrieve. Defaults to 20.

sort

Properties that should be sorted by in the format property,property(,ASC|DESC)(,IgnoreCase). The default sort direction is case-sensitive ascending. Use multiple sort parameters if you want to switch direction or case sensitivity — for example, ?sort=firstname&sort=lastname,asc&sort=city,ignorecase.

To customize this behavior, register a bean that implements the PageableHandlerMethodArgumentResolverCustomizer interface or the SortHandlerMethodArgumentResolverCustomizer interface, respectively. Its customize() method gets called, letting you change settings, as the following example shows:

@Bean SortHandlerMethodArgumentResolverCustomizer sortCustomizer() {
    return s -> s.setPropertyDelimiter("<-->");
}

If setting the properties of an existing MethodArgumentResolver is not sufficient for your purpose, extend either SpringDataWebConfiguration or the HATEOAS-enabled equivalent, override the pageableResolver() or sortResolver() methods, and import your customized configuration file instead of using the @Enable annotation.

If you need multiple Pageable or Sort instances to be resolved from the request (for multiple tables, for example), you can use Spring’s @Qualifier annotation to distinguish one from another. The request parameters then have to be prefixed with ${qualifier}_. The following example shows the resulting method signature:

String showUsers(Model model,
      @Qualifier("thing1") Pageable first,
      @Qualifier("thing2") Pageable second) { … }

You have to populate thing1_page, thing2_page, and so on.

The default Pageable passed into the method is equivalent to a PageRequest.of(0, 20), but you can customize it by using the @PageableDefault annotation on the Pageable parameter.

Hypermedia Support for Pageables

Spring HATEOAS ships with a representation model class (PagedResources) that allows enriching the content of a Page instance with the necessary Page metadata as well as links to let the clients easily navigate the pages. The conversion of a Page to a PagedResources is done by an implementation of the Spring HATEOAS ResourceAssembler interface, called the PagedResourcesAssembler. The following example shows how to use a PagedResourcesAssembler as a controller method argument:

Example 47. Using a PagedResourcesAssembler as controller method argument
@Controller
class PersonController {

  @Autowired PersonRepository repository;

  @RequestMapping(value = "/persons", method = RequestMethod.GET)
  HttpEntity<PagedResources<Person>> persons(Pageable pageable,
    PagedResourcesAssembler assembler) {

    Page<Person> persons = repository.findAll(pageable);
    return new ResponseEntity<>(assembler.toResources(persons), HttpStatus.OK);
  }
}

Enabling the configuration, as shown in the preceding example, lets the PagedResourcesAssembler be used as a controller method argument. Calling toResources(…) on it has the following effects:

  • The content of the Page becomes the content of the PagedResources instance.

  • The PagedResources object gets a PageMetadata instance attached, and it is populated with information from the Page and the underlying PageRequest.

  • The PagedResources may get prev and next links attached, depending on the page’s state. The links point to the URI to which the method maps. The pagination parameters added to the method match the setup of the PageableHandlerMethodArgumentResolver to make sure the links can be resolved later.

Assume we have 30 Person instances in the database. You can now trigger a request (GET http://localhost:8080/persons) and see output similar to the following:

{ "links" : [ { "rel" : "next",
                "href" : "http://localhost:8080/persons?page=1&size=20" }
  ],
  "content" : [
     … // 20 Person instances rendered here
  ],
  "pageMetadata" : {
    "size" : 20,
    "totalElements" : 30,
    "totalPages" : 2,
    "number" : 0
  }
}

The assembler produced the correct URI and also picked up the default configuration to resolve the parameters into a Pageable for an upcoming request. This means that, if you change that configuration, the links automatically adhere to the change. By default, the assembler points to the controller method it was invoked in, but you can customize that by passing a custom Link to be used as base to build the pagination links, which overloads the PagedResourcesAssembler.toResource(…) method.

Spring Data Jackson Modules

The core module, and some of the store specific ones, ship with a set of Jackson Modules for types, like org.springframework.data.geo.Distance and org.springframework.data.geo.Point, used by the Spring Data domain.
Those Modules are imported once web support is enabled and com.fasterxml.jackson.databind.ObjectMapper is available.

During initialization SpringDataJacksonModules, like the SpringDataJacksonConfiguration, get picked up by the infrastructure, so that the declared com.fasterxml.jackson.databind.Modules are made available to the Jackson ObjectMapper.

Data binding mixins for the following domain types are registered by the common infrastructure.

org.springframework.data.geo.Distance
org.springframework.data.geo.Point
org.springframework.data.geo.Box
org.springframework.data.geo.Circle
org.springframework.data.geo.Polygon

The individual module may provide additional SpringDataJacksonModules.
Please refer to the store specific section for more details.

Web Databinding Support

You can use Spring Data projections (described in [projections]) to bind incoming request payloads by using either JSONPath expressions (requires Jayway JsonPath or XPath expressions (requires XmlBeam), as the following example shows:

Example 48. HTTP payload binding using JSONPath or XPath expressions
@ProjectedPayload
public interface UserPayload {

  @XBRead("//firstname")
  @JsonPath("$..firstname")
  String getFirstname();

  @XBRead("/lastname")
  @JsonPath({ "$.lastname", "$.user.lastname" })
  String getLastname();
}

You can use the type shown in the preceding example as a Spring MVC handler method argument or by using ParameterizedTypeReference on one of methods of the RestTemplate. The preceding method declarations would try to find firstname anywhere in the given document. The lastname XML lookup is performed on the top-level of the incoming document. The JSON variant of that tries a top-level lastname first but also tries lastname nested in a user sub-document if the former does not return a value. That way, changes in the structure of the source document can be mitigated easily without having clients calling the exposed methods (usually a drawback of class-based payload binding).

Nested projections are supported as described in [projections]. If the method returns a complex, non-interface type, a Jackson ObjectMapper is used to map the final value.

For Spring MVC, the necessary converters are registered automatically as soon as @EnableSpringDataWebSupport is active and the required dependencies are available on the classpath. For usage with RestTemplate, register a ProjectingJackson2HttpMessageConverter (JSON) or XmlBeamHttpMessageConverter manually.

For more information, see the web projection example in the canonical Spring Data Examples repository.

Querydsl Web Support

For those stores that have QueryDSL integration, you can derive queries from the attributes contained in a Request query string.

Consider the following query string:

?firstname=Dave&lastname=Matthews

Given the User object from the previous examples, you can resolve a query string to the following value by using the QuerydslPredicateArgumentResolver, as follows:

QUser.user.firstname.eq("Dave").and(QUser.user.lastname.eq("Matthews"))
The feature is automatically enabled, along with @EnableSpringDataWebSupport, when Querydsl is found on the classpath.

Adding a @QuerydslPredicate to the method signature provides a ready-to-use Predicate, which you can run by using the QuerydslPredicateExecutor.

Type information is typically resolved from the method’s return type. Since that information does not necessarily match the domain type, it might be a good idea to use the root attribute of QuerydslPredicate.

The following example shows how to use @QuerydslPredicate in a method signature:

@Controller
class UserController {

  @Autowired UserRepository repository;

  @RequestMapping(value = "/", method = RequestMethod.GET)
  String index(Model model, @QuerydslPredicate(root = User.class) Predicate predicate,    (1)
          Pageable pageable, @RequestParam MultiValueMap<String, String> parameters) {

    model.addAttribute("users", repository.findAll(predicate, pageable));

    return "index";
  }
}
1 Resolve query string arguments to matching Predicate for User.

The default binding is as follows:

  • Object on simple properties as eq.

  • Object on collection like properties as contains.

  • Collection on simple properties as in.

You can customize those bindings through the bindings attribute of @QuerydslPredicate or by making use of Java 8 default methods and adding the QuerydslBinderCustomizer method to the repository interface, as follows:

interface UserRepository extends CrudRepository<User, String>,
                                 QuerydslPredicateExecutor<User>,                (1)
                                 QuerydslBinderCustomizer<QUser> {               (2)

  @Override
  default void customize(QuerydslBindings bindings, QUser user) {

    bindings.bind(user.username).first((path, value) -> path.contains(value))    (3)
    bindings.bind(String.class)
      .first((StringPath path, String value) -> path.containsIgnoreCase(value)); (4)
    bindings.excluding(user.password);                                           (5)
  }
}
1 QuerydslPredicateExecutor provides access to specific finder methods for Predicate.
2 QuerydslBinderCustomizer defined on the repository interface is automatically picked up and shortcuts @QuerydslPredicate(bindings=…​).
3 Define the binding for the username property to be a simple contains binding.
4 Define the default binding for String properties to be a case-insensitive contains match.
5 Exclude the password property from Predicate resolution.
You can register a QuerydslBinderCustomizerDefaults bean holding default Querydsl bindings before applying specific bindings from the repository or @QuerydslPredicate.

4.8.3. Repository Populators

If you work with the Spring JDBC module, you are probably familiar with the support for populating a DataSource with SQL scripts. A similar abstraction is available on the repositories level, although it does not use SQL as the data definition language because it must be store-independent. Thus, the populators support XML (through Spring’s OXM abstraction) and JSON (through Jackson) to define data with which to populate the repositories.

Assume you have a file called data.json with the following content:

Example 49. Data defined in JSON
[ { "_class" : "com.acme.Person",
 "firstname" : "Dave",
  "lastname" : "Matthews" },
  { "_class" : "com.acme.Person",
 "firstname" : "Carter",
  "lastname" : "Beauford" } ]

You can populate your repositories by using the populator elements of the repository namespace provided in Spring Data Commons. To populate the preceding data to your PersonRepository, declare a populator similar to the following:

Example 50. Declaring a Jackson repository populator
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:repository="http://www.springframework.org/schema/data/repository"
  xsi:schemaLocation="http://www.springframework.org/schema/beans
    https://www.springframework.org/schema/beans/spring-beans.xsd
    http://www.springframework.org/schema/data/repository
    https://www.springframework.org/schema/data/repository/spring-repository.xsd">

  <repository:jackson2-populator locations="classpath:data.json" />

</beans>

The preceding declaration causes the data.json file to be read and deserialized by a Jackson ObjectMapper.

The type to which the JSON object is unmarshalled is determined by inspecting the _class attribute of the JSON document. The infrastructure eventually selects the appropriate repository to handle the object that was deserialized.

To instead use XML to define the data the repositories should be populated with, you can use the unmarshaller-populator element. You configure it to use one of the XML marshaller options available in Spring OXM. See the Spring reference documentation for details. The following example shows how to unmarshall a repository populator with JAXB:

Example 51. Declaring an unmarshalling repository populator (using JAXB)
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:repository="http://www.springframework.org/schema/data/repository"
  xmlns:oxm="http://www.springframework.org/schema/oxm"
  xsi:schemaLocation="http://www.springframework.org/schema/beans
    https://www.springframework.org/schema/beans/spring-beans.xsd
    http://www.springframework.org/schema/data/repository
    https://www.springframework.org/schema/data/repository/spring-repository.xsd
    http://www.springframework.org/schema/oxm
    https://www.springframework.org/schema/oxm/spring-oxm.xsd">

  <repository:unmarshaller-populator locations="classpath:data.json"
    unmarshaller-ref="unmarshaller" />

  <oxm:jaxb2-marshaller contextPath="com.acme" />

</beans>

Reference Documentation

5. Elasticsearch Clients

This chapter illustrates configuration and usage of supported Elasticsearch client implementations.

Spring Data Elasticsearch operates upon an Elasticsearch client that is connected to a single Elasticsearch node or a cluster. Although the Elasticsearch Client can be used to work with the cluster, applications using Spring Data Elasticsearch normally use the higher level abstractions of Elasticsearch Operations and Elasticsearch Repositories.

5.1. Transport Client

The TransportClient is deprecated as of Elasticsearch 7 and will be removed in Elasticsearch 8. (see the Elasticsearch documentation). Spring Data Elasticsearch will support the TransportClient as long as it is available in the used Elasticsearch version but has deprecated the classes using it since version 4.0.

We strongly recommend to use the High Level REST Client instead of the TransportClient.

Example 52. Transport Client
@Configuration
public class TransportClientConfig extends ElasticsearchConfigurationSupport {

    @Bean
    public Client elasticsearchClient() throws UnknownHostException {
        Settings settings = Settings.builder().put("cluster.name", "elasticsearch").build();        (1)
        TransportClient client = new PreBuiltTransportClient(settings);
        client.addTransportAddress(new TransportAddress(InetAddress.getByName("127.0.0.1"), 9300)); (2)
        return client;
    }

    @Bean(name = { "elasticsearchOperations", "elasticsearchTemplate" })
    public ElasticsearchTemplate elasticsearchTemplate() throws UnknownHostException {

		ElasticsearchTemplate template = new ElasticsearchTemplate(elasticsearchClient, elasticsearchConverter);
		template.setRefreshPolicy(refreshPolicy());                                                 (3)

		return template;
    }
}

// ...

IndexRequest request = new IndexRequest("spring-data")
 .id(randomID())
 .source(someObject);

IndexResponse response = client.index(request);
1 The TransportClient must be configured with the cluster name.
2 The host and port to connect the client to.
3 the RefreshPolicy must be set in the ElasticsearchTemplate (override refreshPolicy() to not use the default)

5.2. High Level REST Client

The Java High Level REST Client is the default client of Elasticsearch, it provides a straight forward replacement for the TransportClient as it accepts and returns the very same request/response objects and therefore depends on the Elasticsearch core project. Asynchronous calls are operated upon a client managed thread pool and require a callback to be notified when the request is done.

Example 53. High Level REST Client
@Configuration
public class RestClientConfig extends AbstractElasticsearchConfiguration {

    @Override
    @Bean
    public RestHighLevelClient elasticsearchClient() {

        final ClientConfiguration clientConfiguration = ClientConfiguration.builder()  (1)
            .connectedTo("localhost:9200")
            .build();

        return RestClients.create(clientConfiguration).rest();                         (2)
    }
}

// ...

  @Autowired
  RestHighLevelClient highLevelClient;

  RestClient lowLevelClient = highLevelClient.lowLevelClient();                        (3)

// ...

IndexRequest request = new IndexRequest("spring-data")
  .id(randomID())
  .source(singletonMap("feature", "high-level-rest-client"))
  .setRefreshPolicy(IMMEDIATE);

IndexResponse response = highLevelClient.index(request,RequestOptions.DEFAULT);
1 Use the builder to provide cluster addresses, set default HttpHeaders or enable SSL.
2 Create the RestHighLevelClient.
3 It is also possible to obtain the lowLevelRest() client.

5.3. Reactive Client

The ReactiveElasticsearchClient is a non official driver based on WebClient. It uses the request/response objects provided by the Elasticsearch core project. Calls are directly operated on the reactive stack, not wrapping async (thread pool bound) responses into reactive types.

Example 54. Reactive REST Client
@Configuration
public class ReactiveRestClientConfig extends AbstractReactiveElasticsearchConfiguration {

    @Override
    @Bean
    public ReactiveElasticsearchClient reactiveElasticsearchClient() {
        final ClientConfiguration clientConfiguration = ClientConfiguration.builder() (1)
            .connectedTo("localhost:9200") //
            .build();
        return ReactiveRestClients.create(clientConfiguration);

    }
}
// ...

Mono<IndexResponse> response = client.index(request ->

  request.index("spring-data")
    .id(randomID())
    .source(singletonMap("feature", "reactive-client"));
);
1 Use the builder to provide cluster addresses, set default HttpHeaders or enable SSL.
The ReactiveClient response, especially for search operations, is bound to the from (offset) & size (limit) options of the request.

5.4. Client Configuration

Client behaviour can be changed via the ClientConfiguration that allows to set options for SSL, connect and socket timeouts, headers and other parameters.

Example 55. Client Configuration
HttpHeaders httpHeaders = new HttpHeaders();
httpHeaders.add("some-header", "on every request")                      (1)

ClientConfiguration clientConfiguration = ClientConfiguration.builder()
  .connectedTo("localhost:9200", "localhost:9291")                      (2)
  .usingSsl()                                                           (3)
  .withProxy("localhost:8888")                                          (4)
  .withPathPrefix("ela")                                                (5)
  .withConnectTimeout(Duration.ofSeconds(5))                            (6)
  .withSocketTimeout(Duration.ofSeconds(3))                             (7)
  .withDefaultHeaders(defaultHeaders)                                   (8)
  .withBasicAuth(username, password)                                    (9)
  .withHeaders(() -> {                                                  (10)
    HttpHeaders headers = new HttpHeaders();
    headers.add("currentTime", LocalDateTime.now().format(DateTimeFormatter.ISO_LOCAL_DATE_TIME));
    return headers;
  })
  .withWebClientConfigurer(webClient -> {                               (11)
    //...
    return webClient;
  })
  .withHttpClientConfigurer(clientBuilder -> {                          (12)
      //...
      return clientBuilder;
  })
  . // ... other options
  .build();
1 Define default headers, if they need to be customized
2 Use the builder to provide cluster addresses, set default HttpHeaders or enable SSL.
3 Optionally enable SSL.
4 Optionally set a proxy.
5 Optionally set a path prefix, mostly used when different clusters a behind some reverse proxy.
6 Set the connection timeout. Default is 10 sec.
7 Set the socket timeout. Default is 5 sec.
8 Optionally set headers.
9 Add basic authentication.
10 A Supplier<Header> function can be specified which is called every time before a request is sent to Elasticsearch - here, as an example, the current time is written in a header.
11 for reactive setup a function configuring the WebClient
12 for non-reactive setup a function configuring the REST client
Adding a Header supplier as shown in above example allows to inject headers that may change over the time, like authentication JWT tokens. If this is used in the reactive setup, the supplier function must not block!

5.4.1. Elasticsearch 7 compatibility headers

When using Spring Data Elasticsearch 4 - which uses the Elasticsearch 7 client libraries - and accessing an Elasticsearch cluster that is running on version 8, it is necessary to set the compatibility headers see Elasticserach documentation. This should be done using a header supplier like shown above:

ClientConfigurationBuilder configurationBuilder = new ClientConfigurationBuilder();
    configurationBuilder //
		// ...
		.withHeaders(() -> {
			HttpHeaders defaultCompatibilityHeaders = new HttpHeaders();
			defaultCompatibilityHeaders.add("Accept",
                          "application/vnd.elasticsearch+json;compatible-with=7");
			defaultCompatibilityHeaders.add("Content-Type",
                          "application/vnd.elasticsearch+json;compatible-with=7");
			return defaultCompatibilityHeaders;
		});

5.5. Client Logging

To see what is actually sent to and received from the server Request / Response logging on the transport level needs to be turned on as outlined in the snippet below.

Enable transport layer logging
<logger name="org.springframework.data.elasticsearch.client.WIRE" level="trace"/>
The above applies to both the RestHighLevelClient and ReactiveElasticsearchClient when obtained via RestClients respectively ReactiveRestClients, is not available for the TransportClient.

6. Elasticsearch Object Mapping

Spring Data Elasticsearch Object Mapping is the process that maps a Java object - the domain entity - into the JSON representation that is stored in Elasticsearch and back.

Earlier versions of Spring Data Elasticsearch used a Jackson based conversion, Spring Data Elasticsearch 3.2.x introduced the Meta Model Object Mapping. As of version 4.0 only the Meta Object Mapping is used, the Jackson based mapper is not available anymore and the MappingElasticsearchConverter is used.

The main reasons for the removal of the Jackson based mapper are:

  • Custom mappings of fields needed to be done with annotations like @JsonFormat or @JsonInclude. This often caused problems when the same object was used in different JSON based datastores or sent over a JSON based API.

  • Custom field types and formats also need to be stored into the Elasticsearch index mappings. The Jackson based annotations did not fully provide all the information that is necessary to represent the types of Elasticsearch.

  • Fields must be mapped not only when converting from and to entities, but also in query argument, returned data and on other places.

Using the MappingElasticsearchConverter now covers all these cases.

6.1. Meta Model Object Mapping

The Metamodel based approach uses domain type information for reading/writing from/to Elasticsearch. This allows to register Converter instances for specific domain type mapping.

6.1.1. Mapping Annotation Overview

The MappingElasticsearchConverter uses metadata to drive the mapping of objects to documents. The metadata is taken from the entity’s properties which can be annotated.

The following annotations are available:

  • @Document: Applied at the class level to indicate this class is a candidate for mapping to the database. The most important attributes are:

    • indexName: the name of the index to store this entity in. This can contain a SpEL template expression like "log-#{T(java.time.LocalDate).now().toString()}"

    • type: the mapping type. If not set, the lowercased simple name of the class is used. (deprecated since version 4.0)

    • createIndex: flag whether to create an index on repository bootstrapping. Default value is true. See Automatic creation of indices with the corresponding mapping

    • versionType: Configuration of version management. Default value is EXTERNAL.

  • @Id: Applied at the field level to mark the field used for identity purpose.

  • @Transient: By default all fields are mapped to the document when it is stored or retrieved, this annotation excludes the field.

  • @PersistenceConstructor: Marks a given constructor - even a package protected one - to use when instantiating the object from the database. Constructor arguments are mapped by name to the key values in the retrieved Document.

  • @Field: Applied at the field level and defines properties of the field, most of the attributes map to the respective Elasticsearch Mapping definitions (the following list is not complete, check the annotation Javadoc for a complete reference):

    • name: The name of the field as it will be represented in the Elasticsearch document, if not set, the Java field name is used.

    • type: The field type, can be one of Text, Keyword, Long, Integer, Short, Byte, Double, Float, Half_Float, Scaled_Float, Date, Date_Nanos, Boolean, Binary, Integer_Range, Float_Range, Long_Range, Double_Range, Date_Range, Ip_Range, Object, Nested, Ip, TokenCount, Percolator, Flattened, Search_As_You_Type. See Elasticsearch Mapping Types

    • format: One or more built-in date formats, see the next section Date format mapping.

    • pattern: One or more custom date formats, see the next section Date format mapping.

    • store: Flag whether the original field value should be store in Elasticsearch, default value is false.

    • analyzer, searchAnalyzer, normalizer for specifying custom analyzers and normalizer.

  • @GeoPoint: Marks a field as geo_point datatype. Can be omitted if the field is an instance of the GeoPoint class.

The mapping metadata infrastructure is defined in a separate spring-data-commons project that is technology agnostic.

Date format mapping

Properties that derive from TemporalAccessor or are of type java.util.Date must either have a @Field annotation of type FieldType.Date or a custom converter must be registered for this type. This paragraph describes the use of FieldType.Date.

There are two attributes of the @Field annotation that define which date format information is written to the mapping (also see Elasticsearch Built In Formats and Elasticsearch Custom Date Formats)

The format attributes is used to define at least one of the predefined formats. If it is not defined, then a default value of _date_optional_time and epoch_millis is used.

The pattern attribute can be used to add additional custom format strings. If you want to use only custom date formats, you must set the format property to empty {}.

The following table shows the different attributes and the mapping created from their values:

annotation format string in Elasticsearch mapping

@Field(type=FieldType.Date)

"date_optional_time||epoch_millis",

@Field(type=FieldType.Date, format=DateFormat.basic_date)

"basic_date"

@Field(type=FieldType.Date, format={DateFormat.basic_date, DateFormat.basic_time})

"basic_date||basic_time"

@Field(type=FieldType.Date, pattern="dd.MM.uuuu")

"date_optional_time||epoch_millis||dd.MM.uuuu",

@Field(type=FieldType.Date, format={}, pattern="dd.MM.uuuu")

"dd.MM.uuuu"

If you are using a custom date format, you need to use uuuu for the year instead of yyyy. This is due to a change in Elasticsearch 7.
Mapped field names

Without further configuration, Spring Data Elasticsearch will use the property name of an object as field name in Elasticsearch. This can be changed for individual field by using the @Field annotation on that property.

It is also possible to define a FieldNamingStrategy in the configuration of the client (Elasticsearch Clients). If for example a SnakeCaseFieldNamingStrategy is configured, the property sampleProperty of the object would be mapped to sample_property in Elasticsearch. A FieldNamingStrategy applies to all entities; it can be overwritten by setting a specific name with @Field on a property.

6.1.2. Mapping Rules

Type Hints

Mapping uses type hints embedded in the document sent to the server to allow generic type mapping. Those type hints are represented as _class attributes within the document and are written for each aggregate root.

Example 56. Type Hints
public class Person {              (1)

  @Id String id;
  String firstname;
  String lastname;
}
{
  "_class" : "com.example.Person", (1)
  "id" : "cb7bef",
  "firstname" : "Sarah",
  "lastname" : "Connor"
}
1 By default the domain types class name is used for the type hint.

Type hints can be configured to hold custom information. Use the @TypeAlias annotation to do so.

Make sure to add types with @TypeAlias to the initial entity set (AbstractElasticsearchConfiguration#getInitialEntitySet) to already have entity information available when first reading data from the store.
Example 57. Type Hints with Alias
@TypeAlias("human")                (1)
public class Person {

  @Id String id;
  // ...
}
{
  "_class" : "human",              (1)
  "id" : ...
}
1 The configured alias is used when writing the entity.
Type hints will not be written for nested Objects unless the properties type is Object, an interface or the actual value type does not match the properties declaration.
Geospatial Types

Geospatial types like Point & GeoPoint are converted into lat/lon pairs.

Example 58. Geospatial types
public class Address {

  String city, street;
  Point location;
}
{
  "city" : "Los Angeles",
  "street" : "2800 East Observatory Road",
  "location" : { "lat" : 34.118347, "lon" : -118.3026284 }
}
GeoJson Types

Spring Data Elasticsearch supports the GeoJson types by providing an interface GeoJson and implementations for the different geometries. They are mapped to Elasticsearch documents according to the GeoJson specification. The corresponding properties of the entity are specified in the index mappings as geo_shape when the index mappings is written. (check the Elasticsearch documentation as well)

Example 59. GeoJson types
public class Address {

  String city, street;
  GeoJsonPoint location;
}
{
  "city": "Los Angeles",
  "street": "2800 East Observatory Road",
  "location": {
    "type": "Point",
    "coordinates": [-118.3026284, 34.118347]
  }
}

The following GeoJson types are implemented:

  • GeoJsonPoint

  • GeoJsonMultiPoint

  • GeoJsonLineString

  • GeoJsonMultiLineString

  • GeoJsonPolygon

  • GeoJsonMultiPolygon

  • GeoJsonGeometryCollection

Collections

For values inside Collections apply the same mapping rules as for aggregate roots when it comes to type hints and Custom Conversions.

Example 60. Collections
public class Person {

  // ...

  List<Person> friends;

}
{
  // ...

  "friends" : [ { "firstname" : "Kyle", "lastname" : "Reese" } ]
}
Maps

For values inside Maps apply the same mapping rules as for aggregate roots when it comes to type hints and Custom Conversions. However the Map key needs to a String to be processed by Elasticsearch.

Example 61. Collections
public class Person {

  // ...

  Map<String, Address> knownLocations;

}
{
  // ...

  "knownLocations" : {
    "arrivedAt" : {
       "city" : "Los Angeles",
       "street" : "2800 East Observatory Road",
       "location" : { "lat" : 34.118347, "lon" : -118.3026284 }
     }
  }
}

6.1.3. Custom Conversions

Looking at the Configuration from the previous section ElasticsearchCustomConversions allows registering specific rules for mapping domain and simple types.

Example 62. Meta Model Object Mapping Configuration
@Configuration
public class Config extends AbstractElasticsearchConfiguration {

  @Override
  public RestHighLevelClient elasticsearchClient() {
    return RestClients.create(ClientConfiguration.create("localhost:9200")).rest();
  }

  @Bean
  @Override
  public ElasticsearchCustomConversions elasticsearchCustomConversions() {
    return new ElasticsearchCustomConversions(
      Arrays.asList(new AddressToMap(), new MapToAddress()));       (1)
  }

  @WritingConverter                                                 (2)
  static class AddressToMap implements Converter<Address, Map<String, Object>> {

    @Override
    public Map<String, Object> convert(Address source) {

      LinkedHashMap<String, Object> target = new LinkedHashMap<>();
      target.put("ciudad", source.getCity());
      // ...

      return target;
    }
  }

  @ReadingConverter                                                 (3)
  static class MapToAddress implements Converter<Map<String, Object>, Address> {

    @Override
    public Address convert(Map<String, Object> source) {

      // ...
      return address;
    }
  }
}
{
  "ciudad" : "Los Angeles",
  "calle" : "2800 East Observatory Road",
  "localidad" : { "lat" : 34.118347, "lon" : -118.3026284 }
}
1 Add Converter implementations.
2 Set up the Converter used for writing DomainType to Elasticsearch.
3 Set up the Converter used for reading DomainType from search result.

7. Elasticsearch Operations

Spring Data Elasticsearch uses several interfaces to define the operations that can be called against an Elasticsearch index (for a description of the reactive interfaces see Reactive Elasticsearch Operations).

  • IndexOperations defines actions on index level like creating or deleting an index.

  • DocumentOperations defines actions to store, update and retrieve entities based on their id.

  • SearchOperations define the actions to search for multiple entities using queries

  • ElasticsearchOperations combines the DocumentOperations and SearchOperations interfaces.

These interfaces correspond to the structuring of the Elasticsearch API.

The default implementations of the interfaces offer:

  • index management functionality.

  • Read/Write mapping support for domain types.

  • A rich query and criteria api.

  • Resource management and Exception translation.

Index management and automatic creation of indices and mappings.

The IndexOperations interface and the provided implementation which can be obtained from an ElasticsearchOperations instance - for example with a call to operations.indexOps(clazz)- give the user the ability to create indices, put mappings or store template and alias information in the Elasticsearch cluster. Details of the index that will be created can be set by using the @Setting annotation, refer to Index settings for further information.

None of these operations are done automatically by the implementations of IndexOperations or ElasticsearchOperations. It is the user’s responsibility to call the methods.

There is support for automatic creation of indices and writing the mappings when using Spring Data Elasticsearch repositories, see Automatic creation of indices with the corresponding mapping

7.1. ElasticsearchTemplate

Usage of the ElasticsearchTemplate is deprecated as of version 4.0, use ElasticsearchRestTemplate instead.

The ElasticsearchTemplate is an implementation of the ElasticsearchOperations interface using the Transport Client.

Example 63. ElasticsearchTemplate configuration
@Configuration
public class TransportClientConfig extends ElasticsearchConfigurationSupport {

  @Bean
  public Client elasticsearchClient() throws UnknownHostException {                 (1)
    Settings settings = Settings.builder().put("cluster.name", "elasticsearch").build();
    TransportClient client = new PreBuiltTransportClient(settings);
    client.addTransportAddress(new TransportAddress(InetAddress.getByName("127.0.0.1"), 9300));
    return client;
  }

  @Bean(name = {"elasticsearchOperations", "elasticsearchTemplate"})
  public ElasticsearchTemplate elasticsearchTemplate() throws UnknownHostException { (2)
  	return new ElasticsearchTemplate(elasticsearchClient());
  }
}
1 Setting up the Transport Client. Deprecated as of version 4.0.
2 Creating the ElasticsearchTemplate bean, offering both names, elasticsearchOperations and elasticsearchTemplate.

7.2. ElasticsearchRestTemplate

The ElasticsearchRestTemplate is an implementation of the ElasticsearchOperations interface using the High Level REST Client.

Example 64. ElasticsearchRestTemplate configuration
@Configuration
public class RestClientConfig extends AbstractElasticsearchConfiguration {
  @Override
  public RestHighLevelClient elasticsearchClient() {       (1)
    return RestClients.create(ClientConfiguration.localhost()).rest();
  }

  // no special bean creation needed                       (2)
}
1 Setting up the High Level REST Client.
2 The base class AbstractElasticsearchConfiguration already provides the elasticsearchTemplate bean.

7.3. Usage examples

As both ElasticsearchTemplate and ElasticsearchRestTemplate implement the ElasticsearchOperations interface, the code to use them is not different. The example shows how to use an injected ElasticsearchOperations instance in a Spring REST controller. The decision, if this is using the TransportClient or the RestClient is made by providing the corresponding Bean with one of the configurations shown above.

Example 65. ElasticsearchOperations usage
@RestController
@RequestMapping("/")
public class TestController {

  private  ElasticsearchOperations elasticsearchOperations;

  public TestController(ElasticsearchOperations elasticsearchOperations) { (1)
    this.elasticsearchOperations = elasticsearchOperations;
  }

  @PostMapping("/person")
  public String save(@RequestBody Person person) {                         (2)

    IndexQuery indexQuery = new IndexQueryBuilder()
      .withId(person.getId().toString())
      .withObject(person)
      .build();
    String documentId = elasticsearchOperations.index(indexQuery);
    return documentId;
  }

  @GetMapping("/person/{id}")
  public Person findById(@PathVariable("id")  Long id) {                   (3)
    Person person = elasticsearchOperations
      .queryForObject(GetQuery.getById(id.toString()), Person.class);
    return person;
  }
}
1 Let Spring inject the provided ElasticsearchOperations bean in the constructor.
2 Store some entity in the Elasticsearch cluster.
3 Retrieve the entity with a query by id.

To see the full possibilities of ElasticsearchOperations please refer to the API documentation.

7.4. Reactive Elasticsearch Operations

ReactiveElasticsearchOperations is the gateway to executing high level commands against an Elasticsearch cluster using the ReactiveElasticsearchClient.

The ReactiveElasticsearchTemplate is the default implementation of ReactiveElasticsearchOperations.

7.4.1. Reactive Elasticsearch Template

To get started the ReactiveElasticsearchTemplate needs to know about the actual client to work with. Please see Reactive Client for details on the client.

Reactive Template Configuration

The easiest way of setting up the ReactiveElasticsearchTemplate is via AbstractReactiveElasticsearchConfiguration providing dedicated configuration method hooks for base package, the initial entity set etc.

Example 66. The AbstractReactiveElasticsearchConfiguration
@Configuration
public class Config extends AbstractReactiveElasticsearchConfiguration {

  @Bean (1)
  @Override
  public ReactiveElasticsearchClient reactiveElasticsearchClient() {
      // ...
  }
}
1 Configure the client to use. This can be done by ReactiveRestClients or directly via DefaultReactiveElasticsearchClient.
If applicable set default HttpHeaders via the ClientConfiguration of the ReactiveElasticsearchClient. See Client Configuration.
If needed the ReactiveElasticsearchTemplate can be configured with default RefreshPolicy and IndicesOptions that get applied to the related requests by overriding the defaults of refreshPolicy() and indicesOptions().

However one might want to be more in control over the actual components and use a more verbose approach.

Example 67. Configure the ReactiveElasticsearchTemplate
@Configuration
public class Config {

  @Bean (1)
  public ReactiveElasticsearchClient reactiveElasticsearchClient() {
    // ...
  }
  @Bean (2)
  public ElasticsearchConverter elasticsearchConverter() {
    return new MappingElasticsearchConverter(elasticsearchMappingContext());
  }
  @Bean (3)
  public SimpleElasticsearchMappingContext elasticsearchMappingContext() {
    return new SimpleElasticsearchMappingContext();
  }
  @Bean (4)
  public ReactiveElasticsearchOperations reactiveElasticsearchOperations() {
    return new ReactiveElasticsearchTemplate(reactiveElasticsearchClient(), elasticsearchConverter());
  }
}
1 Configure the client to use. This can be done by ReactiveRestClients or directly via DefaultReactiveElasticsearchClient.
2 Set up the ElasticsearchConverter used for domain type mapping utilizing metadata provided by the mapping context.
3 The Elasticsearch specific mapping context for domain type metadata.
4 The actual template based on the client and conversion infrastructure.
Reactive Template Usage

ReactiveElasticsearchTemplate lets you save, find and delete your domain objects and map those objects to documents stored in Elasticsearch.

Consider the following:

Example 68. Use the ReactiveElasticsearchTemplate
@Document(indexName = "marvel")
public class Person {

  private @Id String id;
  private String name;
  private int age;
  // Getter/Setter omitted...
}
template.save(new Person("Bruce Banner", 42))                    (1)
  .doOnNext(System.out::println)
  .flatMap(person -> template.findById(person.id, Person.class)) (2)
  .doOnNext(System.out::println)
  .flatMap(person -> template.delete(person))                    (3)
  .doOnNext(System.out::println)
  .flatMap(id -> template.count(Person.class))                   (4)
  .doOnNext(System.out::println)
  .subscribe(); (5)

The above outputs the following sequence on the console.

> Person(id=QjWCWWcBXiLAnp77ksfR, name=Bruce Banner, age=42)
> Person(id=QjWCWWcBXiLAnp77ksfR, name=Bruce Banner, age=42)
> QjWCWWcBXiLAnp77ksfR
> 0
1 Insert a new Person document into the marvel index under type characters. The id is generated on server side and set into the instance returned.
2 Lookup the Person with matching id in the marvel index under type characters.
3 Delete the Person with matching id, extracted from the given instance, in the marvel index under type characters.
4 Count the total number of documents in the marvel index under type characters.
5 Don’t forget to subscribe().

7.5. Search Result Types

When a document is retrieved with the methods of the DocumentOperations interface, just the found entity will be returned. When searching with the methods of the SearchOperations interface, additional information is available for each entity, for example the score or the sortValues of the found entity.

In order to return this information, each entity is wrapped in a SearchHit object that contains this entity-specific additional information. These SearchHit objects themselves are returned within a SearchHits object which additionally contains informations about the whole search like the maxScore or requested aggregations. The following classes and interfaces are now available:

SearchHit<T>

Contains the following information:

  • Id

  • Score

  • Sort Values

  • Highlight fields

  • Inner hits (this is an embedded SearchHits object containing eventually returned inner hits)

  • The retrieved entity of type <T>

SearchHits<T>

Contains the following information:

  • Number of total hits

  • Total hits relation

  • Maximum score

  • A list of SearchHit<T> objects

  • Returned aggregations

SearchPage<T>

Defines a Spring Data Page that contains a SearchHits<T> element and can be used for paging access using repository methods.

SearchScrollHits<T>

Returned by the low level scroll API functions in ElasticsearchRestTemplate, it enriches a SearchHits<T> with the Elasticsearch scroll id.

SearchHitsIterator<T>

An Iterator returned by the streaming functions of the SearchOperations interface.

7.6. Queries

Almost all of the methods defined in the SearchOperations and ReactiveSearchOperations interface take a Query parameter that defines the query to execute for searching. Query is an interface and Spring Data Elasticsearch provides three implementations: CriteriaQuery, StringQuery and NativeSearchQuery.

7.6.1. CriteriaQuery

CriteriaQuery based queries allow the creation of queries to search for data without knowing the syntax or basics of Elasticsearch queries. They allow the user to build queries by simply chaining and combining Criteria objects that specifiy the criteria the searched documents must fulfill.

when talking about AND or OR when combining criteria keep in mind, that in Elasticsearch AND are converted to a must condition and OR to a should

Criteria and their usage are best explained by example (let’s assume we have a Book entity with a price property):

Example 69. Get books with a given price
Criteria criteria = new Criteria("price").is(42.0);
Query query = new CriteriaQuery(criteria);

Conditions for the same field can be chained, they will be combined with a logical AND:

Example 70. Get books with a given price
Criteria criteria = new Criteria("price").greaterThan(42.0).lessThan(34.0L);
Query query = new CriteriaQuery(criteria);

When chaining Criteria, by default a AND logic is used:

Example 71. Get all persons with first name James and last name Miller:
Criteria criteria = new Criteria("lastname").is("Miller") (1)
  .and("firstname").is("James")                           (2)
Query query = new CriteriaQuery(criteria);
1 the first Criteria
2 the and() creates a new Criteria and chaines it to the first one.

If you want to create nested queries, you need to use subqueries for this. Let’s assume we want to find all persons with a last name of Miller and a first name of either Jack or John:

Example 72. Nested subqueries
Criteria miller = new Criteria("lastName").is("Miller")  (1)
  .subCriteria(                                          (2)
    new Criteria().or("firstName").is("John")            (3)
      .or("firstName").is("Jack")                        (4)
  );
Query query = new CriteriaQuery(criteria);
1 create a first Criteria for the last name
2 this is combined with AND to a subCriteria
3 This sub Criteria is an OR combination for the first name John
4 and the first name Jack

Please refer to the API documentation of the Criteria class for a complete overview of the different available operations.

7.6.2. StringQuery

This class takes an Elasticsearch query as JSON String. The following code shows a query that searches for persons having the first name "Jack":

Query query = new SearchQuery("{ \"match\": { \"firstname\": { \"query\": \"Jack\" } } } ");
SearchHits<Person> searchHits = operations.search(query, Person.class);

Using StringQuery may be appropriate if you already have an Elasticsearch query to use.

7.6.3. NativeSearchQuery

NativeSearchQuery is the class to use when you have a complex query, or a query that cannot be expressed by using the Criteria API, for example when building queries and using aggregates. It allows to use all the different QueryBuilder implementations from the Elasticsearch library therefore named "native".

The following code shows how to search for persons with a given firstname and for the found documents have a terms aggregation that counts the number of occurences of the lastnames for these persons:

Query query = new NativeSearchQueryBuilder()
    .addAggregation(terms("lastnames").field("lastname").size(10)) //
    .withQuery(QueryBuilders.matchQuery("firstname", firstName))
    .build();

SearchHits<Person> searchHits = operations.search(query, Person.class);

8. Elasticsearch Repositories

This chapter includes details of the Elasticsearch repository implementation.

Example 73. The sample Book entity
@Document(indexName="books")
class Book {
    @Id
    private String id;

    @Field(type = FieldType.text)
    private String name;

    @Field(type = FieldType.text)
    private String summary;

    @Field(type = FieldType.Integer)
    private Integer price;

	// getter/setter ...
}

8.1. Automatic creation of indices with the corresponding mapping

The @Document annotation has an argument createIndex. If this argument is set to true - which is the default value - Spring Data Elasticsearch will during bootstrapping the repository support on application startup check if the index defined by the @Document annotation exists.

If it does not exist, the index will be created and the mappings derived from the entity’s annotations (see Elasticsearch Object Mapping) will be written to the newly created index. Details of the index that will be created can be set by using the @Setting annotation, refer to Index settings for further information.

8.2. Query methods

8.2.1. Query lookup strategies

The Elasticsearch module supports all basic query building feature as string queries, native search queries, criteria based queries or have it being derived from the method name.

Declared queries

Deriving the query from the method name is not always sufficient and/or may result in unreadable method names. In this case one might make use of the @Query annotation (see Using @Query Annotation ).

8.2.2. Query creation

Generally the query creation mechanism for Elasticsearch works as described in Query Methods. Here’s a short example of what a Elasticsearch query method translates into:

Example 74. Query creation from method names
interface BookRepository extends Repository<Book, String> {
  List<Book> findByNameAndPrice(String name, Integer price);
}

The method name above will be translated into the following Elasticsearch json query

{
    "query": {
        "bool" : {
            "must" : [
                { "query_string" : { "query" : "?", "fields" : [ "name" ] } },
                { "query_string" : { "query" : "?", "fields" : [ "price" ] } }
            ]
        }
    }
}

A list of supported keywords for Elasticsearch is shown below.

Table 2. Supported keywords inside method names
Keyword Sample Elasticsearch Query String

And

findByNameAndPrice

{ "query" : { "bool" : { "must" : [ { "query_string" : { "query" : "?", "fields" : [ "name" ] } }, { "query_string" : { "query" : "?", "fields" : [ "price" ] } } ] } }}

Or

findByNameOrPrice

{ "query" : { "bool" : { "should" : [ { "query_string" : { "query" : "?", "fields" : [ "name" ] } }, { "query_string" : { "query" : "?", "fields" : [ "price" ] } } ] } }}

Is

findByName

{ "query" : { "bool" : { "must" : [ { "query_string" : { "query" : "?", "fields" : [ "name" ] } } ] } }}

Not

findByNameNot

{ "query" : { "bool" : { "must_not" : [ { "query_string" : { "query" : "?", "fields" : [ "name" ] } } ] } }}

Between

findByPriceBetween

{ "query" : { "bool" : { "must" : [ {"range" : {"price" : {"from" : ?, "to" : ?, "include_lower" : true, "include_upper" : true } } } ] } }}

LessThan

findByPriceLessThan

{ "query" : { "bool" : { "must" : [ {"range" : {"price" : {"from" : null, "to" : ?, "include_lower" : true, "include_upper" : false } } } ] } }}

LessThanEqual

findByPriceLessThanEqual

{ "query" : { "bool" : { "must" : [ {"range" : {"price" : {"from" : null, "to" : ?, "include_lower" : true, "include_upper" : true } } } ] } }}

GreaterThan

findByPriceGreaterThan

{ "query" : { "bool" : { "must" : [ {"range" : {"price" : {"from" : ?, "to" : null, "include_lower" : false, "include_upper" : true } } } ] } }}

GreaterThanEqual

findByPriceGreaterThan

{ "query" : { "bool" : { "must" : [ {"range" : {"price" : {"from" : ?, "to" : null, "include_lower" : true, "include_upper" : true } } } ] } }}

Before

findByPriceBefore

{ "query" : { "bool" : { "must" : [ {"range" : {"price" : {"from" : null, "to" : ?, "include_lower" : true, "include_upper" : true } } } ] } }}

After

findByPriceAfter

{ "query" : { "bool" : { "must" : [ {"range" : {"price" : {"from" : ?, "to" : null, "include_lower" : true, "include_upper" : true } } } ] } }}

Like

findByNameLike

{ "query" : { "bool" : { "must" : [ { "query_string" : { "query" : "?*", "fields" : [ "name" ] }, "analyze_wildcard": true } ] } }}

StartingWith

findByNameStartingWith

{ "query" : { "bool" : { "must" : [ { "query_string" : { "query" : "?*", "fields" : [ "name" ] }, "analyze_wildcard": true } ] } }}

EndingWith

findByNameEndingWith

{ "query" : { "bool" : { "must" : [ { "query_string" : { "query" : "*?", "fields" : [ "name" ] }, "analyze_wildcard": true } ] } }}

Contains/Containing

findByNameContaining

{ "query" : { "bool" : { "must" : [ { "query_string" : { "query" : "*?*", "fields" : [ "name" ] }, "analyze_wildcard": true } ] } }}

In (when annotated as FieldType.Keyword)

findByNameIn(Collection<String>names)

{ "query" : { "bool" : { "must" : [ {"bool" : {"must" : [ {"terms" : {"name" : ["?","?"]}} ] } } ] } }}

In

findByNameIn(Collection<String>names)

{ "query": {"bool": {"must": [{"query_string":{"query": "\"?\" \"?\"", "fields": ["name"]}}]}}}

NotIn (when annotated as FieldType.Keyword)

findByNameNotIn(Collection<String>names)

{ "query" : { "bool" : { "must" : [ {"bool" : {"must_not" : [ {"terms" : {"name" : ["?","?"]}} ] } } ] } }}

NotIn

findByNameNotIn(Collection<String>names)

{"query": {"bool": {"must": [{"query_string": {"query": "NOT(\"?\" \"?\")", "fields": ["name"]}}]}}}

Near

findByStoreNear

Not Supported Yet !

True

findByAvailableTrue

{ "query" : { "bool" : { "must" : [ { "query_string" : { "query" : "true", "fields" : [ "available" ] } } ] } }}

False

findByAvailableFalse

{ "query" : { "bool" : { "must" : [ { "query_string" : { "query" : "false", "fields" : [ "available" ] } } ] } }}

OrderBy

findByAvailableTrueOrderByNameDesc

{ "query" : { "bool" : { "must" : [ { "query_string" : { "query" : "true", "fields" : [ "available" ] } } ] } }, "sort":[{"name":{"order":"desc"}}] }

Methods names to build Geo-shape queries taking GeoJson parameters are not supported. Use ElasticsearchOperations with CriteriaQuery in a custom repository implementation if you need to have such a function in a repository.

8.2.3. Method return types

Repository methods can be defined to have the following return types for returning multiple Elements:

  • List<T>

  • Stream<T>

  • SearchHits<T>

  • List<SearchHit<T>>

  • Stream<SearchHit<T>>

  • SearchPage<T>

8.2.4. Using @Query Annotation

Example 75. Declare query at the method using the @Query annotation.
interface BookRepository extends ElasticsearchRepository<Book, String> {
    @Query("{\"match\": {\"name\": {\"query\": \"?0\"}}}")
    Page<Book> findByName(String name,Pageable pageable);
}

The String that is set as the annotation argument must be a valid Elasticsearch JSON query. It will be sent to Easticsearch as value of the query element; if for example the function is called with the parameter John, it would produce the following query body:

{
  "query": {
    "match": {
      "name": {
        "query": "John"
      }
    }
  }
}

8.3. Reactive Elasticsearch Repositories

Reactive Elasticsearch repository support builds on the core repository support explained in Working with Spring Data Repositories utilizing operations provided via Reactive Elasticsearch Operations executed by a Reactive Client.

Spring Data Elasticsearch reactive repository support uses Project Reactor as its reactive composition library of choice.

There are 3 main interfaces to be used:

  • ReactiveRepository

  • ReactiveCrudRepository

  • ReactiveSortingRepository

8.3.1. Usage

To access domain objects stored in a Elasticsearch using a Repository, just create an interface for it. Before you can actually go on and do that you will need an entity.

Example 76. Sample Person entity
public class Person {

  @Id
  private String id;
  private String firstname;
  private String lastname;
  private Address address;

  // … getters and setters omitted
}
Please note that the id property needs to be of type String.
Example 77. Basic repository interface to persist Person entities
interface ReactivePersonRepository extends ReactiveSortingRepository<Person, String> {

  Flux<Person> findByFirstname(String firstname);                                   (1)

  Flux<Person> findByFirstname(Publisher<String> firstname);                        (2)

  Flux<Person> findByFirstnameOrderByLastname(String firstname);                    (3)

  Flux<Person> findByFirstname(String firstname, Sort sort);                        (4)

  Flux<Person> findByFirstname(String firstname, Pageable page);                    (5)

  Mono<Person> findByFirstnameAndLastname(String firstname, String lastname);       (6)

  Mono<Person> findFirstByLastname(String lastname);                                (7)

  @Query("{ \"bool\" : { \"must\" : { \"term\" : { \"lastname\" : \"?0\" } } } }")
  Flux<Person> findByLastname(String lastname);                                     (8)

  Mono<Long> countByFirstname(String firstname)                                     (9)

  Mono<Boolean> existsByFirstname(String firstname)                                 (10)

  Mono<Long> deleteByFirstname(String firstname)                                    (11)
}
1 The method shows a query for all people with the given lastname.
2 Finder method awaiting input from Publisher to bind parameter value for firstname.
3 Finder method ordering matching documents by lastname.
4 Finder method ordering matching documents by the expression defined via the Sort parameter.
5 Use Pageable to pass offset and sorting parameters to the database.
6 Finder method concating criteria using And / Or keywords.
7 Find the first matching entity.
8 The method shows a query for all people with the given lastname looked up by running the annotated @Query with given parameters.
9 Count all entities with matching firstname.
10 Check if at least one entity with matching firstname exists.
11 Delete all entites with matching firstname.

8.3.2. Configuration

For Java configuration, use the @EnableReactiveElasticsearchRepositories annotation. If no base package is configured, the infrastructure scans the package of the annotated configuration class.

The following listing shows how to use Java configuration for a repository:

Example 78. Java configuration for repositories
@Configuration
@EnableReactiveElasticsearchRepositories
public class Config extends AbstractReactiveElasticsearchConfiguration {

  @Override
  public ReactiveElasticsearchClient reactiveElasticsearchClient() {
    return ReactiveRestClients.create(ClientConfiguration.localhost());
  }
}

Because the repository from the previous example extends ReactiveSortingRepository, all CRUD operations are available as well as methods for sorted access to the entities. Working with the repository instance is a matter of dependency injecting it into a client, as the following example shows:

Example 79. Sorted access to Person entities
public class PersonRepositoryTests {

  @Autowired ReactivePersonRepository repository;

  @Test
  public void sortsElementsCorrectly() {

    Flux<Person> persons = repository.findAll(Sort.by(new Order(ASC, "lastname")));

    // ...
  }
}

8.4. Annotations for repository methods

8.4.1. @Highlight

The @Highlight annotation on a repository method defines for which fields of the returned entity highlighting should be included. To search for some text in a Book 's name or summary and have the found data highlighted, the following repository method can be used:

interface BookRepository extends Repository<Book, String> {

    @Highlight(fields = {
        @HighlightField(name = "name"),
        @HighlightField(name = "summary")
    })
    List<SearchHit<Book>> findByNameOrSummary(String text, String summary);
}

It is possible to define multiple fields to be highlighted like above, and both the @Highlight and the @HighlightField annotation can further be customized with a @HighlightParameters annotation. Check the Javadocs for the possible configuration options.

In the search results the highlight data can be retrieved from the SearchHit class.

8.5. Annotation based configuration

The Spring Data Elasticsearch repositories support can be activated using an annotation through JavaConfig.

Example 80. Spring Data Elasticsearch repositories using JavaConfig
@Configuration
@EnableElasticsearchRepositories(                             (1)
  basePackages = "org.springframework.data.elasticsearch.repositories"
  )
static class Config {

  @Bean
  public ElasticsearchOperations elasticsearchTemplate() {    (2)
      // ...
  }
}

class ProductService {

  private ProductRepository repository;                       (3)

  public ProductService(ProductRepository repository) {
    this.repository = repository;
  }

  public Page<Product> findAvailableBookByName(String name, Pageable pageable) {
    return repository.findByAvailableTrueAndNameStartingWith(name, pageable);
  }
}
1 The EnableElasticsearchRepositories annotation activates the Repository support. If no base package is configured, it will use the one of the configuration class it is put on.
2 Provide a Bean named elasticsearchTemplate of type ElasticsearchOperations by using one of the configurations shown in the Elasticsearch Operations chapter.
3 Let Spring inject the Repository bean into your class.

8.6. Elasticsearch Repositories using CDI

The Spring Data Elasticsearch repositories can also be set up using CDI functionality.

Example 81. Spring Data Elasticsearch repositories using CDI
class ElasticsearchTemplateProducer {

  @Produces
  @ApplicationScoped
  public ElasticsearchOperations createElasticsearchTemplate() {
    // ...                               (1)
  }
}

class ProductService {

  private ProductRepository repository;  (2)
  public Page<Product> findAvailableBookByName(String name, Pageable pageable) {
    return repository.findByAvailableTrueAndNameStartingWith(name, pageable);
  }
  @Inject
  public void setRepository(ProductRepository repository) {
    this.repository = repository;
  }
}
1 Create a component by using the same calls as are used in the Elasticsearch Operations chapter.
2 Let the CDI framework inject the Repository into your class.

8.7. Spring Namespace

The Spring Data Elasticsearch module contains a custom namespace allowing definition of repository beans as well as elements for instantiating a ElasticsearchServer .

Using the repositories element looks up Spring Data repositories as described in Creating Repository Instances .

Example 82. Setting up Elasticsearch repositories using Namespace
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:elasticsearch="http://www.springframework.org/schema/data/elasticsearch"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
       https://www.springframework.org/schema/beans/spring-beans-3.1.xsd
       http://www.springframework.org/schema/data/elasticsearch
       https://www.springframework.org/schema/data/elasticsearch/spring-elasticsearch-1.0.xsd">

  <elasticsearch:repositories base-package="com.acme.repositories" />

</beans>

Using the Transport Client or Rest Client element registers an instance of Elasticsearch Server in the context.

Example 83. Transport Client using Namespace
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:elasticsearch="http://www.springframework.org/schema/data/elasticsearch"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
       https://www.springframework.org/schema/beans/spring-beans-3.1.xsd
       http://www.springframework.org/schema/data/elasticsearch
       https://www.springframework.org/schema/data/elasticsearch/spring-elasticsearch-1.0.xsd">

  <elasticsearch:transport-client id="client" cluster-nodes="localhost:9300,someip:9300" />

</beans>
Example 84. Rest Client using Namespace
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:elasticsearch="http://www.springframework.org/schema/data/elasticsearch"
       xsi:schemaLocation="http://www.springframework.org/schema/data/elasticsearch
       https://www.springframework.org/schema/data/elasticsearch/spring-elasticsearch.xsd
       http://www.springframework.org/schema/beans
       https://www.springframework.org/schema/beans/spring-beans.xsd">

  <elasticsearch:rest-client id="restClient" hosts="http://localhost:9200">

</beans>

9. Auditing

9.1. Basics

Spring Data provides sophisticated support to transparently keep track of who created or changed an entity and when the change happened. To benefit from that functionality, you have to equip your entity classes with auditing metadata that can be defined either using annotations or by implementing an interface. Additionally, auditing has to be enabled either through Annotation configuration or XML configuration to register the required infrastructure components. Please refer to the store-specific section for configuration samples.

Applications that only track creation and modification dates do not need to specify an AuditorAware.

9.1.1. Annotation-based Auditing Metadata

We provide @CreatedBy and @LastModifiedBy to capture the user who created or modified the entity as well as @CreatedDate and @LastModifiedDate to capture when the change happened.

Example 85. An audited entity
class Customer {

  @CreatedBy
  private User user;

  @CreatedDate
  private Instant createdDate;

  // … further properties omitted
}

As you can see, the annotations can be applied selectively, depending on which information you want to capture. The annotations capturing when changes were made can be used on properties of type Joda-Time, DateTime, legacy Java Date and Calendar, JDK8 date and time types, and long or Long.

Auditing metadata does not necessarily need to live in the root level entity but can be added to an embedded one (depending on the actual store in use), as shown in the snipped below.

Example 86. Audit metadata in embedded entity
class Customer {

  private AuditMetadata auditingMetadata;

  // … further properties omitted
}

class AuditMetadata {

  @CreatedBy
  private User user;

  @CreatedDate
  private Instant createdDate;

}

9.1.2. Interface-based Auditing Metadata

In case you do not want to use annotations to define auditing metadata, you can let your domain class implement the Auditable interface. It exposes setter methods for all of the auditing properties.

9.1.3. AuditorAware

In case you use either @CreatedBy or @LastModifiedBy, the auditing infrastructure somehow needs to become aware of the current principal. To do so, we provide an AuditorAware<T> SPI interface that you have to implement to tell the infrastructure who the current user or system interacting with the application is. The generic type T defines what type the properties annotated with @CreatedBy or @LastModifiedBy have to be.

The following example shows an implementation of the interface that uses Spring Security’s Authentication object:

Example 87. Implementation of AuditorAware based on Spring Security
class SpringSecurityAuditorAware implements AuditorAware<User> {

  @Override
  public Optional<User> getCurrentAuditor() {

    return Optional.ofNullable(SecurityContextHolder.getContext())
            .map(SecurityContext::getAuthentication)
            .filter(Authentication::isAuthenticated)
            .map(Authentication::getPrincipal)
            .map(User.class::cast);
  }
}

The implementation accesses the Authentication object provided by Spring Security and looks up the custom UserDetails instance that you have created in your UserDetailsService implementation. We assume here that you are exposing the domain user through the UserDetails implementation but that, based on the Authentication found, you could also look it up from anywhere.

9.1.4. ReactiveAuditorAware

When using reactive infrastructure you might want to make use of contextual information to provide @CreatedBy or @LastModifiedBy information. We provide an ReactiveAuditorAware<T> SPI interface that you have to implement to tell the infrastructure who the current user or system interacting with the application is. The generic type T defines what type the properties annotated with @CreatedBy or @LastModifiedBy have to be.

The following example shows an implementation of the interface that uses reactive Spring Security’s Authentication object:

Example 88. Implementation of ReactiveAuditorAware based on Spring Security
class SpringSecurityAuditorAware implements ReactiveAuditorAware<User> {

  @Override
  public Mono<User> getCurrentAuditor() {

    return ReactiveSecurityContextHolder.getContext()
                .map(SecurityContext::getAuthentication)
                .filter(Authentication::isAuthenticated)
                .map(Authentication::getPrincipal)
                .map(User.class::cast);
  }
}

The implementation accesses the Authentication object provided by Spring Security and looks up the custom UserDetails instance that you have created in your UserDetailsService implementation. We assume here that you are exposing the domain user through the UserDetails implementation but that, based on the Authentication found, you could also look it up from anywhere.

9.2. Elasticsearch Auditing

9.2.1. Preparing entities

In order for the auditing code to be able to decide whether an entity instance is new, the entity must implement the Persistable<ID> interface which is defined as follows:

package org.springframework.data.domain;

import org.springframework.lang.Nullable;

public interface Persistable<ID> {
    @Nullable
    ID getId();

    boolean isNew();
}

As the existence of an Id is not a sufficient criterion to determine if an enitity is new in Elasticsearch, additional information is necessary. One way is to use the creation-relevant auditing fields for this decision:

A Person entity might look as follows - omitting getter and setter methods for brevity:

@Document(indexName = "person")
public class Person implements Persistable<Long> {
    @Id private Long id;
    private String lastName;
    private String firstName;
    @CreatedDate
    @Field(type = FieldType.Date, format = DateFormat.basic_date_time)
    private Instant createdDate;
    @CreatedBy
    private String createdBy
    @Field(type = FieldType.Date, format = DateFormat.basic_date_time)
    @LastModifiedDate
    private Instant lastModifiedDate;
    @LastModifiedBy
    private String lastModifiedBy;

    public Long getId() {                                                 (1)
        return id;
    }

    @Override
    public boolean isNew() {
        return id == null || (createdDate == null && createdBy == null);  (2)
    }
}
1 the getter is the required implementation from the interface
2 an object is new if it either has no id or none of fields containing creation attributes are set.

9.2.2. Activating auditing

After the entities have been set up and providing the AuditorAware - or ReactiveAuditorAware - the Auditing must be activated by setting the @EnableElasticsearchAuditing on a configuration class:

@Configuration
@EnableElasticsearchRepositories
@EnableElasticsearchAuditing
class MyConfiguration {
   // configuration code
}

When using the reactive stack this must be:

@Configuration
@EnableReactiveElasticsearchRepositories
@EnableReactiveElasticsearchAuditing
class MyConfiguration {
   // configuration code
}

If your code contains more than one AuditorAware bean for different types, you must provide the name of the bean to use as an argument to the auditorAwareRef parameter of the @EnableElasticsearchAuditing annotation.

10. Entity Callbacks

The Spring Data infrastructure provides hooks for modifying an entity before and after certain methods are invoked. Those so called EntityCallback instances provide a convenient way to check and potentially modify an entity in a callback fashioned style.
An EntityCallback looks pretty much like a specialized ApplicationListener. Some Spring Data modules publish store specific events (such as BeforeSaveEvent) that allow modifying the given entity. In some cases, such as when working with immutable types, these events can cause trouble. Also, event publishing relies on ApplicationEventMulticaster. If configuring that with an asynchronous TaskExecutor it can lead to unpredictable outcomes, as event processing can be forked onto a Thread.

Entity callbacks provide integration points with both synchronous and reactive APIs to guarantee in-order execution at well-defined checkpoints within the processing chain, returning a potentially modified entity or an reactive wrapper type.

Entity callbacks are typically separated by API type. This separation means that a synchronous API considers only synchronous entity callbacks and a reactive implementation considers only reactive entity callbacks.

The Entity Callback API has been introduced with Spring Data Commons 2.2. It is the recommended way of applying entity modifications. Existing store specific ApplicationEvents are still published before the invoking potentially registered EntityCallback instances.

10.1. Implementing Entity Callbacks

An EntityCallback is directly associated with its domain type through its generic type argument. Each Spring Data module typically ships with a set of predefined EntityCallback interfaces covering the entity lifecycle.

Example 89. Anatomy of an EntityCallback
@FunctionalInterface
public interface BeforeSaveCallback<T> extends EntityCallback<T> {

	/**
	 * Entity callback method invoked before a domain object is saved.
	 * Can return either the same or a modified instance.
	 *
	 * @return the domain object to be persisted.
	 */
	T onBeforeSave(T entity <2>, String collection <3>); (1)
}
1 BeforeSaveCallback specific method to be called before an entity is saved. Returns a potentially modifed instance.
2 The entity right before persisting.
3 A number of store specific arguments like the collection the entity is persisted to.
Example 90. Anatomy of a reactive EntityCallback
@FunctionalInterface
public interface ReactiveBeforeSaveCallback<T> extends EntityCallback<T> {

	/**
	 * Entity callback method invoked on subscription, before a domain object is saved.
	 * The returned Publisher can emit either the same or a modified instance.
	 *
	 * @return Publisher emitting the domain object to be persisted.
	 */
	Publisher<T> onBeforeSave(T entity <2>, String collection <3>); (1)
}
1 BeforeSaveCallback specific method to be called on subscription, before an entity is saved. Emits a potentially modifed instance.
2 The entity right before persisting.
3 A number of store specific arguments like the collection the entity is persisted to.
Optional entity callback parameters are defined by the implementing Spring Data module and inferred from call site of EntityCallback.callback().

Implement the interface suiting your application needs like shown in the example below:

Example 91. Example BeforeSaveCallback
class DefaultingEntityCallback implements BeforeSaveCallback<Person>, Ordered {      (2)

	@Override
	public Object onBeforeSave(Person entity, String collection) {                   (1)

		if(collection == "user") {
		    return // ...
		}

		return // ...
	}

	@Override
	public int getOrder() {
		return 100;                                                                  (2)
	}
}
1 Callback implementation according to your requirements.
2 Potentially order the entity callback if multiple ones for the same domain type exist. Ordering follows lowest precedence.

10.2. Registering Entity Callbacks

EntityCallback beans are picked up by the store specific implementations in case they are registered in the ApplicationContext. Most template APIs already implement ApplicationContextAware and therefore have access to the ApplicationContext

The following example explains a collection of valid entity callback registrations:

Example 92. Example EntityCallback Bean registration
@Order(1)                                                           (1)
@Component
class First implements BeforeSaveCallback<Person> {

	@Override
	public Person onBeforeSave(Person person) {
		return // ...
	}
}

@Component
class DefaultingEntityCallback implements BeforeSaveCallback<Person>,
                                                           Ordered { (2)

	@Override
	public Object onBeforeSave(Person entity, String collection) {
		// ...
	}

	@Override
	public int getOrder() {
		return 100;                                                  (2)
	}
}

@Configuration
public class EntityCallbackConfiguration {

    @Bean
    BeforeSaveCallback<Person> unorderedLambdaReceiverCallback() {   (3)
        return (BeforeSaveCallback<Person>) it -> // ...
    }
}

@Component
class UserCallbacks implements BeforeConvertCallback<User>,
                                        BeforeSaveCallback<User> {   (4)

	@Override
	public Person onBeforeConvert(User user) {
		return // ...
	}

	@Override
	public Person onBeforeSave(User user) {
		return // ...
	}
}
1 BeforeSaveCallback receiving its order from the @Order annotation.
2 BeforeSaveCallback receiving its order via the Ordered interface implementation.
3 BeforeSaveCallback using a lambda expression. Unordered by default and invoked last. Note that callbacks implemented by a lambda expression do not expose typing information hence invoking these with a non-assignable entity affects the callback throughput. Use a class or enum to enable type filtering for the callback bean.
4 Combine multiple entity callback interfaces in a single implementation class.

10.3. Elasticsearch EntityCallbacks

Spring Data Elasticsearch uses the EntityCallback API internally for its auditing support and reacts on the following callbacks:

Table 3. Supported Entity Callbacks
Callback Method Description Order

Reactive/BeforeConvertCallback

onBeforeConvert(T entity, IndexCoordinates index)

Invoked before a domain object is converted to org.springframework.data.elasticsearch.core.document.Document. Can return the entity or a modified entity which then will be converted.

Ordered.LOWEST_PRECEDENCE

Reactive/AfterConvertCallback

onAfterConvert(T entity, Document document, IndexCoordinates indexCoordinates)

Invoked after a domain object is converted from org.springframework.data.elasticsearch.core.document.Document on reading result data from Elasticsearch.

Ordered.LOWEST_PRECEDENCE

Reactive/AuditingEntityCallback

onBeforeConvert(Object entity, IndexCoordinates index)

Marks an auditable entity created or modified

100

Reactive/AfterSaveCallback

T onAfterSave(T entity, IndexCoordinates index)

Invoked after a domain object is saved.

Ordered.LOWEST_PRECEDENCE

11. Join-Type implementation

Spring Data Elasticsearch supports the Join data type for creating the corresponding index mappings and for storing the relevant information.

11.1. Setting up the data

For an entity to be used in a parent child join relationship, it must have a property of type JoinField which must be annotated. Let’s assume a Statement entity where a statement may be a question, an answer, a comment or a vote (a Builder is also shown in this example, it’s not necessary, but later used in the sample code):

@Document(indexName = "statements")
@Routing("routing")                                                                       (1)
public class Statement {
    @Id
    private String id;

    @Field(type = FieldType.Text)
    private String text;

    @Field(type = FieldType.Keyword)
    private String routing;

    @JoinTypeRelations(
        relations =
            {
                @JoinTypeRelation(parent = "question", children = {"answer", "comment"}), (2)
                @JoinTypeRelation(parent = "answer", children = "vote")                   (3)
            }
    )
    private JoinField<String> relation;                                                   (4)

    private Statement() {
    }

    public static StatementBuilder builder() {
        return new StatementBuilder();
    }

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public String getRouting() {
        return routing;
    }

    public void setRouting(Routing routing) {
        this.routing = routing;
    }

    public String getText() {
        return text;
    }

    public void setText(String text) {
        this.text = text;
    }

    public JoinField<String> getRelation() {
        return relation;
    }

    public void setRelation(JoinField<String> relation) {
        this.relation = relation;
    }

    public static final class StatementBuilder {
        private String id;
        private String text;
        private String routing;
        private JoinField<String> relation;

        private StatementBuilder() {
        }

        public StatementBuilder withId(String id) {
            this.id = id;
            return this;
        }

        public StatementBuilder withRouting(String routing) {
            this.routing = routing;
            return this;
        }

        public StatementBuilder withText(String text) {
            this.text = text;
            return this;
        }

        public StatementBuilder withRelation(JoinField<String> relation) {
            this.relation = relation;
            return this;
        }

        public Statement build() {
            Statement statement = new Statement();
            statement.setId(id);
            statement.setRouting(routing);
            statement.setText(text);
            statement.setRelation(relation);
            return statement;
        }
    }
}
1 for routing related info see Routing values
2 a question can have answers and comments
3 an answer can have votes
4 the JoinField property is used to combine the name (question, answer, comment or vote) of the relation with the parent id. The generic type must be the same as the @Id annotated property.

Spring Data Elasticsearch will build the following mapping for this class:

{
  "statements": {
    "mappings": {
      "properties": {
        "_class": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "routing": {
          "type": "keyword"
        },
        "relation": {
          "type": "join",
          "eager_global_ordinals": true,
          "relations": {
            "question": [
              "answer",
              "comment"
            ],
            "answer": "vote"
          }
        },
        "text": {
          "type": "text"
        }
      }
    }
  }
}

11.2. Storing data

Given a repository for this class the following code inserts a question, two answers, a comment and a vote:

void init() {
    repository.deleteAll();

    Statement savedWeather = repository.save(
        Statement.builder()
            .withText("How is the weather?")
            .withRelation(new JoinField<>("question"))                          (1)
            .build());

    Statement sunnyAnswer = repository.save(
        Statement.builder()
            .withText("sunny")
            .withRelation(new JoinField<>("answer", savedWeather.getId()))      (2)
            .build());

    repository.save(
        Statement.builder()
            .withText("rainy")
            .withRelation(new JoinField<>("answer", savedWeather.getId()))      (3)
            .build());

    repository.save(
        Statement.builder()
            .withText("I don't like the rain")
            .withRelation(new JoinField<>("comment", savedWeather.getId()))     (4)
            .build());

    repository.save(
        Statement.builder()
            .withText("+1 for the sun")
            ,withRouting(savedWeather.getId())
            .withRelation(new JoinField<>("vote", sunnyAnswer.getId()))         (5)
            .build());
}
1 create a question statement
2 the first answer to the question
3 the second answer
4 a comment to the question
5 a vote for the first answer, this needs to have the routing set to the weather document, see Routing values.

11.3. Retrieving data

Currently native search queries must be used to query the data, so there is no support from standard repository methods. Custom Implementations for Spring Data Repositories can be used instead.

The following code shows as an example how to retrieve all entries that have a vote (which must be answers, because only answers can have a vote) using an ElasticsearchOperations instance:

SearchHits<Statement> hasVotes() {
    NativeSearchQuery query = new NativeSearchQueryBuilder()
        .withQuery(hasChildQuery("vote", matchAllQuery(), ScoreMode.None))
        .build();

    return operations.search(query, Statement.class);
}

12. Routing values

When Elasticsearch stores a document in an index that has multiple shards, it determines the shard to you use based on the id of the document. Sometimes it is necessary to predefine that multiple documents should be indexed on the same shard (join-types, faster search for related data). For this Elasticsearch offers the possibility to define a routing, which is the value that should be used to calculate the shard from instead of the id.

Spring Data Elasticsearch supports routing definitions on storing and retrieving data in the following ways:

12.1. Routing on join-types

When using join-types (see Join-Type implementation), Spring Data Elasticsearch will automatically use the parent property of the entity’s JoinField property as the value for the routing.

This is correct for all the use-cases where the parent-child relationship has just one level. If it is deeper, like a child-parent-grandparent relationship - like in the above example from voteanswerquestion - then the routing needs to explicitly specified by using the techniques described in the next section (the vote needs the question.id as routing value).

12.2. Custom routing values

To define a custom routing for an entity, Spring Data Elasticsearch provides a @Routing annotation (reusing the Statement class from above):

@Document(indexName = "statements")
@Routing("routing")                  (1)
public class Statement {
    @Id
    private String id;

    @Field(type = FieldType.Text)
    private String text;

    @JoinTypeRelations(
        relations =
            {
                @JoinTypeRelation(parent = "question", children = {"answer", "comment"}),
                @JoinTypeRelation(parent = "answer", children = "vote")
            }
    )
    private JoinField<String> relation;

    @Nullable
    @Field(type = FieldType.Keyword)
    private String routing;          (2)

    // getter/setter...
}
1 This defines "routing" as routing specification
2 a property with the name routing

If the routing specification of the annotation is a plain string and not a SpEL expression, it is interpreted as the name of a property of the entity, in the example it’s the routing property. The value of this property will then be used as the routing value for all requests that use the entity.

It is also possible to us a SpEL expression in the @Document annotation like this:

@Document(indexName = "statements")
@Routing("@myBean.getRouting(#entity)")
public class Statement{
    // all the needed stuff
}

In this case the user needs to provide a bean with the name myBean that has a method String getRouting(Object). To reference the entity "#entity" must be used in the SpEL expression, and the return value must be null or the routing value as a String.

If plain property’s names and SpEL expressions are not enough to customize the routing definitions, it is possible to define provide an implementation of the RoutingResolver interface. This can then be set on the ElasticOperations instance:

RoutingResolver resolver = ...;

ElasticsearchOperations customOperations= operations.withRouting(resolver);

The withRouting() functions return a copy of the original ElasticsearchOperations instance with the customized routing set.

When a routing has been defined on an entity when it is stored in Elasticsearch, the same value must be provided when doing a get or delete operation. For methods that do not use an entity - like get(ID) or delete(ID) - the ElasticsearchOperations.withRouting(RoutingResolver) method can be used like this:

String id = "someId";
String routing = "theRoutingValue";

// get an entity
Statement s = operations
                .withRouting(RoutingResolver.just(routing))       (1)
                .get(id, Statement.class);

// delete an entity
operations.withRouting(RoutingResolver.just(routing)).delete(id);
1 RoutingResolver.just(s) returns a resolver that will just return the given String.

13. Miscellaneous Elasticsearch Operation Support

This chapter covers additional support for Elasticsearch operations that cannot be directly accessed via the repository interface. It is recommended to add those operations as custom implementation as described in Custom Implementations for Spring Data Repositories .

13.1. Index settings

When creating Elasticsearch indices with Spring Data Elasticsearch different index settings can be defined by using the @Setting annotation. The following arguments are available:

  • useServerConfiguration does not send any settings parameters, so the Elasticsearch server configuration determines them.

  • settingPath refers to a JSON file defining the settings that must be resolvable in the classpath

  • shards the number of shards to use, defaults to 1

  • replicas the number of replicas, defaults to 1

  • refreshIntervall, defaults to "1s"

  • indexStoreType, defaults to "fs"

It is as well possible to define index sorting (check the linked Elasticsearch documentation for the possible field types and values):

@Document(indexName = "entities")
@Setting(
  sortFields = { "secondField", "firstField" },                                  (1)
  sortModes = { Setting.SortMode.max, Setting.SortMode.min },                    (2)
  sortOrders = { Setting.SortOrder.desc, Setting.SortOrder.asc },
  sortMissingValues = { Setting.SortMissing._last, Setting.SortMissing._first })
class Entity {
    @Nullable
    @Id private String id;

    @Nullable
    @Field(name = "first_field", type = FieldType.Keyword)
    private String firstField;

    @Nullable @Field(name = "second_field", type = FieldType.Keyword)
    private String secondField;

    // getter and setter...
}
1 when defining sort fields, use the name of the Java property (firstField), not the name that might be defined for Elasticsearch (first_field)
2 sortModes, sortOrders and sortMissingValues are optional, but if they are set, the number of entries must match the number of sortFields elements

13.2. Filter Builder

Filter Builder improves query speed.

private ElasticsearchOperations operations;

IndexCoordinates index = IndexCoordinates.of("sample-index");

SearchQuery searchQuery = new NativeSearchQueryBuilder()
  .withQuery(matchAllQuery())
  .withFilter(boolFilter().must(termFilter("id", documentId)))
  .build();

Page<SampleEntity> sampleEntities = operations.searchForPage(searchQuery, SampleEntity.class, index);

13.3. Using Scroll For Big Result Set

Elasticsearch has a scroll API for getting big result set in chunks. This is internally used by Spring Data Elasticsearch to provide the implementations of the <T> SearchHitsIterator<T> SearchOperations.searchForStream(Query query, Class<T> clazz, IndexCoordinates index) method.

IndexCoordinates index = IndexCoordinates.of("sample-index");

SearchQuery searchQuery = new NativeSearchQueryBuilder()
  .withQuery(matchAllQuery())
  .withFields("message")
  .withPageable(PageRequest.of(0, 10))
  .build();

SearchHitsIterator<SampleEntity> stream = elasticsearchTemplate.searchForStream(searchQuery, SampleEntity.class, index);

List<SampleEntity> sampleEntities = new ArrayList<>();
while (stream.hasNext()) {
  sampleEntities.add(stream.next());
}

stream.close();

There are no methods in the SearchOperations API to access the scroll id, if it should be necessary to access this, the following methods of the ElasticsearchRestTemplate can be used:

@Autowired ElasticsearchRestTemplate template;

IndexCoordinates index = IndexCoordinates.of("sample-index");

SearchQuery searchQuery = new NativeSearchQueryBuilder()
  .withQuery(matchAllQuery())
  .withFields("message")
  .withPageable(PageRequest.of(0, 10))
  .build();

SearchScrollHits<SampleEntity> scroll = template.searchScrollStart(1000, searchQuery, SampleEntity.class, index);

String scrollId = scroll.getScrollId();
List<SampleEntity> sampleEntities = new ArrayList<>();
while (scroll.hasSearchHits()) {
  sampleEntities.addAll(scroll.getSearchHits());
  scrollId = scroll.getScrollId();
  scroll = template.searchScrollContinue(scrollId, 1000, SampleEntity.class);
}
template.searchScrollClear(scrollId);

To use the Scroll API with repository methods, the return type must defined as Stream in the Elasticsearch Repository. The implementation of the method will then use the scroll methods from the ElasticsearchTemplate.

interface SampleEntityRepository extends Repository<SampleEntity, String> {

    Stream<SampleEntity> findBy();

}

13.4. Sort options

In addition to the default sort options described Paging and Sorting Spring Data Elasticsearch has a GeoDistanceOrder class which can be used to have the result of a search operation ordered by geographical distance.

If the class to be retrieved has a GeoPoint property named location, the following Sort would sort the results by distance to the given point:

Sort.by(new GeoDistanceOrder("location", new GeoPoint(48.137154, 11.5761247)))

Appendix

Appendix A: Namespace reference

The <repositories /> Element

The <repositories /> element triggers the setup of the Spring Data repository infrastructure. The most important attribute is base-package, which defines the package to scan for Spring Data repository interfaces. See “XML Configuration”. The following table describes the attributes of the <repositories /> element:

Table 4. Attributes
Name Description

base-package

Defines the package to be scanned for repository interfaces that extend *Repository (the actual interface is determined by the specific Spring Data module) in auto-detection mode. All packages below the configured package are scanned, too. Wildcards are allowed.

repository-impl-postfix

Defines the postfix to autodetect custom repository implementations. Classes whose names end with the configured postfix are considered as candidates. Defaults to Impl.

query-lookup-strategy

Determines the strategy to be used to create finder queries. See “Query Lookup Strategies” for details. Defaults to create-if-not-found.

named-queries-location

Defines the location to search for a Properties file containing externally defined queries.

consider-nested-repositories

Whether nested repository interface definitions should be considered. Defaults to false.

Appendix B: Populators namespace reference

The <populator /> element

The <populator /> element allows to populate the a data store via the Spring Data repository infrastructure.[2]

Table 5. Attributes
Name Description

locations

Where to find the files to read the objects from the repository shall be populated with.

Appendix C: Repository query keywords

Supported query method subject keywords

The following table lists the subject keywords generally supported by the Spring Data repository query derivation mechanism to express the predicate. Consult the store-specific documentation for the exact list of supported keywords, because some keywords listed here might not be supported in a particular store.

Table 6. Query subject keywords
Keyword Description

find…By, read…By, get…By, query…By, search…By, stream…By

General query method returning typically the repository type, a Collection or Streamable subtype or a result wrapper such as Page, GeoResults or any other store-specific result wrapper. Can be used as findBy…, findMyDomainTypeBy… or in combination with additional keywords.

exists…By

Exists projection, returning typically a boolean result.

count…By

Count projection returning a numeric result.

delete…By, remove…By

Delete query method returning either no result (void) or the delete count.

…First<number>…, …Top<number>…

Limit the query results to the first <number> of results. This keyword can occur in any place of the subject between find (and the other keywords) and by.

…Distinct…

Use a distinct query to return only unique results. Consult the store-specific documentation whether that feature is supported. This keyword can occur in any place of the subject between find (and the other keywords) and by.

Supported query method predicate keywords and modifiers

The following table lists the predicate keywords generally supported by the Spring Data repository query derivation mechanism. However, consult the store-specific documentation for the exact list of supported keywords, because some keywords listed here might not be supported in a particular store.

Table 7. Query predicate keywords
Logical keyword Keyword expressions

AND

And

OR

Or

AFTER

After, IsAfter

BEFORE

Before, IsBefore

CONTAINING

Containing, IsContaining, Contains

BETWEEN

Between, IsBetween

ENDING_WITH

EndingWith, IsEndingWith, EndsWith

EXISTS

Exists

FALSE

False, IsFalse

GREATER_THAN

GreaterThan, IsGreaterThan

GREATER_THAN_EQUALS

GreaterThanEqual, IsGreaterThanEqual

IN

In, IsIn

IS

Is, Equals, (or no keyword)

IS_EMPTY

IsEmpty, Empty

IS_NOT_EMPTY

IsNotEmpty, NotEmpty

IS_NOT_NULL

NotNull, IsNotNull

IS_NULL

Null, IsNull

LESS_THAN

LessThan, IsLessThan

LESS_THAN_EQUAL

LessThanEqual, IsLessThanEqual

LIKE

Like, IsLike

NEAR

Near, IsNear

NOT

Not, IsNot

NOT_IN

NotIn, IsNotIn

NOT_LIKE

NotLike, IsNotLike

REGEX

Regex, MatchesRegex, Matches

STARTING_WITH

StartingWith, IsStartingWith, StartsWith

TRUE

True, IsTrue

WITHIN

Within, IsWithin

In addition to filter predicates, the following list of modifiers is supported:

Table 8. Query predicate modifier keywords
Keyword Description

IgnoreCase, IgnoringCase

Used with a predicate keyword for case-insensitive comparison.

AllIgnoreCase, AllIgnoringCase

Ignore case for all suitable properties. Used somewhere in the query method predicate.

OrderBy…

Specify a static sorting order followed by the property path and direction (e. g. OrderByFirstnameAscLastnameDesc).

Appendix D: Repository query return types

Supported Query Return Types

The following table lists the return types generally supported by Spring Data repositories. However, consult the store-specific documentation for the exact list of supported return types, because some types listed here might not be supported in a particular store.

Geospatial types (such as GeoResult, GeoResults, and GeoPage) are available only for data stores that support geospatial queries. Some store modules may define their own result wrapper types.
Table 9. Query return types
Return type Description

void

Denotes no return value.

Primitives

Java primitives.

Wrapper types

Java wrapper types.

T

A unique entity. Expects the query method to return one result at most. If no result is found, null is returned. More than one result triggers an IncorrectResultSizeDataAccessException.

Iterator<T>

An Iterator.

Collection<T>

A Collection.

List<T>

A List.

Optional<T>

A Java 8 or Guava Optional. Expects the query method to return one result at most. If no result is found, Optional.empty() or Optional.absent() is returned. More than one result triggers an IncorrectResultSizeDataAccessException.

Option<T>

Either a Scala or Vavr Option type. Semantically the same behavior as Java 8’s Optional, described earlier.

Stream<T>

A Java 8 Stream.

Streamable<T>

A convenience extension of Iterable that directy exposes methods to stream, map and filter results, concatenate them etc.

Types that implement Streamable and take a Streamable constructor or factory method argument

Types that expose a constructor or ….of(…)/….valueOf(…) factory method taking a Streamable as argument. See Returning Custom Streamable Wrapper Types for details.

Vavr Seq, List, Map, Set

Vavr collection types. See Support for Vavr Collections for details.

Future<T>

A Future. Expects a method to be annotated with @Async and requires Spring’s asynchronous method execution capability to be enabled.

CompletableFuture<T>

A Java 8 CompletableFuture. Expects a method to be annotated with @Async and requires Spring’s asynchronous method execution capability to be enabled.

ListenableFuture

A org.springframework.util.concurrent.ListenableFuture. Expects a method to be annotated with @Async and requires Spring’s asynchronous method execution capability to be enabled.

Slice<T>

A sized chunk of data with an indication of whether there is more data available. Requires a Pageable method parameter.

Page<T>

A Slice with additional information, such as the total number of results. Requires a Pageable method parameter.

GeoResult<T>

A result entry with additional information, such as the distance to a reference location.

GeoResults<T>

A list of GeoResult<T> with additional information, such as the average distance to a reference location.

GeoPage<T>

A Page with GeoResult<T>, such as the average distance to a reference location.

Mono<T>

A Project Reactor Mono emitting zero or one element using reactive repositories. Expects the query method to return one result at most. If no result is found, Mono.empty() is returned. More than one result triggers an IncorrectResultSizeDataAccessException.

Flux<T>

A Project Reactor Flux emitting zero, one, or many elements using reactive repositories. Queries returning Flux can emit also an infinite number of elements.

Single<T>

A RxJava Single emitting a single element using reactive repositories. Expects the query method to return one result at most. If no result is found, Mono.empty() is returned. More than one result triggers an IncorrectResultSizeDataAccessException.

Maybe<T>

A RxJava Maybe emitting zero or one element using reactive repositories. Expects the query method to return one result at most. If no result is found, Mono.empty() is returned. More than one result triggers an IncorrectResultSizeDataAccessException.

Flowable<T>

A RxJava Flowable emitting zero, one, or many elements using reactive repositories. Queries returning Flowable can emit also an infinite number of elements.

Appendix E: Migration Guides

Upgrading from 3.2.x to 4.0.x

This section describes breaking changes from version 3.2.x to 4.0.x and how removed features can be replaced by new introduced features.

Removal of the used Jackson Mapper

One of the changes in version 4.0.x is that Spring Data Elasticsearch does not use the Jackson Mapper anymore to map an entity to the JSON representation needed for Elasticsearch (see Elasticsearch Object Mapping). In version 3.2.x the Jackson Mapper was the default that was used. It was possible to switch to the meta-model based converter (named ElasticsearchEntityMapper) by explicitly configuring it (Meta Model Object Mapping).

In version 4.0.x the meta-model based converter is the only one that is available and does not need to be configured explicitly. If you had a custom configuration to enable the meta-model converter by providing a bean like this:

@Bean
@Override
public EntityMapper entityMapper() {

  ElasticsearchEntityMapper entityMapper = new ElasticsearchEntityMapper(
    elasticsearchMappingContext(), new DefaultConversionService()
  );
  entityMapper.setConversions(elasticsearchCustomConversions());

  return entityMapper;
}

You now have to remove this bean, the ElasticsearchEntityMapper interface has been removed.

Entity configuration

Some users had custom Jackson annotations on the entity class, for example in order to define a custom name for the mapped document in Elasticsearch or to configure date conversions. These are not taken into account anymore. The needed functionality is now provided with Spring Data Elasticsearch’s @Field annotation. Please see Mapping Annotation Overview for detailed information.

Removal of implicit index name from query objects

In 3.2.x the different query classes like IndexQuery or SearchQuery had properties that were taking the index name or index names that they were operating upon. If these were not set, the passed in entity was inspected to retrieve the index name that was set in the @Document annotation.
In 4.0.x the index name(s) must now be provided in an additional parameter of type IndexCoordinates. By separating this, it now is possible to use one query object against different indices.

So for example the following code:

IndexQuery indexQuery = new IndexQueryBuilder()
  .withId(person.getId().toString())
  .withObject(person)
  .build();

String documentId = elasticsearchOperations.index(indexQuery);

must be changed to:

IndexCoordinates indexCoordinates = elasticsearchOperations.getIndexCoordinatesFor(person.getClass());

IndexQuery indexQuery = new IndexQueryBuilder()
  .withId(person.getId().toString())
  .withObject(person)
  .build();

String documentId = elasticsearchOperations.index(indexQuery, indexCoordinates);

To make it easier to work with entities and use the index name that is contained in the entitie’s @Document annotation, new methods have been added like DocumentOperations.save(T entity);

The new Operations interfaces

In version 3.2 there was the ElasticsearchOperations interface that defined all the methods for the ElasticsearchTemplate class. In version 4 the functions have been split into different interfaces, aligning these interfaces with the Elasticsearch API:

  • DocumentOperations are the functions related documents like saving, or deleting

  • SearchOperations contains the functions to search in Elasticsearch

  • IndexOperations define the functions to operate on indexes, like index creation or mappings creation.

ElasticsearchOperations now extends DocumentOperations and SearchOperations and has methods get access to an IndexOperations instance.

All the functions from the ElasticsearchOperations interface in version 3.2 that are now moved to the IndexOperations interface are still available, they are marked as deprecated and have default implementations that delegate to the new implementation:
/**
 * Create an index for given indexName.
 *
 * @param indexName the name of the index
 * @return {@literal true} if the index was created
 * @deprecated since 4.0, use {@link IndexOperations#create()}
 */
@Deprecated
default boolean createIndex(String indexName) {
	return indexOps(IndexCoordinates.of(indexName)).create();
}

Deprecations

Methods and classes

Many functions and classes have been deprecated. These functions still work, but the Javadocs show with what they should be replaced.

Example from ElasticsearchOperations
/*
 * Retrieves an object from an index.
 *
 * @param query the query defining the id of the object to get
 * @param clazz the type of the object to be returned
 * @return the found object
 * @deprecated since 4.0, use {@link #get(String, Class, IndexCoordinates)}
 */
@Deprecated
@Nullable
<T> T queryForObject(GetQuery query, Class<T> clazz);
Elasticsearch deprecations

Since version 7 the Elasticsearch TransportClient is deprecated, it will be removed with Elasticsearch version 8. Spring Data Elasticsearch deprecates the ElasticsearchTemplate class which uses the TransportClient in version 4.0.

Mapping types were removed from Elasticsearch 7, they still exist as deprecated values in the Spring Data @Document annotation and the IndexCoordinates class but they are not used anymore internally.

Removals

  • As already described, the ElasticsearchEntityMapper interface has been removed.

  • The SearchQuery interface has been merged into it’s base interface Query, so it’s occurrences can just be replaced with Query.

  • The method org.springframework.data.elasticsearch.core.ElasticsearchOperations.query(SearchQuery query, ResultsExtractor<T> resultsExtractor); and the org.springframework.data.elasticsearch.core.ResultsExtractor interface have been removed. These could be used to parse the result from Elasticsearch for cases in which the response mapping done with the Jackson based mapper was not enough. Since version 4.0, there are the new Search Result Types to return the information from an Elasticsearch response, so there is no need to expose this low level functionality.

  • The low level methods startScroll, continueScroll and clearScroll have been removed from the ElasticsearchOperations interface. For low level scroll API access, there now are searchScrollStart, searchScrollContinue and searchScrollClear methods on the ElasticsearchRestTemplate class.

Upgrading from 4.0.x to 4.1.x

This section describes breaking changes from version 4.0.x to 4.1.x and how removed features can be replaced by new introduced features.

Deprecations

Definition of the id property

It is possible to define a property of en entity as the id property by naming it either id or document. This behaviour is now deprecated and will produce a warning. PLease us the @Id annotation to mark a property as being the id property.

Index mappings

In the ReactiveElasticsearchClient.Indices interface the updateMapping methods are deprecated in favour of the putMapping methods. They do the same, but putMapping is consistent with the naming in the Elasticsearch API:

Alias handling

In the IndexOperations interface the methods addAlias(AliasQuery), removeAlias(AliasQuery) and queryForAlias() have been deprecated. The new methods alias(AliasAction), getAliases(String…​) and getAliasesForIndex(String…​) offer more functionality and a cleaner API.

Parent-ID

Usage of a parent-id has been removed from Elasticsearch since version 6. We now deprecate the corresponding fields and methods.

Removals

Type mappings

The type mappings parameters of the @Document annotation and the IndexCoordinates object were removed. They had been deprecated in Spring Data Elasticsearch 4.0 and their values weren’t used anymore.

Breaking Changes

Return types of ReactiveElasticsearchClient.Indices methods

The methods in the ReactiveElasticsearchClient.Indices were not used up to now. With the introduction of the ReactiveIndexOperations it became necessary to change some of the return types:

  • the createIndex variants now return a Mono<Boolean> instead of a Mono<Void> to signal successful index creation.

  • the updateMapping variants now return a Mono<Boolean> instead of a Mono<Void> to signal successful mappings storage.

Return types of DocumentOperartions.bulkIndex methods

These methods were returing a List<String> containing the ids of the new indexed records. Now they return a List<IndexedObjectInformation>; these objects contain the id and information about optimistic locking (seq_no and primary_term)

Upgrading from 4.1.x to 4.2.x

This section describes breaking changes from version 4.1.x to 4.2.x and how removed features can be replaced by new introduced features.

Deprecations

@Document parameters

The parameters of the @Document annotation that are relevant for the index settings (useServerConfiguration, shards. replicas, refreshIntervall and indexStoretype) have been moved to the @Setting annotation. Use in @Document is still possible but deprecated.

Removals

The @Score annotation that was used to set the score return value in an entity was deprecated in version 4.0 and has been removed. Scroe values are returned in the SearchHit instances that encapsulate the returned entities.

The org.springframework.data.elasticsearch.ElasticsearchException class has been removed. The remaining usages have been replaced with org.springframework.data.mapping.MappingException and org.springframework.dao.InvalidDataAccessApiUsageException.

The deprecated ScoredPage, ScrolledPage @AggregatedPage and implementations has been removed.

The deprecated GetQuery and DeleteQuery have been removed.

The deprecated find methods from ReactiveSearchOperations and ReactiveDocumentOperations have been removed.

Breaking Changes

RefreshPolicy
Enum package changed

It was possible in 4.1 to configure the refresh policy for the ReactiveElasticsearchTemplate by overriding the method AbstractReactiveElasticsearchConfiguration.refreshPolicy() in a custom configuration class. The return value of this method was an instance of the class org.elasticsearch.action.support.WriteRequest.RefreshPolicy.

Now the configuration must return org.springframework.data.elasticsearch.core.RefreshPolicy. This enum has the same values and triggers the same behaviour as before, so only the import statement has to be adjusted.

Refresh behaviour

ElasticsearchOperations and ReactiveElasticsearchOperations now explicitly use the RefreshPolicy set on the template for write requests if not null. If the refresh policy is null, then nothing special is done, so the cluster defaults are used. ElasticsearchOperations was always using the cluster default before this version.

The provided implementations for ElasticsearchRepository and ReactiveElasticsearchRepository will do an explicit refresh when the refresh policy is null. This is the same behaviour as in previous versions. If a refresh policy is set, then it will be used by the repositories as well.

Refresh configuration

When configuring Spring Data Elasticsearch like described in Elasticsearch Clients by using ElasticsearchConfigurationSupport, AbstractElasticsearchConfiguration or AbstractReactiveElasticsearchConfiguration the refresh policy will be initialized to null. Previously the reactive code initialized this to IMMEDIATE, now reactive and non-reactive code show the same behaviour.

Method return types
delete methods that take a Query

The reactive methods previously returned a Mono<Long> with the number of deleted documents, the non reactive versions were void. They now return a Mono<ByQueryResponse> which contains much more detailed information about the deleted documents and errors that might have occurred.

multiget methods

The implementations of multiget previousl only returned the found entities in a List<T> for non-reactive implementations and in a Flux<T> for reactive implementations. If the request contained ids that were not found, the information that these are missing was not available. The user needed to compare the returned ids to the requested ones to find which ones were missing.

Now the multiget methods return a MultiGetItem for every requested id. This contains information about failures (like non existing indices) and the information if the item existed (then it is contained in the `MultiGetItem) or not. :leveloffset: -1 :leveloffset: -1


1. Out of maintenance
2. see XML Configuration