This version is still in development and is not considered stable yet. For the latest stable version, please use spring-cloud-task 3.1.2! |
Single Step Batch Job Starter
This section goes into how to develop a Spring Batch Job
with a single Step
by using the
starter included in Spring Cloud Task. This starter lets you use configuration
to define an ItemReader
, an ItemWriter
, or a full single-step Spring Batch Job
.
For more about Spring Batch and its capabilities, see the
Spring Batch documentation.
To obtain the starter for Maven, add the following to your build:
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-single-step-batch-job</artifactId>
<version>2.3.0</version>
</dependency>
To obtain the starter for Gradle, add the following to your build:
compile "org.springframework.cloud:spring-cloud-starter-single-step-batch-job:2.3.0"
Defining a Job
You can use the starter to define as little as an ItemReader
or an ItemWriter
or as much as a full Job
.
In this section, we define which properties are required to be defined to configure a
Job
.
Properties
To begin, the starter provides a set of properties that let you configure the basics of a Job with one Step:
Property | Type | Default Value | Description |
---|---|---|---|
|
|
|
The name of the job. |
|
|
|
The name of the step. |
|
|
|
The number of items to be processed per transaction. |
With the above properties configured, you have a job with a single, chunk-based step.
This chunk-based step reads, processes, and writes Map<String, Object>
instances as the
items. However, the step does not yet do anything. You need to configure an ItemReader
, an
optional ItemProcessor
, and an ItemWriter
to give it something to do. To configure one
of these, you can either use properties and configure one of the options that has provided
autoconfiguration or you can configure your own with the standard Spring configuration
mechanisms.
If you configure your own, the input and output types must match the others in the step.
The ItemReader implementations and ItemWriter implementations in this starter all use
a Map<String, Object> as the input and the output item.
|
Autoconfiguration for ItemReader Implementations
This starter provides autoconfiguration for four different ItemReader
implementations:
AmqpItemReader
, FlatFileItemReader
, JdbcCursorItemReader
, and KafkaItemReader
.
In this section, we outline how to configure each of these by using the provided
autoconfiguration.
AmqpItemReader
You can read from a queue or topic with AMQP by using the AmqpItemReader
. The
autoconfiguration for this ItemReader
implementation is dependent upon two sets of
configuration. The first is the configuration of an AmqpTemplate
. You can either
configure this yourself or use the autoconfiguration provided by Spring Boot. See the
Spring Boot AMQP documentation.
Once you have configured the AmqpTemplate
, you can enable the batch capabilities to support it
by setting the following properties:
Property | Type | Default Value | Description |
---|---|---|---|
|
|
|
If |
|
|
|
Indicates if the |
For more information, see the AmqpItemReader
documentation.
FlatFileItemReader
FlatFileItemReader
lets you read from flat files (such as CSVs
and other file formats). To read from a file, you can provide some components
yourself through normal Spring configuration (LineTokenizer
, RecordSeparatorPolicy
,
FieldSetMapper
, LineMapper
, or SkippedLinesCallback
). You can also use the
following properties to configure the reader:
Property | Type | Default Value | Description |
---|---|---|---|
|
|
|
Determines if the state should be saved for restarts. |
|
|
|
Name used to provide unique keys in the |
|
|
|
Maximum number of items to be read from the file. |
|
|
0 |
Number of items that have already been read. Used on restarts. |
|
|
empty List |
A list of Strings that indicate commented lines (lines to be ignored) in the file. |
|
|
|
The resource to be read. |
|
|
|
If set to |
|
|
|
Encoding to be used when reading the file. |
|
|
0 |
Indicates the number of lines to skip at the start of a file. |
|
|
|
Indicates whether the file is a delimited file (CSV and other formats). Only one of this property or |
|
|
|
If reading a delimited file, indicates the delimiter to parse on. |
|
|
|
Used to determine the character used to quote values. |
|
|
empty list |
A list of indices to determine which fields in a record to include in the item. |
|
|
|
Indicates if a file’s records are parsed by column numbers. Only one of this property or |
|
|
empty list |
List of column ranges by which to parse a fixed width record. See the Range documentation. |
|
|
|
List of names for each field parsed from a record. These names are the keys in the |
|
|
|
If set to |
See the FlatFileItemReader
documentation.
JdbcCursorItemReader
The JdbcCursorItemReader
runs a query against a relational database and iterates over
the resulting cursor (ResultSet
) to provide the resulting items. This autoconfiguration
lets you provide a PreparedStatementSetter
, a RowMapper
, or both. You
can also use the following properties to configure a JdbcCursorItemReader
:
Property | Type | Default Value | Description |
---|---|---|---|
|
|
|
Determines whether the state should be saved for restarts. |
|
|
|
Name used to provide unique keys in the |
|
|
|
Maximum number of items to be read from the file. |
|
|
0 |
Number of items that have already been read. Used on restarts. |
|
|
A hint to the driver to indicate how many records to retrieve per call to the database system. For best performance, you usually want to set it to match the chunk size. |
|
|
|
Maximum number of items to read from the database. |
|
|
|
Number of milliseconds for the query to timeout. |
|
|
|
|
Determines whether the reader should ignore SQL warnings when processing. |
|
|
|
Indicates whether the cursor’s position should be verified after each read to verify that the |
|
|
|
Indicates whether the driver supports absolute positioning of a cursor. |
|
|
|
Indicates whether the connection is shared with other processing (and is therefore part of a transaction). |
|
|
|
SQL query from which to read. |
You can also specify JDBC DataSource specifically for the reader by using the following properties:
.JdbcCursorItemReader
Properties
Property | Type | Default Value | Description |
---|---|---|---|
|
|
|
Determines whether |
|
|
|
JDBC URL of the database. |
|
|
|
Login username of the database. |
|
|
|
Login password of the database. |
|
|
|
Fully qualified name of the JDBC driver. |
The default DataSource will be used by the JDBCCursorItemReader if the jdbccursoritemreader_datasource is not specified.
|
See the JdbcCursorItemReader
documentation.
KafkaItemReader
Ingesting a partition of data from a Kafka topic is useful and exactly what the
KafkaItemReader
can do. To configure a KafkaItemReader
, two pieces
of configuration are required. First, configuring Kafka with Spring Boot’s Kafka
autoconfiguration is required (see the
Spring Boot Kafka documentation).
Once you have configured the Kafka properties from Spring Boot, you can configure the KafkaItemReader
itself by setting the following properties:
Property | Type | Default Value | Description |
---|---|---|---|
|
|
|
Name used to provide unique keys in the |
|
|
|
Name of the topic from which to read. |
|
|
empty list |
List of partition indices from which to read. |
|
|
30 |
Timeout for the |
|
|
|
Determines whether the state should be saved for restarts. |
See the KafkaItemReader
documentation.
Native Compilation
The advantage of Single Step Batch Processing is that it lets you dynamically select which reader and writer beans to use at runtime when you use the JVM. However, when you use native compilation, you must determine the reader and writer at build time instead of runtime. The following example does so:
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
<executions>
<execution>
<id>process-aot</id>
<goals>
<goal>process-aot</goal>
</goals>
<configuration>
<jvmArguments>
-Dspring.batch.job.flatfileitemreader.name=fooReader
-Dspring.batch.job.flatfileitemwriter.name=fooWriter
</jvmArguments>
</configuration>
</execution>
</executions>
</plugin>
ItemProcessor Configuration
The single-step batch job autoconfiguration accepts an ItemProcessor
if one
is available within the ApplicationContext
. If one is found of the correct type
(ItemProcessor<Map<String, Object>, Map<String, Object>>
), it is autowired
into the step.
Autoconfiguration for ItemWriter implementations
This starter provides autoconfiguration for ItemWriter
implementations that
match the supported ItemReader
implementations: AmqpItemWriter
,
FlatFileItemWriter
, JdbcItemWriter
, and KafkaItemWriter
. This section
covers how to use autoconfiguration to configure a supported ItemWriter
.
AmqpItemWriter
To write to a RabbitMQ queue, you need two sets of configuration. First, you need an
AmqpTemplate
. The easiest way to get this is by using Spring Boot’s
RabbitMQ autoconfiguration. See the Spring Boot AMQP documentation.
Once you have configured the AmqpTemplate
, you can configure the AmqpItemWriter
by setting the
following properties:
Property | Type | Default Value | Description |
---|---|---|---|
|
|
|
If |
|
|
|
Indicates whether |
FlatFileItemWriter
To write a file as the output of the step, you can configure FlatFileItemWriter
.
Autoconfiguration accepts components that have been explicitly configured (such as LineAggregator
,
FieldExtractor
, FlatFileHeaderCallback
, or a FlatFileFooterCallback
) and
components that have been configured by setting the following properties specified:
Property | Type | Default Value | Description |
---|---|---|---|
|
|
|
The resource to be read. |
|
|
|
Indicates whether the output file is a delimited file. If |
|
|
|
Indicates whether the output file a formatted file. If |
|
|
|
The format used to generate the output for a formatted file. The formatting is performed by using |
|
|
|
The |
|
|
0 |
Max length of the record. If 0, the size is unbounded. |
|
|
0 |
The minimum record length. |
|
|
|
The |
|
|
|
Encoding to use when writing the file. |
|
|
|
Indicates whether a file should be force-synced to the disk on flush. |
|
|
|
List of names for each field parsed from a record. These names are the keys in the |
|
|
|
Indicates whether a file should be appended to if the output file is found. |
|
|
|
What |
|
|
|
Name used to provide unique keys in the |
|
|
|
Determines whether the state should be saved for restarts. |
|
|
|
If set to |
|
|
|
If set to |
|
|
|
Indicates whether the reader is a transactional queue (indicating that the items read are returned to the queue upon a failure). |
See the FlatFileItemWriter
documentation.
JdbcBatchItemWriter
To write the output of a step to a relational database, this starter provides the ability
to autoconfigure a JdbcBatchItemWriter
. The autoconfiguration lets you provide your
own ItemPreparedStatementSetter
or ItemSqlParameterSourceProvider
and
configuration options by setting the following properties:
Property | Type | Default Value | Description |
---|---|---|---|
|
|
|
Name used to provide unique keys in the |
|
|
|
The SQL used to insert each item. |
|
|
|
Whether to verify that every insert results in the update of at least one record. |
You can also specify JDBC DataSource specifically for the writer by using the following properties:
.JdbcBatchItemWriter
Properties
Property | Type | Default Value | Description |
---|---|---|---|
|
|
|
Determines whether |
|
|
|
JDBC URL of the database. |
|
|
|
Login username of the database. |
|
|
|
Login password of the database. |
|
|
|
Fully qualified name of the JDBC driver. |
The default DataSource will be used by the JdbcBatchItemWriter if the jdbcbatchitemwriter_datasource is not specified.
|
See the JdbcBatchItemWriter
documentation.
KafkaItemWriter
To write step output to a Kafka topic, you need KafkaItemWriter
. This starter
provides autoconfiguration for a KafkaItemWriter
by using facilities from two places.
The first is Spring Boot’s Kafka autoconfiguration. (See the Spring Boot Kafka documentation.)
Second, this starter lets you configure two properties on the writer.
Property | Type | Default Value | Description |
---|---|---|---|
|
|
|
The Kafka topic to which to write. |
|
|
|
Whether the items being passed to the writer are all to be sent as delete events to the topic. |
For more about the configuration options for the KafkaItemWriter
, see the KafkaItemWiter
documentation.
Spring AOT
When using Spring AOT with Single Step Batch Starter you must set the reader and
writer name properties at compile time (unless you create a bean(s) for the reader and or writer).
To do this you must include the name of the reader and writer that you wish to use as
and argument or environment variable in the boot maven plugin or gradle plugin. For example if
you wish to enable the FlatFileItemReader
and FlatFileItemWriter
in Maven it would look like:
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
<executions>
<execution>
<id>process-aot</id>
<goals>
<goal>process-aot</goal>
</goals>
</execution>
</executions>
<configuration>
<arguments>
<argument>--spring.batch.job.flatfileitemreader.name=foobar</argument>
<argument>--spring.batch.job.flatfileitemwriter.name=fooWriter</argument>
</arguments>
</configuration>
</plugin>