Configuring a Step

Despite the relatively short list of required dependencies for a Step, it is an extremely complex class that can potentially contain many collaborators.

When using Java configuration, you can use the Spring Batch builders, as the following example shows:

Java Configuration
/**
 * Note that the JobRepository is typically autowired and does not need to be
 * configured explicitly.
 */
@Bean
public Job sampleJob(JobRepository jobRepository, Step sampleStep) {
    return new JobBuilder("sampleJob", jobRepository)
                .start(sampleStep)
                .build();
}

/**
 * Note that the PlatformTransactionManager is typically autowired and does not need
 * to be configured explicitly.
 */
@Bean
public Step sampleStep(JobRepository jobRepository, // (1)
        PlatformTransactionManager transactionManager) { // (2)
    return new StepBuilder(jobRepository) // (3)
                .<String, String>chunk(10).transactionManager(transactionManager) // (4)
                .reader(itemReader())
                .writer(itemWriter())
                .build();
}
1 repository: The Java-specific name of the JobRepository that periodically stores the StepExecution and ExecutionContext during processing (just before committing).
2 transactionManager: Spring’s PlatformTransactionManager that begins and commits transactions during processing.
3 Step name: When the step is declared as a bean, the name can be omitted and is derived from the method name. However, if the step is not defined as a bean, the name must be passed explicitly to the StepBuilder constructor, as in new StepBuilder("myStep", jobRepository).
4 chunk: The Java-specific name of the dependency that indicates that this is an item-based step and the number of items to be processed before the transaction is committed.
Note that repository defaults to jobRepository (provided through @EnableBatchProcessing) and transactionManager defaults to transactionManager (provided from the application context). The transaction manager is optional and defaults to a ResourcelessTransactionManager. Also, the ItemProcessor is optional, since the item could be directly passed from the reader to the writer.
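
To make the role of the chunk size more concrete, the following plain-Java sketch models an item-based loop: items are read one at a time, optionally transformed, and handed to the writer once the chunk size is reached. It is only an illustration and not Spring Batch's implementation. The names ChunkLoopSketch and runInChunks are hypothetical, and the returned count merely stands in for the number of committed transactions.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical illustration only; this is not how Spring Batch is implemented.
class ChunkLoopSketch {

    /**
     * Reads items one at a time, applies the processor, and passes the results
     * to the writer in chunks of at most chunkSize items. Returns the number of
     * chunks written, which stands in for the number of commits.
     */
    static <I, O> int runInChunks(Iterator<I> reader,
                                  Function<I, O> processor,
                                  Consumer<List<O>> writer,
                                  int chunkSize) {
        int commits = 0;
        List<O> chunk = new ArrayList<>(chunkSize);
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == chunkSize) {
                // A real step would commit the transaction after this write.
                writer.accept(new ArrayList<>(chunk));
                chunk.clear();
                commits++;
            }
        }
        if (!chunk.isEmpty()) {
            // Final, partially filled chunk.
            writer.accept(new ArrayList<>(chunk));
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        List<String> items = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            items.add("item-" + i);
        }
        List<Integer> chunkSizes = new ArrayList<>();
        // Function.identity() models a step without an ItemProcessor.
        int commits = runInChunks(items.iterator(), Function.identity(),
                c -> chunkSizes.add(c.size()), 10);
        System.out.println(commits + " commits, chunk sizes " + chunkSizes);
        // prints "3 commits, chunk sizes [10, 10, 5]"
    }
}
```

With 25 items and a chunk size of 10, the writer is called three times, the last time with the remaining five items.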

To ease configuration, you can use the Spring Batch XML namespace, as the following example shows:

XML Configuration
<job id="sampleJob" job-repository="jobRepository"> <!-- (2) -->
    <step id="step1">
        <tasklet transaction-manager="transactionManager"> <!-- (1) -->
            <chunk reader="itemReader" writer="itemWriter" commit-interval="10"/> <!-- (3) -->
        </tasklet>
    </step>
</job>
1 transaction-manager: Spring’s PlatformTransactionManager that begins and commits transactions during processing.
2 job-repository: The XML-specific name of the JobRepository that periodically stores the StepExecution and ExecutionContext during processing (just before committing). For an in-line <step/> (one defined within a <job/>), it is an attribute on the <job/> element. For a standalone <step/>, it is defined as an attribute of the <tasklet/>.
3 commit-interval: The XML-specific name of the number of items to be processed before the transaction is committed.
Note that job-repository defaults to jobRepository and transaction-manager defaults to transactionManager. Also, the ItemProcessor is optional, since the item could be directly passed from the reader to the writer.

The preceding configuration includes the only required dependencies to create an item-oriented step:

  • reader: The ItemReader that provides items for processing.

  • writer: The ItemWriter that processes the items provided by the ItemReader.

The transaction manager used in the step can differ from the one used by the job repository. The caveat is that the job repository and the processing database then do not take part in the same transaction, so if a failure occurs after processing but before the job repository is updated, the step could be re-executed and process the same items twice. You can mitigate this through idempotent processing or external transaction management (for example, JTA).
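
One way to picture idempotent processing is a writer that upserts by a stable business key, so that replaying a chunk after a failure does not create duplicates. The following is a minimal sketch under that assumption. IdempotentWriterSketch and Payment are hypothetical names, and an in-memory map stands in for the processing database. It does not use the Spring Batch ItemWriter API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; an in-memory map stands in for the database.
class IdempotentWriterSketch {

    /** Hypothetical item type identified by a stable business key. */
    record Payment(String id, long amountCents) {}

    private final Map<String, Payment> store = new LinkedHashMap<>();

    /** Writes a chunk as upserts, so replaying the same chunk leaves the store unchanged. */
    void write(List<Payment> chunk) {
        for (Payment payment : chunk) {
            // Upsert by key instead of a blind insert.
            store.put(payment.id(), payment);
        }
    }

    int size() {
        return store.size();
    }

    long totalCents() {
        return store.values().stream().mapToLong(Payment::amountCents).sum();
    }

    public static void main(String[] args) {
        IdempotentWriterSketch writer = new IdempotentWriterSketch();
        List<Payment> chunk = List.of(new Payment("p-1", 500), new Payment("p-2", 250));
        writer.write(chunk);
        // Simulates the step being re-executed after the job repository update was lost.
        writer.write(chunk);
        System.out.println(writer.size() + " rows, total " + writer.totalCents() + " cents");
        // prints "2 rows, total 750 cents"
    }
}
```

Because the second write overwrites the same keys, the stored total stays at 750 cents instead of doubling.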