Spring Batch Features and Roadmap

See also details of main themes of 2.0.

2.0 Features

The following features are supported by Spring Batch 2.0:

Optimisation and Infrastructure

  • RepeatOperations: an abstraction for grouping repeated operations together and moving the iteration logic into the framework.
  • RetryOperations: an abstraction for automatic retry.
  • ItemReader abstraction and implementations for flat files, xml streaming and simple database queries.
  • Flat files are supported with fixed length and delimited records (input and ouput).
  • Xml is supported through Spring OXM mapping between objects and Xml elements (input and ouput). Large files are streamed, not read as a whole.
  • Database implementations of ItemReader are provided that map a row of a ResultSet identified by a simple (single or multiple column) primary key.
  • ItemWriter abstraction and implementations for flat files and xml (the Sql case is just a regular Jdbc Dao).
  • ItemReader and ItemWriter implementations are generally ItemStreams. An ItemStream provides the facility to be restored from a persistent ExecutionContext so that jobs can fail and be restarted in another process.
  • For modifying an item before it is written, there is the ItemProcessor abstraction. ItemProcessor and ItemWriter are the two most common application developer touch points.

Core Domain

  • Job is the root of the core domain - it is a recipe for how to construct and run a JobInstance.
  • A Job is composed of a list of Steps (sequential step model for job).
  • Job is also the entry point for launching a JobExecution.
  • Step is the corresponding point for a StepExecution. Step is the main strategy for different scaling, distribution and processing approaches. The 2.0 release contains implementations for in-process execution (single VM), and a PartitionStep as part of an SPI for remote execution of steps. See below (under Execution).
  • The most commonly used implementation of Step is a wrapper for an ItemReader and an ItemWriter. There is also a special implementation that wraps a Tasklet, which can be used to execute a single action like a stored procedure call.
  • FactoryBeans are provided for creating Step instances with the most common features. See in particular FaultTolerantStepFactoryBean for a factory that provides convenient configuration points for skips and retries.
  • Late binding of environment properties, job parameters and execution context values into a Step when it starts. A custom Spring Scope takes care of deferring the initialization of components until a step is executing.

Job Execution and Management

  • A simple JobLauncher to launch jobs. Start a new one or restart one that has previously failed. This can be used by a command-line or JMX launcher to take simple input parameters and convert them to the form required by the Core. (Examples of both are in the Samples module.)
  • Persistence of job meta data for management and reporting purposes: job and step identifiers, job parameters, commit counts, rollback counts. Execution attributes (a human readable represenation of the state of the job - can be augmented by developers).
  • Adjustiable exception handling strategies allowing fault tolerance through skipping bad records.
  • Concurrent execution of chunks (a chunk is a batch of items processed in the same transaction) through the Spring TaskExecutor abstraction.
  • Automatic retry of a chunk and recovery for items that have exhausted their retry count.
  • Translation of job execution result into an exit code for schedulers running the job as an OS process.
  • A set of listener callbacks that users can implement and register with a Step to add custom behaviour like footer records.
  • Remote chunking of steps. The step proceeds as in the single JVM case, but each chunk is passed on to the remote processes. The remote execution is an asynchronous listener of some sort (e.g. message-driven component or web service). Implemented using Spring Integration in a Batch sub-project (spring-batch-integration).
  • Partitioning - steps execute concurrently and optionally in separate processes. Feedback loop between consumers and producers to prevent overflows. Spring Batch provides an SPI for Partitioning and an implementation for local (multi-threaded, single JVM) execution.
  • OSGi support. Deploy the Spring Batch framework as a set of OSGi services. Deploy individual jobs or groups of jobs as additional bundles that depend on the core. Spring Batch JAR files are also OSGi bundles, and can be deployed easily in SpringSource dm Server.
  • Non-sequential models for Job configuration (branching and descision support).

Samples

  • A range of samples is available as a separate module. They all use a common simple configuration and extend in various ways to show the different features of the Execution module.

Roadmap (Beyond 2.0).

  • Issue tracking - a job is not finished until all issues with its executions are resolved. Spring Batch can provide hooks to integrate with internal issue tracking systems so that the lifetime of a job can be properly managed.
  • Auditing. Implement hooks to monitor not only what jobs execute and the result of the execution (as per 1.0 possibly with some richer options for detailed outcome reports), but also who has executed the job, what changes they made to runtime parameters.

SpringSource Enterprise Batch

  • The plan is for SpringSource to provide an enterprise product that deals with runtime concerns, as opposed to programming and configuration.
  • Triggering. Other runtime concerns, like monitoring and managemtn of jobs and historical executions.