We are still in the milestone release phase (1.0-m5 is just out). This means that we are still adding functionality that we want to be part of a 1.0 release. We do not rule out changes to package and interface names in this phase. That said, we think the basic domain concepts in Spring Batch are sound enough to survive significant refactoring. The bulk of the application developer "touch points" have been stable for quite some time now, and we have several early adopter projects already using snapshot releases.
The process from here is to collect feedback from the community and use that to decide on what extra features need to be added to get us to 1.0. When we are feature complete we will move to the "release candidate" phase, and the first release in that phase will be 1.0-rc1. We only expect one release candidate, but if there is enough demand for new features, or significant problems occur in rc1, then we might need an rc2. We aim for 2 weeks elapsed time between release candidates (and between the last milestone and rc1).
The "layers" described are nicely segregated in terms of dependency. Each layer only depends (at compile time) on layers below it.
We recognised that what we used to call the container layer actually is composed of two distinct contexts, "Core" and "Execution". So the full catalogue of contexts is:
The "execution" layer is fertile ground for collaboration and contributions from the community and from projects in the field. The central interface is JobLauncher, with methods for starting and stopping jobs. The vision for this is that there can be multiple implementations of JobLauncher providing different architectural patterns, and delivering different levels of scalability and robustness, without changing either the business logic or the job configuration.
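To make the idea concrete, here is a minimal sketch of how two launcher implementations could run the same job without the job itself changing. It does not use the Spring Batch API: `SketchJobLauncher`, `SynchronousLauncher`, `ThreadPoolLauncher` and `runWith` are hypothetical names invented for this illustration, and the real JobLauncher method signatures may well differ.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class LauncherSketch {

    /** Hypothetical launcher contract (not the Spring Batch interface): start and stop. */
    interface SketchJobLauncher {
        /** Starts the job and returns a handle to its eventual exit status. */
        Future<String> start(Callable<String> job);

        /** Requests that running jobs stop and releases launcher resources. */
        void stop();
    }

    /** Runs the job in the calling thread; the simplest possible pattern. */
    static final class SynchronousLauncher implements SketchJobLauncher {
        @Override
        public Future<String> start(Callable<String> job) {
            CompletableFuture<String> result = new CompletableFuture<>();
            try {
                result.complete(job.call());
            } catch (Exception e) {
                result.completeExceptionally(e);
            }
            return result;
        }

        @Override
        public void stop() {
            // Nothing to release: the job has already finished when start() returns.
        }
    }

    /** Runs jobs on a thread pool, so the caller is not blocked by start(). */
    static final class ThreadPoolLauncher implements SketchJobLauncher {
        private final ExecutorService pool;

        ThreadPoolLauncher(int threads) {
            this.pool = Executors.newFixedThreadPool(threads);
        }

        @Override
        public Future<String> start(Callable<String> job) {
            return pool.submit(job);
        }

        @Override
        public void stop() {
            pool.shutdownNow(); // interrupts any job that is still running
        }
    }

    /** The same business logic, handed unchanged to whichever launcher is supplied. */
    static String runWith(SketchJobLauncher launcher) {
        Callable<String> job = () -> "COMPLETED"; // stand-in for real business logic
        try {
            return launcher.start(job).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("job did not complete", e);
        } finally {
            launcher.stop();
        }
    }

    public static void main(String[] args) {
        System.out.println(runWith(new SynchronousLauncher()));
        System.out.println(runWith(new ThreadPoolLauncher(2)));
    }
}
```

The point of the sketch is only that `runWith` never needs to know which architectural pattern sits behind the interface.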
CompletionPolicy), rules about how to deal with exceptions (ExceptionHandler), and many others.
Spring Batch and Quartz have different goals. Spring Batch provides functionality for processing large volumes of data, and Quartz provides functionality for scheduling tasks. So Quartz can complement Spring Batch; the two are not mutually exclusive technologies. A common combination would be to use Quartz as a trigger for a Spring Batch job, using a Cron expression and the Spring Core convenience SchedulerFactoryBean.
Use a scheduling tool. There are plenty of them out there. Examples: Quartz, Control-M, Autosys. Quartz doesn't have all the features of Control-M or Autosys - it is supposed to be lightweight. If you want something even more lightweight you can just use the OS (cron, at, etc.).
Simple sequential dependencies can be implemented using the job-steps model of Spring Batch. We think this is quite common. In fact, it makes it easier to correct a common misuse of schedulers: having hundreds of jobs configured, many of which are not independent, but depend only on one other.
StepExecutor can deal with the concern of breaking apart the business logic and sharing it efficiently between parallel processes or processors. There are a number of technologies that could play a role here. The essence is just a set of concurrent remote calls to distributed agents that can handle some business processing. Since the business processing is already typically modularised (e.g. input an item, process it), Spring Batch can strategise the distribution in a number of ways. One implementation that we have had some experience with (and have a prototype for) is a set of remote EJBs handling the business processing. We switch off Home caching in the container and then send a specific range of primary keys for the inputs to each of a number of remote calls. The same basic strategy would work with any of the Spring Remoting protocols (plain RMI, HttpInvoker, JMS, Hessian etc.) with little more than a couple of lines change in the execution layer configuration.
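The key-range strategy described above can be sketched in plain Java. This is an illustration only, not the prototype mentioned: the remote agents are simulated by local threads, and `KeyRange`, `partition`, `processRange` and `processAll` are hypothetical names. In a real deployment `processRange` would be a remote call (EJB, RMI, HttpInvoker and so on) rather than a local loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class KeyRangePartitionSketch {

    /** Inclusive range of primary keys handed to one worker. */
    static final class KeyRange {
        final long first;
        final long last;

        KeyRange(long first, long last) {
            this.first = first;
            this.last = last;
        }
    }

    /** Splits [minKey, maxKey] into at most `parts` contiguous, non-overlapping ranges. */
    static List<KeyRange> partition(long minKey, long maxKey, int parts) {
        if (parts <= 0) {
            throw new IllegalArgumentException("parts must be positive");
        }
        List<KeyRange> ranges = new ArrayList<>();
        long total = maxKey - minKey + 1;
        long size = (total + parts - 1) / parts; // ceiling division
        for (long start = minKey; start <= maxKey; start += size) {
            ranges.add(new KeyRange(start, Math.min(start + size - 1, maxKey)));
        }
        return ranges;
    }

    /** Stand-in for the remote agent: "processes" each key and reports how many it handled. */
    static long processRange(KeyRange range) {
        long processed = 0;
        for (long key = range.first; key <= range.last; key++) {
            processed++; // a real agent would read and process the item with this key
        }
        return processed;
    }

    /** Sends one key range to each worker concurrently and sums the item counts. */
    static long processAll(long minKey, long maxKey, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (KeyRange range : partition(minKey, maxKey, workers)) {
                Callable<Long> call = () -> processRange(range);
                results.add(pool.submit(call));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // wait for each simulated remote call
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("partitioned step failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(processAll(1, 1000, 4));
    }
}
```

Because each worker receives a disjoint range of keys, no two workers touch the same input item, which is what makes the calls safe to run concurrently.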
In a nutshell: A JobConfiguration with a list of StepConfigurations is passed to a JobExecutor. From this a Job is constructed consisting of a series of Steps, each of which is executed by a StepExecutor. The StepExecutor contains all the strategies for deciding when to complete, when to commit, when to abort and when to continue.
Many Jobs in practice consist of a single Step. Even so, using Steps is a best practice for breaking a Job down into logical units, rather than having to execute separate Jobs (potentially in separate OS processes) which have no obvious logical connection.
Jobs can be executed once, or many times with different logical identifiers (JobIdentifier). It is also possible to restart a failed Job with the same or a modified input source, and identify the resulting JobExecution as a separate entity. In this way the progress of a Job and its history of successful and failed executions can easily be tracked. The same argument applies to Steps, which have their corresponding StepExecution entity.
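As a rough illustration of the model above, the following sketch runs a list of steps in order under a job identifier and records each attempt as a separate execution, so that a failed run and its restart both appear in the history. All type names (`SketchStep`, `SketchJobExecution`, `SketchJobExecutor`) are hypothetical and deliberately simplified; they are not the Spring Batch classes. For brevity the restart simply re-runs every step, which a real executor would not necessarily do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

final class JobModelSketch {

    /** Hypothetical step: a name plus the work to perform; execute() returns false on failure. */
    interface SketchStep {
        String name();
        boolean execute();
    }

    /** One attempt at running a job, kept so that its history can be inspected later. */
    static final class SketchJobExecution {
        final String jobIdentifier;
        final List<String> completedSteps = new ArrayList<>();
        boolean successful;

        SketchJobExecution(String jobIdentifier) {
            this.jobIdentifier = jobIdentifier;
        }
    }

    /** Runs steps in order and keeps every execution, grouped by job identifier. */
    static final class SketchJobExecutor {
        private final Map<String, List<SketchJobExecution>> history = new LinkedHashMap<>();

        SketchJobExecution run(String jobIdentifier, List<SketchStep> steps) {
            SketchJobExecution execution = new SketchJobExecution(jobIdentifier);
            history.computeIfAbsent(jobIdentifier, id -> new ArrayList<>()).add(execution);
            for (SketchStep step : steps) {
                if (!step.execute()) {
                    return execution; // stop at the first failed step
                }
                execution.completedSteps.add(step.name());
            }
            execution.successful = true;
            return execution;
        }

        List<SketchJobExecution> historyOf(String jobIdentifier) {
            return history.getOrDefault(jobIdentifier, new ArrayList<>());
        }
    }

    static SketchStep step(String name, BooleanSupplier work) {
        return new SketchStep() {
            @Override public String name() { return name; }
            @Override public boolean execute() { return work.getAsBoolean(); }
        };
    }

    /** Runs a two-step job that fails once, restarts it, and returns both executions. */
    static List<SketchJobExecution> demo() {
        int[] attempts = {0};
        List<SketchStep> steps = Arrays.asList(
                step("load", () -> true),
                step("process", () -> ++attempts[0] > 1)); // fails once, then succeeds
        SketchJobExecutor executor = new SketchJobExecutor();
        executor.run("nightly-job", steps); // first attempt fails in "process"
        executor.run("nightly-job", steps); // restart under the same identifier
        return executor.historyOf("nightly-job");
    }

    public static void main(String[] args) {
        for (SketchJobExecution execution : demo()) {
            System.out.println(execution.jobIdentifier + " successful=" + execution.successful
                    + " completed=" + execution.completedSteps);
        }
    }
}
```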
StepExecutor or (more broadly) execution runtime if the deployment is grid- or cluster-based, or in any way involves multiple OS processes.