Increase the efficiency of chunk processing by executing it asynchronously, in multiple threads, while maintaining the transactional integrity of the chunk.
The vanilla case proceeds as for normal chunk processing, but:
If there is an exception in one of the record processing threads, the whole chunk should roll back:
If there is a timeout during a chunk, it might happen before the chunk has finished, or while waiting for the processes to complete before exiting.
A "normal" local transaction is thread bound - i.e. it only executes in one thread. If the code inside the transaction creates new threads, then they might not finish processing before the parent exits and the transaction wants to finish. The transaction needs to wait for the sub-processes before committing, or (more difficult) rolling back. The rollback case basically forces us to a model of one transaction per thread, and therefore to one transaction per data item in a concurrent environment.
Otherwise some transactional semantics might be respected in a parallel process, but others certainly will not be, because synchronizations and resources are managed at the level of the thread where the transaction started. If the transaction manager is a local one (not XA), there is little hope that even the datasource resource would be the same for all the parallel threads and the parent method.
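The "wait for the sub-processes before committing or rolling back" requirement can be sketched roughly as follows. This is only an illustration under stated assumptions: ConcurrentChunkSketch, Outcome and processChunk are hypothetical names, not framework types, and no real transaction manager is involved. The method only decides whether the chunk as a whole would commit or roll back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

class ConcurrentChunkSketch {

    /** Hypothetical chunk outcome; stands in for a real commit or rollback. */
    enum Outcome { COMMIT, ROLLBACK }

    /**
     * Processes every item of a chunk on a worker pool, then waits for all
     * workers (up to a shared deadline) before deciding the outcome.
     */
    static <T> Outcome processChunk(List<T> items, Consumer<T> processor,
                                    int threads, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (T item : items) {
                futures.add(pool.submit(() -> processor.accept(item)));
            }
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            boolean failed = false;
            // Keep waiting for the remaining workers even after a failure:
            // the parent must not decide while children are still running.
            for (Future<?> future : futures) {
                try {
                    long remaining = Math.max(deadline - System.nanoTime(), 0);
                    future.get(remaining, TimeUnit.NANOSECONDS);
                } catch (ExecutionException | TimeoutException e) {
                    failed = true; // a worker threw, or the chunk timed out
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    failed = true;
                }
            }
            return failed ? Outcome.ROLLBACK : Outcome.COMMIT;
        } finally {
            pool.shutdownNow(); // interrupt any worker still running after a timeout
        }
    }

    public static void main(String[] args) {
        List<Integer> items = Arrays.asList(1, 2, 3, 4);
        System.out.println(processChunk(items, i -> { }, 2, 5000)); // COMMIT
        System.out.println(processChunk(items, i -> {
            if (i == 3) throw new IllegalStateException("bad record");
        }, 2, 5000)); // ROLLBACK
    }
}
```

Note what the sketch leaves out: after a timeout the workers are only interrupted, so they may still be running when ROLLBACK is returned, and work already done by other threads is not undone. That gap is exactly why the rollback case is the difficult one.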
If we use a global transaction manager to make the parallel processes transactional, how will they know which transaction to participate in? There could be many active chunks, and each would have its own threads - how would each one be able to guide its child processes to participate in the same transaction?
This is the origin of the signature:
public interface ItemReader { Object next(); }
There is no peeking and no iterator-style hasNext. If there is a processing problem, transactional clients of the ItemReader throw an exception after the provider's next() has been called, but in the same thread (so that transactional semantics are preserved and the data provider reverts to its previous state).
This means that the callback interface also picks up an Object return type:
public interface RepeatCallback { Object doInIteration(BatchContext context); }
so we can return an object, which is null when the processing has finished.
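A minimal sketch of how the two signatures fit together, assuming a hypothetical in-memory reader (readerOver) and a hypothetical driver loop (iterate); BatchContext is an empty placeholder here. With a single next() call there is no gap between a hasNext check and the read in which another thread could take the last item.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class NullTerminatedReadSketch {

    // Same shapes as the signatures in the text, nested to keep the sketch in one file.
    interface ItemReader { Object next(); }

    interface RepeatCallback { Object doInIteration(BatchContext context); }

    /** Empty placeholder for the context type (assumption for this sketch). */
    static class BatchContext { }

    /** Hypothetical in-memory reader: returns null once the items are exhausted. */
    static ItemReader readerOver(List<?> items) {
        Iterator<?> it = items.iterator();
        return () -> {
            // Check and read happen under one lock, so the call is atomic.
            synchronized (it) {
                return it.hasNext() ? it.next() : null;
            }
        };
    }

    /** Hypothetical driver: repeats the callback until it returns null. */
    static int iterate(RepeatCallback callback) {
        BatchContext context = new BatchContext();
        int count = 0;
        while (callback.doInIteration(context) != null) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<Object> processed = new ArrayList<>();
        ItemReader reader = readerOver(Arrays.asList("a", "b", "c"));
        int n = iterate(context -> {
            Object item = reader.next();
            if (item != null) {
                processed.add(item); // "process" the item
            }
            return item; // null tells the driver that processing has finished
        });
        System.out.println(n + " items: " + processed); // 3 items: [a, b, c]
    }
}
```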
In the end we decided against the Object return type and went with an exit status to signal that there is no more processing to do.