Appendix A: Meta-Data Schema

Overview

The Spring Batch metadata tables closely match the domain objects that represent them in Java. For example, JobInstance, JobExecution, JobParameters, and StepExecution map to BATCH_JOB_INSTANCE, BATCH_JOB_EXECUTION, BATCH_JOB_EXECUTION_PARAMS, and BATCH_STEP_EXECUTION, respectively. ExecutionContext maps to both BATCH_JOB_EXECUTION_CONTEXT and BATCH_STEP_EXECUTION_CONTEXT. The JobRepository is responsible for saving each Java object into its correct table.

This appendix describes the metadata tables in detail, along with many of the design decisions that were made when creating them. When viewing the table creation statements described later in this appendix, note that the data types used are as generic as possible. Spring Batch provides many schemas as examples. Their data types vary because individual database vendors handle data types differently. The following image shows an ERD model of all six tables and their relationships to one another:

Figure 1. Spring Batch Meta-Data ERD

Example DDL Scripts

The Spring Batch Core JAR file contains example scripts to create the relational tables for a number of database platforms (which are, in turn, auto-detected by the job repository factory bean or namespace equivalent). These scripts can be used as is or modified with additional indexes and constraints, as desired. The file names are in the form schema-*.sql, where * is the short name of the target database platform. The scripts are in the package org.springframework.batch.core.
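As a minimal sketch of the naming convention described above, the following helper builds the classpath location of a schema script from a platform short name. The class and method names are hypothetical and not part of Spring Batch, and `mysql` and `postgresql` serve only as example short names. Check the JAR for the scripts that your version actually ships.

```java
import java.util.Locale;

// Hypothetical helper, not part of Spring Batch: builds the classpath
// location of a schema script from the naming convention schema-*.sql.
final class SchemaScriptLocator {

    // Package that holds the scripts, expressed as a classpath directory.
    private static final String BASE = "org/springframework/batch/core/";

    private SchemaScriptLocator() {
    }

    // Turns a platform short name into the full classpath location.
    static String scriptFor(String platform) {
        if (platform == null || platform.trim().isEmpty()) {
            throw new IllegalArgumentException("platform short name is required");
        }
        return BASE + "schema-" + platform.trim().toLowerCase(Locale.ROOT) + ".sql";
    }

    public static void main(String[] args) {
        System.out.println(scriptFor("mysql"));
        System.out.println(scriptFor("postgresql"));
    }
}
```

The resulting path could be handed to a classpath loader, such as ClassLoader.getResourceAsStream, to read the script before running or adapting it.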

Migration DDL Scripts

Spring Batch provides migration DDL scripts that you need to execute when you upgrade versions. These scripts can be found in the Core JAR file under org/springframework/batch/core/migration. Migration scripts are organized into folders named after the version in which they were introduced:

2.2: Contains scripts you need to migrate from a version before 2.2 to version 2.2
4.1: Contains scripts you need to migrate from a version before 4.1 to version 4.1

Version

Many of the database tables discussed in this appendix contain a version column. This column is important, because Spring Batch employs an optimistic locking strategy when dealing with updates to the database. This means that each time a record is “touched” (updated), the value in the version column is incremented by one. When the repository later tries to save the record, it throws an OptimisticLockingFailureException if the version number has changed in the meantime, which indicates a concurrent access error. This check is necessary because, even though different batch jobs may be running on different machines, they all use the same database tables.
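The version check can be illustrated with a small in-memory sketch. It is not Spring Batch code: the class and method names are hypothetical, IllegalStateException stands in for OptimisticLockingFailureException, and the SQL shape quoted in the comment is an assumption about how such an update is typically written.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for a metadata table with a VERSION column.
// All names here are hypothetical; this is not Spring Batch code.
final class VersionedStore {

    // One row: a status value plus the version counter.
    static final class Row {
        final String status;
        final long version;

        Row(String status, long version) {
            this.status = status;
            this.version = version;
        }
    }

    private final Map<Long, Row> rows = new HashMap<>();

    synchronized void insert(long id, String status) {
        rows.put(id, new Row(status, 0L));
    }

    synchronized Row read(long id) {
        return rows.get(id);
    }

    // Assumed SQL equivalent:
    //   UPDATE ... SET VERSION = VERSION + 1 WHERE ID = ? AND VERSION = ?
    // The write succeeds only if nobody bumped the version since the read.
    synchronized void update(long id, long expectedVersion, String newStatus) {
        Row current = rows.get(id);
        if (current == null || current.version != expectedVersion) {
            throw new IllegalStateException(
                    "stale version " + expectedVersion + " for row " + id);
        }
        rows.put(id, new Row(newStatus, current.version + 1));
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        store.insert(1L, "STARTING");
        Row seenByA = store.read(1L); // both writers read version 0
        Row seenByB = store.read(1L);
        store.update(1L, seenByA.version, "STARTED"); // A wins, version becomes 1
        try {
            store.update(1L, seenByB.version, "FAILED"); // B still holds version 0
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

The second writer is rejected rather than silently overwriting the first writer's change, which is the behavior the version column exists to guarantee.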

Identity

BATCH_JOB_INSTANCE, BATCH_JOB_EXECUTION, and BATCH_STEP_EXECUTION each contain columns ending in _ID. These fields act as primary keys for their respective tables. However, they are not database-generated keys. Rather, they are generated by separate sequences. This is necessary because, after one of the domain objects is inserted into the database, the key it is given needs to be set on the actual object so that the object can be uniquely identified in Java. Newer database drivers (JDBC 3.0 and up) support this feature with database-generated keys. However, rather than require that feature, sequences are used. Each variation of the schema contains some form of the following statements:

CREATE SEQUENCE BATCH_STEP_EXECUTION_SEQ;
CREATE SEQUENCE BATCH_JOB_EXECUTION_SEQ;
CREATE SEQUENCE BATCH_JOB_SEQ;

Many database vendors do not support sequences. In these cases, workarounds are used, such as the following statements for MySQL:

CREATE TABLE BATCH_STEP_EXECUTION_SEQ (ID BIGINT NOT NULL) ENGINE=InnoDB;
INSERT INTO BATCH_STEP_EXECUTION_SEQ values(0);
CREATE TABLE BATCH_JOB_EXECUTION_SEQ (ID BIGINT NOT NULL) ENGINE=InnoDB;
INSERT INTO BATCH_JOB_EXECUTION_SEQ values(0);
CREATE TABLE BATCH_JOB_SEQ (ID BIGINT NOT NULL) ENGINE=InnoDB;
INSERT INTO BATCH_JOB_SEQ values(0);

In the preceding case, a table is used in place of each sequence. The Spring core class, MySQLMaxValueIncrementer, then increments the single column in this table to provide similar functionality.
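The idea of a table standing in for a sequence can be sketched in memory as follows. This is an illustration only, with hypothetical names: it models the single ID column with a counter and does not reproduce how MySQLMaxValueIncrementer talks to the database.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory model of a one-row, one-column sequence table such
// as BATCH_JOB_SEQ. It is not the MySQLMaxValueIncrementer implementation.
final class TableBackedSequence {

    // Plays the role of the single ID column, seeded with 0 like the
    // "INSERT INTO ... values(0)" statements in the schema.
    private final AtomicLong idColumn = new AtomicLong(0L);

    // Atomically bumps the stored value and returns it, which is what a
    // native sequence's "next value" operation would provide.
    long nextId() {
        return idColumn.incrementAndGet();
    }

    public static void main(String[] args) {
        TableBackedSequence jobSeq = new TableBackedSequence();
        long first = jobSeq.nextId();
        long second = jobSeq.nextId();
        System.out.println(first + ", " + second);
    }
}
```

Because the increment and the read happen as one atomic step, two callers never receive the same key, which is the property the primary key columns depend on.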

The BATCH_JOB_INSTANCE Table

The BATCH_JOB_INSTANCE table holds all information relevant to a JobInstance and serves as the top of the overall hierarchy. The following generic DDL statement is used to create it:

CREATE TABLE BATCH_JOB_INSTANCE (
  JOB_INSTANCE_ID BIGINT PRIMARY KEY,
  VERSION BIGINT,
  JOB_NAME VARCHAR(100) NOT NULL,
  JOB_KEY VARCHAR(32) NOT NULL
);

The following list describes each column in the table:

JOB_INSTANCE_ID: The unique ID that identifies the instance. It is also the primary key. The value of this column should be obtainable by calling the getId method on JobInstance.
VERSION: See Version.
JOB_NAME: Name of the job obtained from the Job object. Because it is required to identify the instance, it must not be null.
JOB_KEY: A serialization of the JobParameters that uniquely identifies separate instances of the same job from one another. (JobInstances with the same job name must have different JobParameters and, thus, different JOB_KEY values.)
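To show how a set of parameters could be reduced to a fixed-length value that fits a VARCHAR(32) column, here is a minimal sketch. The MD5 digest and the "name=value;" layout are assumptions chosen for illustration, and the class is hypothetical. It should not be read as the algorithm Spring Batch uses.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical key generator. The MD5 choice and the "name=value;" layout
// are assumptions for illustration, not a statement of what Spring Batch does.
final class JobKeySketch {

    private JobKeySketch() {
    }

    static String keyFor(Map<String, String> identifyingParams) {
        // Sort by parameter name so insertion order cannot change the key.
        StringBuilder canonical = new StringBuilder();
        for (Map.Entry<String, String> e : new TreeMap<>(identifyingParams).entrySet()) {
            canonical.append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(canonical.toString().getBytes(StandardCharsets.UTF_8));
            // 16 bytes rendered as 32 zero-padded hexadecimal characters.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException("MD5 is not available", ex);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new TreeMap<>();
        params.put("run.date", "2024-01-31");
        params.put("input.file", "orders.csv");
        System.out.println(keyFor(params));
    }
}
```

Two parameter sets with the same names and values yield the same key regardless of order, while any differing value yields a different key, which matches the uniqueness rule stated for JOB_KEY.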

The BATCH_JOB_EXECUTION_PARAMS Table

The BATCH_JOB_EXECUTION_PARAMS table holds all information relevant to the JobParameters object. It contains zero or more key/value pairs passed to a Job and serves as a record of the parameters with which a job was run. For each parameter that contributes to the generation of a job’s identity, the IDENTIFYING flag is set to true. Note that the table has been denormalized. Rather than creating a separate table for each type, there is one table with a column indicating the type, as the following listing shows:

CREATE TABLE BATCH_JOB_EXECUTION_PARAMS (
  JOB_EXECUTION_ID BIGINT NOT NULL,
  TYPE_CD VARCHAR(6) NOT NULL,
  KEY_NAME VARCHAR(100) NOT NULL,
  STRING_VAL VARCHAR(250),
  DATE_VAL DATETIME DEFAULT NULL,
  LONG_VAL BIGINT,
  DOUBLE_VAL DOUBLE PRECISION,
  IDENTIFYING CHAR(1) NOT NULL,
  CONSTRAINT JOB_EXEC_PARAMS_FK FOREIGN KEY (JOB_EXECUTION_ID)
  REFERENCES BATCH_JOB_EXECUTION(JOB_EXECUTION_ID)
);
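The single-table layout above can be pictured as one row per parameter, with the type column naming which value column is populated. The following sketch models that mapping. The class is hypothetical, and both the TYPE_CD strings (STRING, DATE, LONG, DOUBLE) and the 'Y'/'N' encoding of the IDENTIFYING flag are assumptions made for this example.

```java
import java.util.Date;

// Hypothetical row model for BATCH_JOB_EXECUTION_PARAMS. The TYPE_CD strings
// and the 'Y'/'N' flag encoding are assumptions made for this sketch.
final class ParamRow {
    final long jobExecutionId;
    final String typeCd;
    final String keyName;
    final String stringVal;
    final Date dateVal;
    final Long longVal;
    final Double doubleVal;
    final char identifying;

    private ParamRow(long jobExecutionId, String typeCd, String keyName, String stringVal,
            Date dateVal, Long longVal, Double doubleVal, boolean identifying) {
        this.jobExecutionId = jobExecutionId;
        this.typeCd = typeCd;
        this.keyName = keyName;
        this.stringVal = stringVal;
        this.dateVal = dateVal;
        this.longVal = longVal;
        this.doubleVal = doubleVal;
        this.identifying = identifying ? 'Y' : 'N';
    }

    // Fills exactly one value column, chosen by the runtime type of the value,
    // and leaves the other value columns null.
    static ParamRow of(long id, String keyName, Object value, boolean identifying) {
        if (value instanceof String) {
            return new ParamRow(id, "STRING", keyName, (String) value, null, null, null, identifying);
        }
        if (value instanceof Date) {
            return new ParamRow(id, "DATE", keyName, null, (Date) value, null, null, identifying);
        }
        if (value instanceof Long) {
            return new ParamRow(id, "LONG", keyName, null, null, (Long) value, null, identifying);
        }
        if (value instanceof Double) {
            return new ParamRow(id, "DOUBLE", keyName, null, null, null, (Double) value, identifying);
        }
        throw new IllegalArgumentException("unsupported parameter type for " + keyName);
    }

    public static void main(String[] args) {
        ParamRow row = ParamRow.of(1L, "chunk.size", 100L, true);
        System.out.println(row.typeCd + " " + row.keyName + " " + row.longVal + " " + row.identifying);
    }
}
```

The trade-off of this design is that three of the four value columns are always null for any given row, in exchange for a single table and a single foreign key to BATCH_JOB_EXECUTION.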