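Before we dive deeper into the details of creating Tasks, we need to understand the typical lifecycle for tasks in the context of Spring Cloud Data Flow: creating a task application, registering it, creating a task definition, launching the task, reviewing task executions, and destroying the task definition.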
While Spring Cloud Task does provide a number of out-of-the-box applications (via the spring-cloud-task-app-starters), most task applications will be custom developed. To create a custom task application:
Create a new project via Spring Initializr, using either the website or your IDE, making sure to select the following starters:
Cloud Task - This dependency is the spring-cloud-starter-task.
JDBC - This is the dependency for the spring-jdbc starter.
Within your new project, create a class that will serve as your main class:
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.task.configuration.EnableTask;

// @EnableTask turns on Spring Cloud Task so each run is recorded in the task repository.
@EnableTask
@SpringBootApplication
public class MyTask {

    public static void main(String[] args) {
        SpringApplication.run(MyTask.class, args);
    }
}
Add one or more CommandLineRunner or ApplicationRunner implementations within your application. You can either implement your own or use the ones provided by Spring Boot (there is one for running batch jobs, for example).
Register a Task App with the App Registry using the Spring Cloud Data Flow Shell app register command. You must provide a unique name and a URI that can be resolved to the app artifact. For the type, specify "task". Here are a few examples:
dataflow:>app register --name task1 --type task --uri maven://com.example:mytask:1.0.2
dataflow:>app register --name task2 --type task --uri file:///Users/example/mytask-1.0.2.jar
dataflow:>app register --name task3 --type task --uri http://example.com/mytask-1.0.2.jar
When providing a URI with the maven scheme, the format should conform to the following:
maven://<groupId>:<artifactId>[:<extension>[:<classifier>]]:<version>
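To make the optional parts of this format concrete, here is a minimal sketch of how such a URI could be split into its coordinates. MavenCoordinates is a hypothetical helper written for this illustration, not a class from Spring Cloud Data Flow, and it leaves the extension and classifier empty when they are absent rather than guessing what the server would default to.

```java
// Hypothetical helper for illustration; not part of Spring Cloud Data Flow.
final class MavenCoordinates {
    static final String SCHEME = "maven://";

    final String groupId;
    final String artifactId;
    final String extension;   // empty when the URI omits it
    final String classifier;  // empty when the URI omits it
    final String version;

    private MavenCoordinates(String groupId, String artifactId, String extension,
                             String classifier, String version) {
        this.groupId = groupId;
        this.artifactId = artifactId;
        this.extension = extension;
        this.classifier = classifier;
        this.version = version;
    }

    static MavenCoordinates parse(String uri) {
        if (uri == null || !uri.startsWith(SCHEME)) {
            throw new IllegalArgumentException("Not a maven:// URI: " + uri);
        }
        // Keep trailing empty strings (-1) so "a:b:" is rejected instead of silently shortened.
        String[] parts = uri.substring(SCHEME.length()).split(":", -1);
        if (parts.length < 3 || parts.length > 5) {
            throw new IllegalArgumentException("Expected 3 to 5 colon-separated parts: " + uri);
        }
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Empty coordinate in: " + uri);
            }
        }
        // The version is always last; extension and classifier are optional middle parts.
        String extension = parts.length >= 4 ? parts[2] : "";
        String classifier = parts.length == 5 ? parts[3] : "";
        return new MavenCoordinates(parts[0], parts[1], extension, classifier,
                parts[parts.length - 1]);
    }

    public static void main(String[] args) {
        MavenCoordinates c = parse("maven://com.example:mytask:1.0.2");
        System.out.println(c.groupId + ":" + c.artifactId + ":" + c.version);
    }
}
```

Because the version always comes last, a three-part URI carries only groupId, artifactId, and version, while a fourth or fifth part is read as the extension and then the classifier.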
If you would like to register multiple apps at one time, you can store them in a properties file where the keys are formatted as <type>.<name> and the values are the URIs. For example, this would be a valid properties file:
task.foo=file:///tmp/foo.jar
task.bar=file:///tmp/bar.jar
Then use the app import command and provide the location of the properties file via --uri:
app import --uri file:///tmp/task-apps.properties
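As a rough illustration of how the <type>.<name> keys map onto individual registrations, the following sketch reads such a properties file and prints the app register command that corresponds to each entry. AppImportFile is a hypothetical class written for this example. It shows the key convention only and makes no claim about how the server implements app import.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of Spring Cloud Data Flow.
final class AppImportFile {

    // Turns each "<type>.<name>=<uri>" entry into the matching "app register" command.
    static List<String> toRegisterCommands(String fileContent) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileContent));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> commands = new ArrayList<String>();
        // Sort the keys so the output order is deterministic.
        for (String key : new TreeSet<String>(props.stringPropertyNames())) {
            int dot = key.indexOf('.');
            if (dot <= 0 || dot == key.length() - 1) {
                throw new IllegalArgumentException("Key is not <type>.<name>: " + key);
            }
            String type = key.substring(0, dot);
            String name = key.substring(dot + 1);
            commands.add("app register --name " + name + " --type " + type
                    + " --uri " + props.getProperty(key));
        }
        return commands;
    }

    public static void main(String[] args) {
        String content = "task.foo=file:///tmp/foo.jar\ntask.bar=file:///tmp/bar.jar\n";
        for (String command : toRegisterCommands(content)) {
            System.out.println(command);
        }
    }
}
```

The sketch splits each key at the first dot, so everything after the type prefix is treated as the app name.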
For convenience, static files listing the application URIs (for both Maven and Docker) are available for all the out-of-the-box Task app-starters. You can point to one of these files and import all the application URIs in bulk. Otherwise, as explained in the previous paragraphs, you can register them individually or maintain your own custom properties file containing only the application URIs you need. It is recommended, however, to keep a "focused" list of desired application URIs in a custom properties file.
List of available static property files:
Artifact Type | Stable Release | SNAPSHOT Release |
---|---|---|
Maven | http://bit.ly/Belmont-GA-task-applications-maven | http://bit.ly/Belmont-BUILD-SNAPSHOT-task-applications-maven |
Docker | | http://bit.ly/Belmont-BUILD-SNAPSHOT-task-applications-docker |
For example, if you would like to register all out-of-the-box task applications in bulk, you can do so with the following command:
dataflow:>app import --uri http://bit.ly/Belmont-GA-task-applications-maven
You can also pass the --local option (which is true by default) to indicate whether the properties file location should be resolved within the shell process itself. If the location should be resolved from the Data Flow Server process, specify --local false.
When using either app register or app import, if a task app is already registered with the provided name, it will not be overridden by default. If you would like to override the pre-existing task app, include the --force option.
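The override rule can be pictured with a small in-memory sketch. InMemoryAppRegistry is a hypothetical class used only to illustrate the rule. The real App Registry is a server-side component, and the way it reports a skipped registration may differ from the boolean returned here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the App Registry, for illustration only.
final class InMemoryAppRegistry {
    private final Map<String, String> uriByName = new HashMap<String, String>();

    // Returns true if the URI was stored, false if an existing entry was kept.
    boolean register(String name, String uri, boolean force) {
        if (!force && uriByName.containsKey(name)) {
            return false; // already registered and --force was not given
        }
        uriByName.put(name, uri);
        return true;
    }

    String uriOf(String name) {
        return uriByName.get(name);
    }

    public static void main(String[] args) {
        InMemoryAppRegistry registry = new InMemoryAppRegistry();
        registry.register("task1", "maven://com.example:mytask:1.0.2", false);
        // Without force, the second registration leaves the first URI in place.
        registry.register("task1", "file:///Users/example/mytask-1.0.2.jar", false);
        System.out.println(registry.uriOf("task1"));
    }
}
```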
Note | |
---|---|
In some cases the Resource is resolved on the server side, whereas in others the URI will be passed to a runtime container instance where it is resolved. Consult the specific documentation of each Data Flow Server for more detail. |
Create a Task Definition from a Task App by providing a definition name as well as properties that apply to the task execution. A task definition can be created via the RESTful API or the shell. To create one using the shell, use the task create command. For example:
dataflow:>task create mytask --definition "timestamp --format=\"yyyy\""
Created new task 'mytask'
A listing of the current task definitions can be obtained via the RESTful API or the shell. To get the task definition list using the shell, use the task list command.
An ad hoc task can be launched via the RESTful API or via the shell. To launch an ad hoc task via the shell, use the task launch command. For example:
dataflow:>task launch mytask
Launched task 'mytask'
When a task is launched, any properties that need to be passed as command-line arguments to the task application can be set as follows:
dataflow:>task launch mytask --arguments "--server.port=8080,--foo=bar"
Additional properties meant for a TaskLauncher itself can be passed in using the --properties option. The format of this option is a comma-delimited string of properties prefixed with app.<task definition name>.<property>. Properties are passed to the TaskLauncher as application properties, and it is up to the implementation to choose how they are passed into the actual task application. If a property is prefixed with deployer instead of app, it is passed to the TaskLauncher as a deployment property, and its meaning may be specific to the TaskLauncher implementation.
dataflow:>task launch mytask --properties "deployer.timestamp.foo1=bar1,app.timestamp.foo2=bar2"
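The following sketch shows one way the value of --properties could be split into the two groups described above. LaunchProperties is a hypothetical class written for this illustration. It splits naively on commas, so values that themselves contain commas are not handled, and rejecting keys with any other prefix is a choice made for the sketch rather than documented server behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Spring Cloud Data Flow.
final class LaunchProperties {
    final Map<String, String> app = new LinkedHashMap<String, String>();
    final Map<String, String> deployer = new LinkedHashMap<String, String>();

    static LaunchProperties parse(String value) {
        LaunchProperties result = new LaunchProperties();
        // Naive split: assumes no property value contains a comma.
        for (String pair : value.split(",")) {
            String trimmed = pair.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value: " + trimmed);
            }
            String key = trimmed.substring(0, eq).trim();
            String val = trimmed.substring(eq + 1).trim();
            if (key.startsWith("app.")) {
                result.app.put(key, val);        // application property
            } else if (key.startsWith("deployer.")) {
                result.deployer.put(key, val);   // deployment property
            } else {
                // Sketch-only choice: reject anything without a known prefix.
                throw new IllegalArgumentException("Unknown prefix: " + key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        LaunchProperties p = parse("deployer.timestamp.foo1=bar1,app.timestamp.foo2=bar2");
        System.out.println("app=" + p.app + " deployer=" + p.deployer);
    }
}
```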
In addition to configuration via the DSL, Spring Cloud Data Flow provides a mechanism for setting properties common to all the task applications that it launches. This can be done by adding properties prefixed with spring.cloud.dataflow.applicationProperties.task when starting the server. The server then passes all of these properties, without the prefix, to the instances it launches.
For example, all the launched applications can be configured to use the properties foo and fizz by launching the Data Flow server with the following options:
--spring.cloud.dataflow.applicationProperties.task.foo=bar --spring.cloud.dataflow.applicationProperties.task.fizz=bar2
This will cause the properties foo=bar and fizz=bar2 to be passed to all the launched applications.
Note | |
---|---|
Properties configured using this mechanism have lower precedence than task deployment properties.
They will be overridden if a property with the same key is specified at task launch time (e.g. |
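The prefix stripping and the precedence rule can be sketched as a simple map merge. CommonTaskProperties is a hypothetical class written for this illustration, and it assumes the launch-time properties have already been reduced to plain keys such as fizz. The server's actual resolution logic is not shown in this section.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Spring Cloud Data Flow.
final class CommonTaskProperties {
    static final String PREFIX = "spring.cloud.dataflow.applicationProperties.task.";

    // Common properties lose their prefix; launch-time properties win on key clashes.
    static Map<String, String> effective(Map<String, String> serverProperties,
                                         Map<String, String> launchProperties) {
        Map<String, String> merged = new LinkedHashMap<String, String>();
        for (Map.Entry<String, String> entry : serverProperties.entrySet()) {
            if (entry.getKey().startsWith(PREFIX)) {
                merged.put(entry.getKey().substring(PREFIX.length()), entry.getValue());
            }
        }
        // Applied last so a launch-time value replaces a common one with the same key.
        merged.putAll(launchProperties);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> server = new LinkedHashMap<String, String>();
        server.put(PREFIX + "foo", "bar");
        server.put(PREFIX + "fizz", "bar2");
        Map<String, String> launch = new LinkedHashMap<String, String>();
        launch.put("fizz", "bar3");
        System.out.println(effective(server, launch));
    }
}
```

In this sketch, foo keeps the server-wide value bar, while fizz takes the launch-time value bar3.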
Once the task is launched the state of the task is stored in a relational DB. The state includes:
A user can check the status of their task executions via the RESTful API or the shell. To display the latest task executions via the shell, use the task execution list command.
To get a list of task executions for just one task definition, add --name and the task definition name, for example task execution list --name foo. To retrieve full details for a task execution, use the task display command with the ID of the task execution, for example task display --id 549.
Destroying a Task Definition removes the definition from the definition repository. This can be done via the RESTful API or the shell. To destroy a task via the shell, use the task destroy command. For example:
dataflow:>task destroy mytask
Destroyed task 'mytask'
The task execution information for previously launched tasks for the definition will remain in the task repository.
Note | |
---|---|
This will not stop any currently executing tasks for this definition; it only removes the task definition from the database. |