Spring Cloud Data Flow Server for Cloud Foundry

Authors

Sabby Anandan, Eric Bottard, Mark Fisher, Ilayaperumal Gopinathan, Gunnar Hillert, Mark Pollack, Thomas Risberg, Marius Bogoevici, Josh Long, Michael Minella

1.1.0.M1

Copies of this document may be made for your own use and for distribution to others, provided that you do not charge any fee for such copies and further provided that each copy contains this Copyright Notice, whether distributed in print or electronically.


Table of Contents

I. Spring Cloud Data Flow for Cloud Foundry
1. Spring Cloud Data Flow
2. Spring Cloud Stream
3. Spring Cloud Task
II. Getting started
4. Deploying on Cloud Foundry
4.1. Provision a Redis service instance on Cloud Foundry
4.2. Provision a Rabbit service instance on Cloud Foundry
4.3. Provision a MySQL service instance on Cloud Foundry
4.4. Download the Spring Cloud Data Flow Server and Shell apps
4.5. Running the Server
4.5.1. Deploying and Running the Server app on Cloud Foundry
Configuring Defaults for Deployed Apps
4.5.2. Running the Server app locally
4.6. Running Tasks
4.7. Running Spring Cloud Data Flow Shell locally
5. Security
6. Application Names and Prefixes
7. Authentication and Cloud Foundry
8. Configuration Reference
8.1. Using Spring Cloud Config Server
9. Application Level Service Bindings
10. A Note About User Provided Services
11. Application Rolling Upgrades
III. Tasks on Cloud Foundry
12. Version Compatibility
13. Tooling
14. Running Task Applications
14.1. Create a Task
14.2. Launch a Task
14.3. View Task Logs
14.4. List Tasks
14.5. List Task Executions
14.6. Destroy a Task
IV. Appendices

Part I. Spring Cloud Data Flow for Cloud Foundry

This project provides support for orchestrating the deployment of Spring Cloud Stream applications to Cloud Foundry.

1. Spring Cloud Data Flow

Spring Cloud Data Flow is a cloud-native programming and operating model for composable data microservices on a structured platform. With Spring Cloud Data Flow, developers can create and orchestrate data pipelines for common use cases such as data ingest, real-time analytics, and data import/export.

The Spring Cloud Data Flow architecture consists of a server that deploys Streams. A future release will also support deploying Tasks. Streams are defined using a DSL or visually through the browser-based designer UI. Streams are based on the Spring Cloud Stream programming model. The sections below provide more information about creating your own custom Streams.

For more details about the core architecture components and the supported features, please review Spring Cloud Data Flow’s core reference guide. There are several samples available for reference.

2. Spring Cloud Stream

Spring Cloud Stream is a framework for building message-driven microservice applications. Spring Cloud Stream builds upon Spring Boot to create standalone, production-grade Spring applications, and uses Spring Integration to provide connectivity to message brokers. It provides opinionated configuration of middleware from several vendors, introducing the concepts of persistent publish-subscribe semantics, consumer groups, and partitions.

For more details about the core framework components and the supported features, please review Spring Cloud Stream’s reference guide.

There’s a rich ecosystem of Spring Cloud Stream Application-Starters that can be used either as standalone data microservice applications or in Spring Cloud Data Flow. For convenience, we have generated RabbitMQ and Apache Kafka variants of these application-starters that are available for use from Maven Repo and Docker Hub as Maven artifacts and Docker images, respectively.

Do you have a requirement to develop custom applications? No problem. Refer to this guide to create custom stream applications. There are several samples available for reference.

3. Spring Cloud Task

Spring Cloud Task makes it easy to create short-lived microservices. We provide capabilities that allow short-lived JVM processes to be executed on demand in a production environment.

For more details about the core framework components and the supported features, please review Spring Cloud Task’s reference guide.

There’s a rich ecosystem of Spring Cloud Task Application-Starters that can be used either as standalone data microservice applications or in Spring Cloud Data Flow. For convenience, the generated application-starters are available for use from Maven Repo. There are several samples available for reference.


Part II. Getting started

4. Deploying on Cloud Foundry

Spring Cloud Data Flow can be used to deploy modules in a Cloud Foundry environment. When doing so, the server application can run either on Cloud Foundry itself or on another installation (e.g. a simple laptop).

The required configuration is the same in either case, and amounts to providing credentials to the Cloud Foundry instance so that the server can spawn applications itself. Any Spring Boot compatible configuration mechanism can be used (passing program arguments, editing configuration files before building the application, using Spring Cloud Config, using environment variables, etc.), although some may prove more practicable than others when running on Cloud Foundry.

[Note]Note

By default, the application registry in Spring Cloud Data Flow’s Cloud Foundry server is empty. This is intentional, to give users the flexibility to choose and register applications as appropriate for the given use-case requirement. Depending on the message binder of choice, users can register either RabbitMQ- or Apache Kafka-based Maven artifacts.

4.1 Provision a Redis service instance on Cloud Foundry

Use cf marketplace to discover which plans are available to you, depending on the details of your Cloud Foundry setup. For example when using Pivotal Web Services:

cf create-service rediscloud 30mb redis

A Redis instance is required for analytics apps, and would typically be bound to such apps when you create an analytics stream using the per-app-binding feature.
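For example, assuming a stream that ends in the out-of-the-box counter sink, you could bind the redis instance created above only to that app at deployment time. A sketch using the per-app service binding property described in Chapter 9 (the stream definition and app name are illustrative):

dataflow:>stream create --name words --definition "http | counter"
dataflow:>stream deploy --name words --properties "app.counter.spring.cloud.deployer.cloudfoundry.services=redis"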

4.2 Provision a Rabbit service instance on Cloud Foundry

Use cf marketplace to discover which plans are available to you, depending on the details of your Cloud Foundry setup. For example when using Pivotal Web Services:

cf create-service cloudamqp lemur rabbit

Rabbit is typically used as a messaging middleware between streaming apps and would be bound to each deployed app thanks to the SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_SERVICES setting (see below).

4.3 Provision a MySQL service instance on Cloud Foundry

Use cf marketplace to discover which plans are available to you, depending on the details of your Cloud Foundry setup. For example when using Pivotal Web Services:

cf create-service p_mysql 100mb my_mysql

An RDBMS is used to persist Data Flow state, such as stream definitions and deployment ids. It can also be used for tasks to persist execution history.
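If you enable tasks (see Section 4.6), you would typically also have this database bound to every launched task app through the task services setting covered in Chapter 8. A minimal sketch, using the my_mysql instance created above:

cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_TASK_SERVICES my_mysql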

4.4 Download the Spring Cloud Data Flow Server and Shell apps

wget http://repo.spring.io/milestone/org/springframework/cloud/spring-cloud-dataflow-server-cloudfoundry/1.1.0.M1/spring-cloud-dataflow-server-cloudfoundry-1.1.0.M1.jar
wget http://repo.spring.io/milestone/org/springframework/cloud/spring-cloud-dataflow-shell/1.1.0.M1/spring-cloud-dataflow-shell-1.1.0.M1.jar

4.5 Running the Server

You can either deploy the server application on Cloud Foundry itself or on your local machine. The following two sections explain each way of running the server.

4.5.1 Deploying and Running the Server app on Cloud Foundry

Push the server application on Cloud Foundry, configure it (see below) and start it.

[Note]Note

You must use a unique name for your app; an app with the same name in the same organization will cause your deployment to fail.

cf push dataflow-server -m 1G --no-start -p spring-cloud-dataflow-server-cloudfoundry-1.1.0.M1.jar
cf bind-service dataflow-server redis
cf bind-service dataflow-server my_mysql
[Important]Important

The recommended minimal memory setting for the server is 1G. Also, to obtain extra runtime information about which properties are available for apps, the server currently downloads those apps (typically Spring Boot uber-jars) to a local Maven repository. As such, you may want to increase the allocated disk size as well.

[Note]Note

If you are pushing to a space with multiple users, for example on PWS, there may already be a route taken for the application name you have chosen. You can use the --random-route option to avoid this when pushing the app.

Now we can configure the app. The following configuration is for Pivotal Web Services. You need to fill in {org}, {space}, {email} and {password} before running these commands.

cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_URL https://api.run.pivotal.io
cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_ORG {org}
cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_SPACE {space}
cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_DOMAIN cfapps.io
cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_SERVICES rabbit
cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_USERNAME {email}
cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_PASSWORD {password}
cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_SKIP_SSL_VALIDATION false
[Warning]Warning

Only set 'Skip SSL Validation' to true if you’re running on a Cloud Foundry instance using self-signed certs (e.g. in development). Do not use for production.

[Note]Note

If you are deploying in an environment that requires you to sign on using the Pivotal Single Sign-On Service, refer to the section Chapter 7, Authentication and Cloud Foundry for information on how to configure the server.

Spring Cloud Data Flow server implementations (be it for Cloud Foundry, Mesos, YARN, or Kubernetes) do not have any default remote Maven repository configured. This is intentional, giving users the flexibility to override and point to a remote repository of their choice. The out-of-the-box applications that are supported by Spring Cloud Data Flow are available in Spring’s repository, so if you want to use them, you must set it as the remote repository, as shown below.

cf set-env dataflow-server MAVEN_REMOTE_REPOSITORIES_REPO1_URL https://repo.spring.io/libs-snapshot

where repo1 is an alias for the remote repository.
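Should you need more than one remote repository, each can be registered under its own alias. A sketch, with illustrative aliases and URLs:

cf set-env dataflow-server MAVEN_REMOTE_REPOSITORIES_REPO1_URL https://repo.spring.io/libs-snapshot
cf set-env dataflow-server MAVEN_REMOTE_REPOSITORIES_REPO2_URL https://repo.spring.io/release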

Configuring Defaults for Deployed Apps

You can also set other optional properties that alter the way Spring Cloud Data Flow will deploy stream and task apps:

  • The default memory and disk sizes for a deployed application can be configured. By default they are 1024 MB memory and 1024 MB disk. To change these (for example, to 512 MB and 2048 MB respectively), use

    cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_MEMORY 512
    cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_DISK 2048
  • The default number of instances to deploy is set to 1, but can be overridden using

    cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_INSTANCES 1
  • You can set the buildpack that will be used to deploy each application. For example, to use the Java offline buildpack, set the following environment variable:

    cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_BUILDPACK java_buildpack_offline
  • The health check mechanism used by Cloud Foundry to determine whether apps are running can be customized. Currently supported options are port (the default) and none. Change the default like so:

    cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_HEALTH_CHECK none
[Note]Note

These settings can be configured separately for stream and task apps. To alter settings for tasks, simply substitute STREAM with TASK in the property name. As an example,

cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_TASK_MEMORY 512
[Tip]Tip

All the properties mentioned above are @ConfigurationProperties of the Cloud Foundry deployer. See CloudFoundryDeploymentProperties.java for more information.

  • If you’d like to use config-server to manage centralized configurations for all the applications orchestrated by Spring Cloud Data Flow, you can set it up as follows.

    cf set-env dataflow-server SPRING_APPLICATION_JSON '{"spring.cloud.dataflow.applicationProperties.stream.spring.cloud.config.uri": "http://<CONFIG_SERVER_URI>"}'

We are now ready to start the app.

cf start dataflow-server

Alternatively, you can run the server application locally on your machine, as described in the next section.

4.5.2 Running the Server app locally

To run the server application locally, targeting your Cloud Foundry installation, you need to configure the application either by passing in command line arguments (see below) or by setting a number of environment variables.

To use environment variables set the following:

export SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_URL=https://api.run.pivotal.io
export SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_ORG={org}
export SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_SPACE={space}
export SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_DOMAIN=cfapps.io
export SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_USERNAME={email}
export SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_PASSWORD={password}
export SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_SKIP_SSL_VALIDATION=false

export SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_SERVICES=rabbit

You need to fill in {org}, {space}, {email} and {password} before running these commands.

[Warning]Warning

Only set 'Skip SSL Validation' to true if you’re running on a Cloud Foundry instance using self-signed certs (e.g. in development). Do not use for production.

Now we are ready to start the server application:

java -jar spring-cloud-dataflow-server-cloudfoundry-1.1.0.M1.jar [--option1=value1] [--option2=value2] [etc.]
[Tip]Tip

Of course, all other parameterization options that were available when running the server on Cloud Foundry are still available. This is particularly true for configuring defaults for applications. Just replace the cf set-env syntax with export.
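For instance, the stream memory default shown earlier would be set locally as:

export SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_MEMORY=512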

4.6 Running Tasks

Tasks are enabled as an experimental feature in the Spring Cloud Data Flow server for Cloud Foundry. To enable running tasks, set the following environment variable:

export SPRING_CLOUD_DATAFLOW_FEATURES_EXPERIMENTAL_TASKSENABLED=true

or pass it as a command line argument when starting the Data Flow server: --spring.cloud.dataflow.features.experimental.tasksEnabled=true
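For example, when running the server locally:

java -jar spring-cloud-dataflow-server-cloudfoundry-1.1.0.M1.jar --spring.cloud.dataflow.features.experimental.tasksEnabled=true

When the server itself runs on Cloud Foundry, the same flag can be set via cf set-env dataflow-server SPRING_CLOUD_DATAFLOW_FEATURES_EXPERIMENTAL_TASKSENABLED true.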

4.7 Running Spring Cloud Data Flow Shell locally

Run the shell and, optionally, target the server application if it is not running on the same host (which will typically be the case if it is deployed on Cloud Foundry, as explained here):

$ java -jar spring-cloud-dataflow-shell-1.1.0.M1.jar
server-unknown:>dataflow config server http://dataflow-server.cfapps.io
Successfully targeted http://dataflow-server.cfapps.io
dataflow:>

By default, the application registry will be empty. If you would like to register all out-of-the-box stream applications built with the RabbitMQ binder in bulk, you can do so with the following command. For more details, review how to register applications.

dataflow:>app import --uri http://bit.ly/stream-applications-rabbit-maven
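Alternatively, you can register a single application at a time with app register. A sketch, where the Maven coordinates and version are illustrative:

dataflow:>app register --name http --type source --uri maven://org.springframework.cloud.stream.app:http-source-rabbit:1.0.2.RELEASE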

You can now use the shell commands to list available applications (source/processors/sink) and create streams. For example:

dataflow:> stream create --name httptest --definition "http | log" --deploy
[Note]Note

You will need to wait until the apps are actually deployed successfully before posting data. Tail the log file for each application to verify that it has started.

Now post some data. The URL will be unique to your deployment; the following is just an example:

dataflow:> http post --target http://dataflow-nonconcentrative-knar-httptest-http.cfapps.io --data "hello world"

Check the log files of the log application to see whether hello world ended up there.

5. Security

By default, the Data Flow server is unsecured and runs on an unencrypted HTTP connection. You can secure your REST endpoints, as well as the Data Flow Dashboard, by enabling HTTPS and requiring clients to authenticate. For more details about securing the REST endpoints and configuring authentication against an OAuth backend (e.g. UAA/SSO running on Cloud Foundry), please review the security section from the core reference guide. The security configurations can be configured in dataflow-server.yml or passed as environment variables through cf set-env commands.

6. Application Names and Prefixes

To help avoid clashes with routes across spaces in Cloud Foundry, a naming strategy to provide a random prefix to a deployed application is available and is enabled by default. The default configurations are overridable and the respective properties can be set via cf set-env commands.

For instance, if you’d like to disable the randomization, you can override it through:

cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_ENABLE_RANDOM_APP_NAME_PREFIX false
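Assuming the STREAM/TASK naming pattern noted in Section 4.5.1 also applies to this property, the task-side equivalent would be:

cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_TASK_ENABLE_RANDOM_APP_NAME_PREFIX false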

7. Authentication and Cloud Foundry

When deploying Spring Cloud Data Flow to Cloud Foundry, you can take advantage of the Spring Cloud Single Sign-On Connector, which provides Cloud Foundry specific auto-configuration support for OAuth 2.0, when used in conjunction with the Pivotal Single Sign-On Service.

Simply set security.basic.enabled to true and, in Cloud Foundry, bind the SSO service to your Data Flow Server app; SSO will then be enabled.
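A minimal sketch of those two steps, assuming an SSO service instance named my-sso (the instance name is illustrative):

cf set-env dataflow-server SECURITY_BASIC_ENABLED true
cf bind-service dataflow-server my-sso
cf restage dataflow-server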

8. Configuration Reference

The following pieces of configuration must be provided. These are Spring Boot @ConfigurationProperties so you can set them as environment variables or by any other means that Spring Boot supports. Here is a listing in environment variable format as that is an easy way to get started configuring Boot applications in Cloud Foundry.

# Default values cited after the equal sign.
# Example values, typical for Pivotal Web Services, cited as a comment

# url of the CF API (used when using cf login -a for example), e.g. https://api.run.pivotal.io
# (for setting env var use SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_URL)
spring.cloud.deployer.cloudfoundry.url=

# name of the organization that owns the space above, e.g. youruser-org
# (For Setting Env var use SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_ORG)
spring.cloud.deployer.cloudfoundry.org=

# name of the space into which modules will be deployed, e.g. development
# (for setting env var use SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_SPACE)
spring.cloud.deployer.cloudfoundry.space=

# the root domain to use when mapping routes, e.g. cfapps.io
# (for setting env var use SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_DOMAIN)
spring.cloud.deployer.cloudfoundry.domain=

# username and password of the user to use to create apps
# (for setting env var use SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_USERNAME and SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_PASSWORD)
spring.cloud.deployer.cloudfoundry.username=
spring.cloud.deployer.cloudfoundry.password=

# Whether to allow self-signed certificates during SSL validation
# (for setting env var use SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_SKIP_SSL_VALIDATION)
spring.cloud.deployer.cloudfoundry.skipSslValidation=false

# Comma separated set of service instance names to bind to every stream app deployed.
# Amongst other things, this should include a service that will be used
# for Spring Cloud Stream binding, e.g. rabbit
# (for setting env var use SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_SERVICES)
spring.cloud.deployer.cloudfoundry.stream.services=

# Health check type to use for stream apps. Accepts 'none' and 'port'
spring.cloud.deployer.cloudfoundry.stream.health-check=


# Comma separated set of service instance names to bind to every task app deployed.
# Amongst other things, this should include an RDBMS service that will be used
# for Spring Cloud Task execution reporting, e.g. my_mysql
# (for setting env var use SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_TASK_SERVICES)
spring.cloud.deployer.cloudfoundry.task.services=

# Timeout to use, in seconds, when doing task related deployments.
# (for setting env var use SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_TASK_TASK_TIMEOUT)
spring.cloud.deployer.cloudfoundry.task.taskTimeout=360

Note that you can set the properties spring.cloud.deployer.cloudfoundry.services, spring.cloud.deployer.cloudfoundry.memory, and spring.cloud.deployer.cloudfoundry.disk as part of an individual deployment request, prefixed by app.<name of application>. For example:

>stream create --name ticktock --definition "time | log"
>stream deploy --name ticktock --properties "app.time.spring.cloud.deployer.cloudfoundry.memory=2048"

will deploy the time source with 2048 MB of memory, while the log sink will use the default 1024 MB.
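Several per-app overrides can also be combined in a single deployment request; a sketch with illustrative values:

>stream deploy --name ticktock --properties "app.time.spring.cloud.deployer.cloudfoundry.memory=2048,app.log.spring.cloud.deployer.cloudfoundry.disk=2048"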

8.1 Using Spring Cloud Config Server

If using Spring Cloud Config Server as a Cloud Foundry service, the easiest way to externalize the above configuration and consume it from the Data Flow server is to use the spring-cloud-services-starter-config-client dependency. As this support is specific to Pivotal Cloud Foundry, it is not included by default. But building a Data Flow server that embeds it is as simple as adding a dependency similar to

<dependency>
    <groupId>io.pivotal.spring.cloud</groupId>
    <artifactId>spring-cloud-services-starter-config-client</artifactId>
    <version>1.1.0.RELEASE</version>
    <scope>runtime</scope>
</dependency>

to your own version of the Data Flow server, and building it yourself. Then follow the documentation for Config Server for Pivotal Cloud Foundry. For more details, please refer to Spring Cloud Services client-dependencies documentation.
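Before starting a server built this way, you would typically create and bind a Config Server service instance. A sketch, where the service name, plan, and instance name are illustrative and depend on your marketplace:

cf create-service p-config-server standard config-server
cf bind-service dataflow-server config-server
cf restage dataflow-server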

9. Application Level Service Bindings

When deploying streams in Cloud Foundry, you can take advantage of application-specific service bindings, so that not every service needs to be globally configured for all the apps orchestrated by Spring Cloud Data Flow.

For instance, if you’d like to provide a mysql service binding only for the jdbc application in the following stream definition, you can pass the service binding as a deployment property.

dataflow:>stream create --name httptojdbc --definition "http | jdbc"
dataflow:>stream deploy --name httptojdbc --properties "app.jdbc.spring.cloud.deployer.cloudfoundry.services=mysqlService"

Here, mysqlService is the name of the service bound only to the jdbc application; the http application does not get the binding by this method. If you have more than one service to bind, they can be passed as comma-separated items (e.g. app.jdbc.spring.cloud.deployer.cloudfoundry.services=mysqlService,someService).

10. A Note About User Provided Services

In addition to marketplace services, Cloud Foundry supports User Provided Services. Throughout this reference manual, regular services have been mentioned, but there is nothing precluding the use of UPSs as well, whether for use as the messaging middleware (e.g. if you’d like to use an external Apache Kafka installation) or for ad hoc usage by some of the stream apps (e.g. an Oracle Database).

11. Application Rolling Upgrades

Similar to Cloud Foundry’s blue-green deployments, you can perform rolling upgrades on the applications orchestrated by Spring Cloud Data Flow.

Let’s start with the following simple stream definition.

dataflow:>stream create --name foo --definition "time | log" --deploy

List Apps.

→ cf apps
Getting apps in org test-org / space development as test@pivotal.io...
OK

name       requested state   instances   memory   disk   urls
foo-log    started           1/1         1G       1G     foo-log.cfapps.io
foo-time   started           1/1         1G       1G     foo-time.cfapps.io

Let’s assume you have to make an enhancement that updates the "logger" to append extra text to every log statement.

  • Download the Log Sink application starter with the "Rabbit binder starter" from start-scs.cfapps.io
  • Load the downloaded project in an IDE
  • Import LogSinkConfiguration.class
  • Adapt the handler to add extra text: loggingHandler.setLoggerName("TEST [" + this.properties.getName() + "]");
  • Build the application locally; the resulting class is shown below
@SpringBootApplication
@Import(LogSinkConfiguration.class)
public class DemoApplication {

	@Autowired
	private LogSinkProperties properties;

	public static void main(String[] args) {
		SpringApplication.run(DemoApplication.class, args);
	}

	@Bean
	@ServiceActivator(inputChannel = Sink.INPUT)
	public LoggingHandler logSinkHandler() {
		LoggingHandler loggingHandler = new LoggingHandler(this.properties.getLevel().name());
		loggingHandler.setExpression(this.properties.getExpression());
		loggingHandler.setLoggerName("TEST [" + this.properties.getName() + "]");
		return loggingHandler;
	}
}

Let’s deploy the locally built application to Cloud Foundry

→ cf push foo-log-v2 -p demo-0.0.1-SNAPSHOT.jar -n foo-log-v2 --no-start

List Apps.

→ cf apps
Getting apps in org test-org / space development as test@pivotal.io...
OK

name       requested state   instances   memory   disk   urls
foo-log    started           1/1         1G       1G     foo-log.cfapps.io
foo-time   started           1/1         1G       1G     foo-time.cfapps.io
foo-log-v2 stopped           1/1         1G       1G     foo-log-v2.cfapps.io

The stream applications do not communicate via the (Go)Router, so they aren’t generating HTTP traffic. Instead, they communicate via the underlying messaging middleware such as Kafka or RabbitMQ. In order for the rolling upgrade to route the payload from the old to the new version of the application, you have to replicate the SPRING_APPLICATION_JSON environment variable from the old application, which includes the spring.cloud.stream.bindings.input.destination and spring.cloud.stream.bindings.input.group settings.

[Note]Note

You can find the SPRING_APPLICATION_JSON of the old application via cf env foo-log.

cf set-env foo-log-v2 SPRING_APPLICATION_JSON '{"spring.cloud.stream.bindings.input.destination":"foo.time","spring.cloud.stream.bindings.input.group":"foo"}'

Let’s start the foo-log-v2 application.

cf start foo-log-v2

As soon as the application bootstraps, you’ll notice the payload being load-balanced between the two log application instances running on Cloud Foundry. Since they both share the same "destination" and "consumer group", they are now acting as competing consumers.

Old App Logs:

2016-08-08T17:11:08.94-0700 [APP/0]      OUT 2016-08-09 00:11:08.942  INFO 19 --- [ foo.time.foo-1] log.sink                                 : 08/09/16 00:11:08
2016-08-08T17:11:10.95-0700 [APP/0]      OUT 2016-08-09 00:11:10.954  INFO 19 --- [ foo.time.foo-1] log.sink                                 : 08/09/16 00:11:10
2016-08-08T17:11:12.94-0700 [APP/0]      OUT 2016-08-09 00:11:12.944  INFO 19 --- [ foo.time.foo-1] log.sink                                 : 08/09/16 00:11:12

New App Logs:

2016-08-08T17:11:07.94-0700 [APP/0]      OUT 2016-08-09 00:11:07.945  INFO 26 --- [ foo.time.foo-1] TEST [log.sink                       : 08/09/16 00:11:07]
2016-08-08T17:11:09.92-0700 [APP/0]      OUT 2016-08-09 00:11:09.925  INFO 26 --- [ foo.time.foo-1] TEST [log.sink                       : 08/09/16 00:11:09]
2016-08-08T17:11:11.94-0700 [APP/0]      OUT 2016-08-09 00:11:11.941  INFO 26 --- [ foo.time.foo-1] TEST [log.sink                       : 08/09/16 00:11:11]

Deleting the old version, foo-log, via the CF CLI causes all the payload to be consumed by the foo-log-v2 application. You have now successfully upgraded an application in the streaming pipeline without bringing it down entirely to make an adjustment.

List Apps.

→ cf apps
Getting apps in org test-org / space development as test@pivotal.io...
OK

name       requested state   instances   memory   disk   urls
foo-time   started           1/1         1G       1G     foo-time.cfapps.io
foo-log-v2 started           1/1         1G       1G     foo-log-v2.cfapps.io
[Note]Note

A comprehensive canary analysis along with rolling upgrades will be supported via Spinnaker in future releases.


Part III. Tasks on Cloud Foundry

Spring Cloud Data Flow’s task functionality exposes new experimental capabilities within the Pivotal Cloud Foundry runtime. It’s important to note that the current underlying PCF capabilities are considered experimental, and therefore this functionality within Spring Cloud Data Flow is also considered experimental.

12. Version Compatibility

The task functionality depends on the latest versions of PCF for runtime support. This release requires PCF version 1.7.12 or higher to run tasks.

13. Tooling

Because the task functionality is currently considered experimental within PCF, the tooling around it within the CF ecosystem is not complete. In order to interact with tasks via the PCF command line interface (CLI), you need to install a plugin: v3-cli-plugin. It’s important to note that this plugin is only compatible with the PCF CLI version 6.17.0+5d0be0a-2016-04-15. You can read more about the functionality the plugin provides in its README.

It’s also important to note that there is no Apps Manager support for tasks as of this release. When running applications as tasks through Spring Cloud Data Flow, the only way to view them within the context of CF is via the plugin mentioned above.

14. Running Task Applications

Running a task application within Spring Cloud Data Flow goes through a slightly different lifecycle than running a stream application. Both types of applications need to be registered with the appropriate artifact coordinates. Both need a definition created via the SCDF DSL. However, that’s where the similarities end.

With stream based applications, you "deploy" them with the intent that they run until they are undeployed. A stream definition is only deployed once (it can be scaled, but only deployed as one instance of the stream as a whole). Tasks, however, are launched. A single task definition can be launched many times. With each launch, the task starts, executes, and shuts down, with PCF cleaning up the resources once the shutdown has occurred. The following sections outline the process of creating, launching, destroying, and viewing tasks.

14.1 Create a Task

Similar to streams, creating a task application is done via the SCDF DSL or through the dashboard. To create a task definition in SCDF, you have to either develop a task application or use one of the out-of-the-box task app-starters. The Maven coordinates of the task application should be registered in SCDF. For more details on how to register task applications, review the register task applications section from the core docs.

Let’s see an example that uses the out-of-the-box timestamp task application.

dataflow:>task create --name foo --definition "timestamp"
Created new task 'foo'
[Note]Note

Tasks in SCDF do not require explicit deployment. Instead, they are launched, and there are different ways to launch them; refer to this section for more details.

14.2 Launch a Task

Unlike streams, a task in SCDF requires an explicit launch trigger, or it can be manually kicked off.

dataflow:>task launch foo
Launched task 'foo'
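If the task application accepts command line arguments, they can be supplied at launch time; a sketch, assuming the timestamp app's format option:

dataflow:>task launch foo --arguments "--timestamp.format=yyyy-MM-dd"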

14.3 View Task Logs

As previously mentioned, the v3-cli-plugin is the way to interact with tasks on PCF, including viewing the logs. In order to view the logs as a task is executing, use the following command, where foo is the name of the task you are executing:

cf v3-logs foo
Tailing logs for app foo...

....
....
....
....

2016-08-19T09:44:49.11-0700 [APP/TASK/bar1/0]OUT 2016-08-19 16:44:49.111  INFO 7 --- [           main] o.s.c.t.a.t.TimestampTaskApplication     : Started TimestampTaskApplication in 2.734 seconds (JVM running for 3.288)
2016-08-19T09:44:49.13-0700 [APP/TASK/bar1/0]OUT Exit status 0
2016-08-19T09:44:49.19-0700 [APP/TASK/bar1/0]OUT Destroying container
2016-08-19T09:44:50.41-0700 [APP/TASK/bar1/0]OUT Successfully destroyed container
[Note]Note

Logs are only viewable through the v3-cli-plugin as the app is running. Historic logs are not available.

14.4 List Tasks

Listing tasks is as simple as:

dataflow:>task list
╔══════════════════════╤═════════════════════════╤═══════════╗
║      Task Name       │     Task Definition     │Task Status║
╠══════════════════════╪═════════════════════════╪═══════════╣
║foo                   │timestamp                │complete   ║
╚══════════════════════╧═════════════════════════╧═══════════╝

14.5 List Task Executions

If you’d like to view the execution details of the launched task, you could do the following.

dataflow:>task execution list
╔════════════════════════╤══╤═════════════════════════╤═════════════════════════╤════════╗
║       Task Name        │ID│       Start Time        │        End Time         │  Exit  ║
║                        │  │                         │                         │  Code  ║
╠════════════════════════╪══╪═════════════════════════╪═════════════════════════╪════════╣
║foo:cloud:              │1 │ Fri Aug 19 09:44:49 PDT │Fri Aug 19 09:44:49 PDT  │0       ║
╚════════════════════════╧══╧═════════════════════════╧═════════════════════════╧════════╝

14.6 Destroy a Task

Destroying the task application from SCDF removes the task definition from the task repository.

dataflow:>task destroy foo
Destroyed task 'foo'
dataflow:>task list
╔═════════╤═══════════════╤═══════════╗
║Task Name│Task Definition│Task Status║
╚═════════╧═══════════════╧═══════════╝


Part IV. Appendices
