24. Introduction

Streams are a collection of long-lived Spring Cloud Stream applications that communicate with each other over messaging middleware. A text-based DSL defines the configuration and data flow between the applications. While many applications are provided to implement common use cases, you will typically create a custom Spring Cloud Stream application to implement custom business logic.

The general lifecycle of a Stream is:

  1. Register applications
  2. Create a Stream Definition
  3. Deploy the Stream
  4. Undeploy or Destroy the Stream

There are two options for deploying streams:

  1. Use a Data Flow Server implementation that deploys to a single platform.
  2. Configure the Data Flow Server to delegate deployment to a new server in the Spring Cloud ecosystem named Skipper.

When using the first option, you can use the Data Flow Server for Cloud Foundry to deploy streams to a single org and space on Cloud Foundry. Alternatively, you can use Data Flow for Kubernetes to deploy streams to a single namespace on a Kubernetes cluster. See here for a list of implementations.

When using the second option, you can configure Skipper to deploy applications to one or more Cloud Foundry org/spaces, to one or more namespaces on a Kubernetes cluster, and to the local machine. When deploying a stream in Data Flow using Skipper, you can specify which platform to use. Skipper also provides Data Flow with the ability to perform updates to deployed streams. There are many ways the applications in a stream can be updated, but one of the most common examples is upgrading a processor application with new custom business logic while leaving the existing source and sink applications alone.
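When Skipper manages more than one platform, the target platform can be selected at deployment time through a deployment property. The following command is a sketch that assumes a Skipper platform account named cf-dev has already been configured:

dataflow:> stream deploy --name mystream --properties "spring.cloud.dataflow.skipper.platformName=cf-dev"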

24.1 Stream Pipeline DSL

A stream is defined by using a Unix-inspired pipeline syntax. The syntax uses vertical bars, also known as "pipes", to connect multiple commands. The command ls -l | grep key | less in Unix takes the output of the ls -l process and pipes it to the input of the grep key process. The output of grep is in turn sent to the input of the less process. Each | symbol connects the standard output of the command on the left to the standard input of the command on the right. Data flows through the pipeline from left to right.

In Data Flow, the Unix command is replaced by a Spring Cloud Stream application and each pipe symbol represents connecting the input and output of applications via messaging middleware, such as RabbitMQ or Apache Kafka.

Each Spring Cloud Stream application is registered under a simple name. The registration process specifies where the application can be obtained, for example in a Maven Repository or a Docker registry. You can find out more information on how to register Spring Cloud Stream applications in this section. In Data Flow, we classify the Spring Cloud Stream applications as either Sources, Processors, or Sinks.
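For example, the http source can be registered under its simple name by pointing at a Maven coordinate. The coordinate below is illustrative; use the artifact and version that match your installation:

dataflow:> app register --name http --type source --uri maven://org.springframework.cloud.stream.app:http-source-rabbit:1.3.1.RELEASE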

As a simple example, consider the collection of data from an HTTP Source writing to a File Sink. Using the DSL, the stream description is:

http | file

A stream that involves some processing would be expressed as:

http | filter | transform | file
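The processor applications in such a pipeline are typically configured with a SpEL expression. As an illustrative sketch, the filter and transform applications accept an --expression property (the expressions below are examples, not defaults):

dataflow:> stream create --name processedIngest --definition "http | filter --expression=payload.contains('spring') | transform --expression=payload.toUpperCase() | file"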

Stream definitions can be created using the shell’s create stream command. For example:

dataflow:> stream create --name httpIngest --definition "http | file"

The Stream DSL is passed in to the --definition command option.

The deployment of stream definitions is done via the shell’s stream deploy command.

dataflow:> stream deploy --name ticktock

The Getting Started section shows you how to start the server and how to start and use the Spring Cloud Data Flow shell.

Note that the shell is calling the Data Flow Server's REST API. For more information on making HTTP requests directly to the server, consult the REST API Guide.
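For example, the shell's stream create command corresponds to a POST request against the /streams/definitions endpoint. The following curl invocation is a sketch that assumes a Data Flow Server running on the default port 9393:

curl -X POST -d "name=httpIngest&definition=http | file" http://localhost:9393/streams/definitions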

24.2 Application properties

Each application takes properties to customize its behavior. As an example, the http source application exposes a port property that allows the data ingestion port to be changed from the default value.

dataflow:> stream create --definition "http --port=8090 | log" --name myhttpstream

This port property is actually the same as the standard Spring Boot server.port property. Data Flow adds the ability to use the shorthand form port instead of server.port. You can also specify the longhand form:

dataflow:> stream create --definition "http --server.port=8000 | log" --name myhttpstream

This shorthand behavior is discussed further in Section 25.2.1, “Whitelisting application properties”. If you have registered application property metadata, you can use tab completion in the shell after typing -- to get a list of candidate property names.

The shell provides tab completion for application properties, and the shell command app info <appType>:<appName> provides additional documentation for all the supported properties.
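For example, to display the documented properties of the http source:

dataflow:> app info source:http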

[Note]

Supported Stream <appType> values are: source, processor, and sink.