Old | New |
---|---|
XD-Admin | Server (implementations: local, Cloud Foundry, Apache YARN, Kubernetes, and Apache Mesos) |
XD-Container | N/A |
Modules | Applications |
Admin UI | Dashboard |
Message Bus | Binders |
Batch / Job | Task |
If you have custom Spring XD modules, you will have to refactor them to use Spring Cloud Stream and Spring Cloud Task annotations, update their dependencies, and build them as normal Spring Boot "applications".
http, file, or as hdfs coordinates
counter-sink: redis is not required in Spring Cloud Data Flow. If you intend to use the counter-sink, then redis becomes required, and you’re expected to have your own running redis cluster
field-value-counter-sink: redis is not required in Spring Cloud Data Flow. If you intend to use the field-value-counter-sink, then redis becomes required, and you’re expected to have your own running redis cluster
aggregate-counter-sink: redis is not required in Spring Cloud Data Flow. If you intend to use the aggregate-counter-sink, then redis becomes required, and you’re expected to have your own running redis cluster
Terminology-wise, in Spring Cloud Data Flow, the message bus implementation is commonly referred to as binders.
Similar to Spring XD, there’s an abstraction available to extend the binder interface. By default, we take the opinionated view that Apache Kafka and RabbitMQ are the production-ready binders; both are available as GA releases.
Selecting a binder is as simple as providing the right binder dependency in the classpath. If you choose Kafka as the binder, you’d register stream applications that are pre-built with the Kafka binder. If you were to create a custom application with the Kafka binder, you’d add the following dependency to the classpath.
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-stream-binder-kafka</artifactId>
    <version>1.0.2.RELEASE</version>
</dependency>
Fundamentally, all the messaging channels are backed by pub/sub semantics. Unlike Spring XD, the messaging channels are backed only by topics or topic-exchange and there’s no representation of queues in the new architecture.
${xd.module.index} is not supported anymore; instead, you can directly interact with named destinations
stream.index changes to :<stream-name>.<label/app-name>
  ticktock.0 changes to :ticktock.time
“topic/queue” prefixes are not required to interact with named-channels
  topic:foo changes to :foo
  stream create stream1 --definition ":foo > log"
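The stream.index renaming rule above can be sketched in a few lines of Python. This is only an illustration and not part of Spring Cloud Data Flow: the helper name `named_destination` is hypothetical, the definition is split naively on `|`, and the `label: app` form used to detect labels is an assumption about how labels appear in a definition.

```python
def named_destination(stream_name: str, definition: str, index: int) -> str:
    """Return the destination name for the app at `index` in a stream.

    Hypothetical helper mirroring: stream.index -> :<stream-name>.<label/app-name>
    """
    # Naive split: assumes no '|' characters appear inside option values.
    segments = [segment.strip() for segment in definition.split("|")]
    first_token = segments[index].split()[0]
    # Assumed label form "label: app"; a label takes precedence over the app name.
    name = first_token.split(":", 1)[0] if ":" in first_token else first_token
    return f":{stream_name}.{name}"


print(named_destination("ticktock", "time | log", 0))  # :ticktock.time
```

A real definition can contain quoted expressions with `|` in them, so a production tool would need a proper DSL parser rather than a string split.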
If you’re building non-linear streams, you could take advantage of named destinations to build directed graphs.
For instance, in Spring XD:
stream create f --definition "queue:foo > transform --expression=payload+'-foo' | log" --deploy
stream create b --definition "queue:bar > transform --expression=payload+'-bar' | log" --deploy
stream create r --definition "http | router --expression=payload.contains('a')?'queue:foo':'queue:bar'" --deploy
For instance, in Spring Cloud Data Flow:
stream create f --definition ":foo > transform --expression=payload+'-foo' | log" --deploy
stream create b --definition ":bar > transform --expression=payload+'-bar' | log" --deploy
stream create r --definition "http | router --expression=payload.contains('a')?'foo':'bar'" --deploy
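The difference between the two sets of definitions is mechanical, so it may help to see it as a small rewrite. The following Python sketch is a hypothetical migration aid (the function name `migrate_definition` and both regular expressions are assumptions, not anything shipped with either product). It drops the prefix from quoted channel names, as in the router expression, and keeps only the leading colon for unquoted ones.

```python
import re

# Quoted channel names (e.g. inside a router expression) lose the prefix entirely.
_QUOTED = re.compile(r"'(?:queue|topic):([\w.-]+)'")
# Unquoted channel names (source or sink position) keep only the leading colon.
_BARE = re.compile(r"(?<![\w'])(?:queue|topic):([\w.-]+)")


def migrate_definition(xd_definition: str) -> str:
    """Rewrite Spring XD named channels into Data Flow named destinations."""
    without_quoted_prefixes = _QUOTED.sub(r"'\1'", xd_definition)
    return _BARE.sub(r":\1", without_quoted_prefixes)


print(migrate_definition("queue:foo > transform --expression=payload+'-foo' | log"))
print(migrate_definition("http | router --expression=payload.contains('a')?'queue:foo':'queue:bar'"))
```

The sketch only touches channel names. Module options that changed between the two products still need to be reviewed by hand.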
A Task, by definition, is any application that does not run forever; it ends/stops at some point. This includes Spring Batch jobs. Task applications are mainly used for on-demand use-cases such as database migration, machine learning, scheduled operations, etc. Using Spring Cloud Task, users can build Spring Batch jobs as microservice applications.
Old Command | New Command |
---|---|
module upload | app register / app import |
module list | app list |
module info | app info |
admin config server | dataflow config server |
job create | task create |
job launch | task launch |
job list | task list |
job status | task status |
job display | task display |
job destroy | task destroy |
job execution list | task execution list |
runtime modules | runtime apps |
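When porting existing shell scripts, the command table above can be applied as a simple prefix substitution. The Python sketch below is a hypothetical helper, not a tool provided by Spring Cloud Data Flow. It renames only the command prefix and leaves arguments untouched, so options that differ between the two shells still need manual review. Where the table offers two targets (`app register / app import`), `app register` is picked as an arbitrary default.

```python
# Old-to-new shell command prefixes, taken from the table above.
# "module upload" maps to either "app register" or "app import";
# "app register" is chosen here as an arbitrary default.
COMMAND_MAP = {
    "module upload": "app register",
    "module list": "app list",
    "module info": "app info",
    "admin config server": "dataflow config server",
    "job create": "task create",
    "job launch": "task launch",
    "job list": "task list",
    "job status": "task status",
    "job display": "task display",
    "job destroy": "task destroy",
    "job execution list": "task execution list",
    "runtime modules": "runtime apps",
}


def translate_command(line: str) -> str:
    """Swap a Spring XD command prefix for its Data Flow counterpart."""
    stripped = line.strip()
    # Try longer prefixes first so the most specific entry wins.
    for old in sorted(COMMAND_MAP, key=len, reverse=True):
        if stripped == old or stripped.startswith(old + " "):
            return COMMAND_MAP[old] + stripped[len(old):]
    return stripped  # unknown commands pass through unchanged


print(translate_command("job launch myjob"))  # task launch myjob
```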
Old API | New API |
---|---|
/modules | /apps |
/runtime/modules | /runtime/apps |
/runtime/modules/{moduleId} | /runtime/apps/{appId} |
/jobs/definitions | /task/definitions |
/jobs/deployments | /task/deployments |
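Clients that call the REST API directly can apply the path changes above without touching the host or query string. The following is a minimal sketch using only the Python standard library. The function name `migrate_url` and the sample URL (including its `ticktock.time` id and `page` parameter) are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Old-to-new path prefixes from the table above, most specific first.
PATH_PREFIXES = [
    ("/runtime/modules", "/runtime/apps"),
    ("/jobs/definitions", "/task/definitions"),
    ("/jobs/deployments", "/task/deployments"),
    ("/modules", "/apps"),
]


def migrate_url(url: str) -> str:
    """Rewrite only the path of a Spring XD REST URL; host and query are kept."""
    parts = urlsplit(url)
    for old, new in PATH_PREFIXES:
        if parts.path == old or parts.path.startswith(old + "/"):
            new_path = new + parts.path[len(old):]
            return urlunsplit(parts._replace(path=new_path))
    return url  # paths not listed in the table are returned unchanged


print(migrate_url("http://localhost:9393/runtime/modules/ticktock.time?page=0"))
```

Only the path prefix is mapped here. Request and response payloads may also differ between the two APIs and are outside the scope of this sketch.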
The Admin-UI is now named Dashboard. The URI for accessing the Dashboard has changed from localhost:9393/admin-ui to localhost:9393/dashboard
xd-container is gone, replaced by out-of-the-box applications running as autonomous Spring Boot applications. The Runtime tab displays the applications running in the runtime platforms (implementations: Cloud Foundry, Apache YARN, Apache Mesos, or Kubernetes). You can click on each application to review relevant details, such as where it is running and what resources it is using.
(New) Tasks:
Spring Cloud Data Flow comes with a significantly simplified architecture. In fact, when compared with Spring XD, fewer peripherals are necessary to operationalize Spring Cloud Data Flow.
Spring Cloud Data Flow uses an RDBMS instead of Redis for stream/task definitions, application registration, and for job repositories. The default configuration uses an embedded H2 instance, but Oracle, DB2, SqlServer, MySQL/MariaDB, PostgreSQL, H2, and HSQLDB databases are supported. To use Oracle, DB2, and SqlServer, you will need to create your own Data Flow Server using Spring Initializr and add the appropriate JDBC driver dependency.
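Because the server is a Spring Boot application, one plausible way to point it at an external database is through the standard Spring Boot `spring.datasource.*` properties passed on the command line. The Python sketch below only assembles such a command. The jar name, JDBC URL, and credentials are placeholders, and `datasource_args` is a hypothetical helper.

```python
def datasource_args(url: str, username: str, password: str, driver: str) -> list[str]:
    """Build Spring Boot command-line arguments for an external database."""
    return [
        f"--spring.datasource.url={url}",
        f"--spring.datasource.username={username}",
        f"--spring.datasource.password={password}",
        f"--spring.datasource.driver-class-name={driver}",
    ]


# Placeholder jar name and credentials; substitute your own.
command = ["java", "-jar", "dataflow-server.jar"] + datasource_args(
    url="jdbc:mysql://localhost:3306/dataflow",
    username="dataflow",
    password="secret",
    driver="org.mariadb.jdbc.Driver",
)
print(" ".join(command))
```

Consult the reference guide for the driver and URL format that match your database and server version.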
Running a Redis cluster is only required for analytics functionality. Specifically, when the counter-sink, field-value-counter-sink, or aggregate-counter-sink applications are used, you are expected to also have a running Redis cluster.
Spring XD’s xd-admin and xd-container server components are replaced by stream and task applications themselves running as autonomous Spring Boot applications. The applications run natively on various platforms including Cloud Foundry, Apache YARN, Apache Mesos, or Kubernetes. You can develop, test, deploy, scale up or down, and interact with (Spring Boot) applications individually, and they can evolve in isolation.
To support centralized and consistent management of an application’s configuration properties, Spring Cloud Config client libraries have been included in the Spring Cloud Data Flow server as well as the Spring Cloud Stream applications provided by the Spring Cloud Stream App Starters. You can also pass common application properties to all streams when the Data Flow Server starts.
Spring Cloud Data Flow is a Spring Boot application. Depending on the platform of your choice, you can download the respective release uber-jar and deploy/push it to the runtime platform (Cloud Foundry, Apache YARN, Kubernetes, or Apache Mesos). For example, if you’re running Spring Cloud Data Flow on Cloud Foundry, you’d download the Cloud Foundry server implementation and do a cf push as explained in the reference guide.
The hdfs-sink application builds upon the Spring Hadoop 2.4.0 release, so this application is compatible with the following Hadoop distributions.
Spring Cloud Data Flow can be deployed and used with Apache YARN in two different ways.
Let’s review some use-cases to compare and contrast the differences between Spring XD and Spring Cloud Data Flow.
(It is assumed both XD and SCDF distributions are already downloaded)
Description: Simple ticktock example using local/singlenode.
Spring XD | Spring Cloud Data Flow |
---|---|
Start | Start a binder of your choice Start |
Start | Start |
Create | Create |
Review | Review |
(It is assumed both XD and SCDF distributions are already downloaded)
Description: Stream with custom module/application.
Spring XD | Spring Cloud Data Flow |
---|---|
Start | Start a binder of your choice Start |
Start | Start |
Register custom “processor” module to transform payload to a desired format | Register custom “processor” application to transform payload to a desired format |
Create a stream with custom module | Create a stream with custom application |
Review results in the | Review results by tailing the |
(It is assumed both XD and SCDF distributions are already downloaded)
Description: Simple batch-job.
Spring XD | Spring Cloud Data Flow |
---|---|
Start | Start |
Start | Start |
Register custom “batch-job” module | Register custom “batch-job” as task application |
Create a job with custom batch-job module | Create a task with custom batch-job application |
Deploy job | NA |
Launch job | Launch task |
Review results in the | Review results by tailing the |