3. Introducing Spring Cloud Data Flow

Spring Cloud Data Flow is a cloud-native data framework that unifies stream and batch processing for data microservices, whether they run in the cloud or on-premises. It lets developers create, orchestrate, and refactor data pipelines through a single programming model for common use cases such as data ingestion, real-time analytics, and data import/export.

Spring Cloud Data Flow defines best practices for distributed stream and batch data processing.

Spring Cloud Data Flow is the cloud-native redesign of Spring XD, a project that aimed to simplify Big Data application development. The redesign lets stream and batch applications run as data microservices that can evolve independently of one another.

3.1 Features

  • Orchestrate applications across a variety of distributed runtime platforms, including Cloud Foundry, Lattice, and Apache YARN
  • Separate runtime dependencies, backed by Spring profiles
  • Consume stream and batch microservices as Maven dependencies and push them to production
  • Develop using the DSL, Shell, REST APIs, Admin UI, and Flo
  • Take advantage of metrics, health checks, and remote management functionality
  • Scale stream and batch pipelines without interrupting data flows
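
To give a feel for the DSL mentioned in the list above: a stream is written as a chain of application names joined by the pipe symbol, for example "time | log". The Java sketch below is a toy illustration of that idea only. It is not Spring Cloud Data Flow's parser, and the class name PipelineSketch, its methods, and the source/processor/sink labelling by position are assumptions made for this example. It also ignores application options and quoting.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Spring Cloud Data Flow.
public class PipelineSketch {

    // Splits a pipe-delimited definition such as "time | log" into application names.
    static List<String> parse(String definition) {
        List<String> apps = new ArrayList<>();
        for (String part : definition.split("\\|")) {
            String name = part.trim();
            if (name.isEmpty()) {
                throw new IllegalArgumentException("Empty application name in: " + definition);
            }
            apps.add(name);
        }
        return apps;
    }

    // Labels each application by position (an assumption of this sketch):
    // first = source, last = sink, anything in between = processor.
    static String describe(String definition) {
        List<String> apps = parse(definition);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < apps.size(); i++) {
            String role = (i == 0) ? "source"
                    : (i == apps.size() - 1) ? "sink" : "processor";
            if (i > 0) {
                sb.append(" -> ");
            }
            sb.append(apps.get(i)).append(" (").append(role).append(")");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: time (source) -> log (sink)
        System.out.println(describe("time | log"));
    }
}
```

In the real system each name in the chain refers to a separately deployed application, which is what allows the pieces of a pipeline to be scaled and evolved independently.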