6. Deploying on YARN

The Admin server application is run as a standalone application. All modules used for streams and tasks will be deployed on the YARN cluster that is targeted by the Admin server. configured to be used.

  1. download the Spring Cloud Data Flow YARN distribution ZIP file which includes the Admin and the Shell apps:
wget http://repo.spring.io/milestone/org/springframework/cloud/dist/spring-cloud-dataflow-admin-yarn-dist/1.0.0.M1/spring-cloud-dataflow-admin-yarn-dist-1.0.0.M1.zip

Unzip the distribution ZIP file and change to the directory containing the deployment files.

cd spring-cloud-dataflow-admin-yarn-1.0.0.M1
  1. Make sure Hadoop and Redis are running. If either one is not running on localhost you need to configure them in config/servers.yml
  1. If this is the first time deploying make sure the user that runs the Admin app has rights to create and write to /dataflow directory. If there is an existing deployment on hdfs remove it using:
$ hdfs dfs -rm -R /dataflow
  1. start the Spring Cloud Data Flow Admin app for YARN
$ ./bin/dataflow-admin-yarn
  1. start spring-cloud-dataflow-shell
$ ./bin/dataflow-shell
  1. Test the deployment

Create a stream:

dataflow:>stream create --name "ticktock" --definition "time | hdfs --rollover=100" --deploy

List streams:

dataflow:>stream list
  Stream Name  Stream Definition           Status
  -----------  --------------------------  --------
  ticktock     time | hdfs --rollover=100  deployed

After some time, destroy the stream:

dataflow:>stream destroy --name ticktock

The YARN application is pushed and started automatically during a stream deployment process. Once all streams are destroyed the YARN application will exit.