In this section we will install the Spring Cloud Data Flow Server on a Kubernetes cluster. Spring Cloud Data Flow depends on the availability of a few supporting services. For example, we need an RDBMS service for the app registry, stream/task repositories, and task management. For streaming pipelines, we also need a transport option such as Apache Kafka or Rabbit MQ. In addition, we need a Redis service if the analytics features are in use.
> **Important:** This guide describes setting up an environment for testing Spring Cloud Data Flow on Google Kubernetes Engine and is not meant to be a definitive guide for setting up a production environment. Feel free to adjust the suggestions to fit your test set-up. Remember that a production environment requires much more consideration for persistent storage of message queues, high availability, security, etc.
> **Note:** Currently, only apps registered with a `--uri` property pointing to a Docker resource are supported by the Data Flow Server for Kubernetes. Note that we do support Maven resources for the `--metadata-uri` property. E.g. the below app registration is valid:
>
> `dataflow:>app register --type source --name time --uri docker://springcloudstream/time-source-rabbit:1.3.0.RELEASE --metadata-uri maven://org.springframework.cloud.stream.app:time-source-rabbit:jar:metadata:1.3.0.RELEASE`
>
> but any app registered with a Maven, HTTP or File resource for the executable jar (using a `--uri` property prefixed with `maven://`, `http://` or `file://`) is not supported.
The Kubernetes Picking the Right Solution guide describes many deployment options, so you can pick the one you are most comfortable using.
All our testing is done using the Google Kubernetes Engine, which is part of the Google Cloud Platform. That is also the target platform for this section. We have also successfully deployed using Minikube, and we will note where you need to adjust for deploying on Minikube.
> **Note:** When starting Minikube you should allocate some extra resources, since we will be deploying several services. For example, `minikube start --cpus=4 --memory=4096` allocates four CPUs and 4 GB of memory; adjust these values to fit your machine.
The rest of this getting started guide assumes that you have a working Kubernetes cluster and the `kubectl` command line utility. See the docs for installation instructions: Installing and Setting up kubectl.
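Before deploying anything, it is worth confirming that `kubectl` can actually reach your cluster. A quick sanity check, assuming your kubeconfig already points at the intended cluster:

```
# Show client and server versions; getting a server version back confirms connectivity
kubectl version

# List the worker nodes; they should report a Ready status
kubectl get nodes
```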
Get the Kubernetes configuration files.
There are sample deployment and service YAML files in the https://github.com/spring-cloud/spring-cloud-dataflow-server-kubernetes repository that you can use as a starting point. They have the required metadata set for service discovery by the different apps and services deployed. To check out the code enter the following commands:
$ git clone https://github.com/spring-cloud/spring-cloud-dataflow-server-kubernetes
$ cd spring-cloud-dataflow-server-kubernetes
$ git checkout master
Deploy Rabbit MQ.
The Rabbit MQ service will be used for messaging between apps in the stream. You could also use Kafka, but to keep things simple this guide only shows the Rabbit MQ configuration.
Run the following command to start the Rabbit MQ service:
$ kubectl create -f src/kubernetes/rabbitmq/
You can use the command `kubectl get all -l app=rabbitmq` to verify that the deployment, pod, and service resources are running. Use the command `kubectl delete all -l app=rabbitmq` to clean up afterwards.
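If you want to block until Rabbit MQ is actually up before moving on, you can wait for the rollout to complete. A minimal sketch, assuming the Deployment created by those YAML files is named `rabbitmq`:

```
# Wait until the Rabbit MQ deployment reports all replicas available
kubectl rollout status deployment rabbitmq

# Double-check that the pod reached the Running state
kubectl get pods -l app=rabbitmq
```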
Deploy MySQL.
We are using MySQL for this guide, but you could use a Postgres or H2 database instead. We include JDBC drivers for all three of these databases; you would just have to adjust the database URL and driver class name settings, as sketched below.
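For example, if you decide to use Postgres, one way to point the server at it is through Spring Boot's standard datasource environment variables. A sketch for once the Data Flow server (deployed later in this guide) is running; the `postgres` host name and `dataflow` database name are assumptions for illustration:

```
# Override the JDBC URL and driver on the server deployment
# (relies on Spring Boot's relaxed binding of environment variables)
kubectl set env deployment/scdf-server \
  SPRING_DATASOURCE_URL=jdbc:postgresql://postgres:5432/dataflow \
  SPRING_DATASOURCE_DRIVER_CLASS_NAME=org.postgresql.Driver
```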
> **Important:** You can modify the password in the `src/kubernetes/mysql/mysql-secrets.yaml` file if you want something more secure. Remember that secrets have to be provided base64 encoded.
Run the following command to start the MySQL service:
$ kubectl create -f src/kubernetes/mysql/
You can use the command `kubectl get all -l app=mysql` to verify that the deployment, pod, and service resources are running. Use the command `kubectl delete all,pvc,secrets -l app=mysql` to clean up afterwards.
Deploy Redis.
The Redis service will be used for the analytics functionality. Run the following command to start the Redis service:
$ kubectl create -f src/kubernetes/redis/
> **Note:** If you don't need the analytics functionality you can turn this feature off by setting the feature toggle `SPRING_CLOUD_DATAFLOW_FEATURES_ANALYTICS_ENABLED` to `false` in the Data Flow server configuration.
You can use the command `kubectl get all -l app=redis` to verify that the deployment, pod, and service resources are running. Use the command `kubectl delete all -l app=redis` to clean up afterwards.
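Before relying on Redis for analytics, you can confirm it is answering. A quick check using a throwaway client pod, assuming the service created above is named `redis` and listens on the default port:

```
# Run a one-off redis-cli pod against the service; a healthy server replies PONG
kubectl run redis-test --rm -it --restart=Never --image=redis -- redis-cli -h redis ping
```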
Deploy the Metrics Collector.
The Metrics Collector will provide message rates for all deployed stream apps. These message rates will be visible in the Dashboard UI. Run the following commands to start the Metrics Collector:
$ kubectl create -f src/kubernetes/metrics/metrics-deployment-rabbit.yaml
$ kubectl create -f src/kubernetes/metrics/metrics-svc.yaml
You can use the command `kubectl get all -l app=metrics` to verify that the deployment, pod, and service resources are running. Use the command `kubectl delete all -l app=metrics` to clean up afterwards.
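If the Dashboard later shows no message rates, the collector's log is the first place to look. A quick check, assuming the pods carry the `app=metrics` label used by the commands above:

```
# Tail the Metrics Collector logs to verify it started and connected to Rabbit MQ
kubectl logs -l app=metrics
```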
Deploy Skipper.
This is an optional step. Deploy Skipper if you want the added features of upgrading and rolling back Streams, since Data Flow delegates those features to Skipper. For a complete overview of the feature capabilities, review the reference guide. See Section 1.3, “Deploy Skipper” for details.
Deploy the Data Flow Server.
> **Important:** You should specify the version of the Spring Cloud Data Flow server that you want to deploy.
The deployment is defined in the `src/kubernetes/server/server-deployment.yaml` file. To control what version of the Spring Cloud Data Flow server gets deployed, modify the tag used for the Docker image in the container spec:

```
spec:
  containers:
  - name: scdf-server
    image: springcloud/spring-cloud-dataflow-server-kubernetes:latest
    imagePullPolicy: Always
```

Change the `latest` tag to the version of the server you want to deploy.
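For example, to pin the server image to a released version before deploying, you could edit the tag in place. A minimal sketch using `sed` (the version shown is only an illustration; pick the release you actually want):

```
# Replace the floating 'latest' tag with a fixed release tag
# (on macOS/BSD sed, use: sed -i '' 's/.../.../')
sed -i 's/dataflow-server-kubernetes:latest/dataflow-server-kubernetes:1.3.0.RELEASE/' \
  src/kubernetes/server/server-deployment.yaml
```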
The Data Flow Server uses the Fabric8 Java client library to connect to the Kubernetes cluster. We are using environment variables to set the values needed when deploying the Data Flow server to Kubernetes. We are also using the Fabric8 Spring Cloud integration with Kubernetes library to access Kubernetes ConfigMap and Secrets settings.
The ConfigMap settings are specified in the `src/kubernetes/server/server-config-rabbit.yaml` file and the secrets are in the `src/kubernetes/mysql/mysql-secrets.yaml` file. If you modified the password for MySQL you should have changed it in the `src/kubernetes/mysql/mysql-secrets.yaml` file. Any secrets have to be provided base64 encoded.
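For example, you can produce the base64-encoded value for a new password on the command line (the `-n` flag matters, since a trailing newline would otherwise become part of the secret):

```
# Encode a password for use in mysql-secrets.yaml
echo -n 'yourpassword' | base64
# => eW91cnBhc3N3b3Jk

# Decode an existing value to verify it
echo 'eW91cnBhc3N3b3Jk' | base64 --decode
```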
> **Note:** We are now configuring the Data Flow server with file based security, and the default user is 'user' with a password of 'password'. Feel free to change this in the `src/kubernetes/server/server-config-rabbit.yaml` file.
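Once the server is running, you can verify those credentials against its REST API. A quick check using the default user and password (substitute the external IP you look up below):

```
# The /about endpoint returns server info when authentication succeeds
curl -u user:password http://<EXTERNAL-IP>/about
```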
> **Note:** The default memory for the pods is set to 1024Mi. Update the value in the corresponding deployment YAML file if you need a different amount.
Deploy the Spring Cloud Data Flow Server for Kubernetes using the Docker image and the configuration settings.
$ kubectl create -f src/kubernetes/server/server-config-rabbit.yaml
$ kubectl create -f src/kubernetes/server/server-svc.yaml
$ kubectl create -f src/kubernetes/server/server-deployment.yaml
You can use the command `kubectl get all -l app=scdf-server` to verify that the deployment, pod, and service resources are running. Use the command `kubectl delete all,cm -l app=scdf-server` to clean up afterwards.
Use the `kubectl get svc scdf-server` command to locate the EXTERNAL-IP address assigned to `scdf-server`; we will use that later to connect from the shell.
```
$ kubectl get svc
NAME          CLUSTER-IP      EXTERNAL-IP       PORT(S)   AGE
scdf-server   10.103.246.82   130.211.203.246   80/TCP    4m
```

So the URL you need to use in this case is 130.211.203.246.
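If you want to capture that address in a script rather than reading it from the table, a jsonpath query works (assuming your platform populates the load balancer `ip` field, as Google Kubernetes Engine does):

```
# Extract the external IP of the scdf-server service
kubectl get svc scdf-server -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```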
If you are using Minikube then you don’t have an external load balancer, and the EXTERNAL-IP will show as `<pending>`. You need to use the NodePort assigned to the `scdf-server` service. Use this command to look up the URL to use:
```
$ minikube service --url scdf-server
http://192.168.99.100:31991
```
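Since we will use this address later to connect from the shell, it can be handy to stash it in a variable. A small convenience sketch that covers both the Minikube and load-balancer cases:

```
# Prefer the Minikube NodePort URL; fall back to the LoadBalancer IP
export SCDF_URL=$(minikube service --url scdf-server 2>/dev/null || \
  echo "http://$(kubectl get svc scdf-server -o jsonpath='{.status.loadBalancer.ingress[0].ip}')")
echo $SCDF_URL
```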
This is an optional step. Deploy Skipper if you want the added features of upgrading and rolling back Streams, since Data Flow delegates those features to Skipper.
The Deployment resource for Skipper is shown below:
```
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: skipper
  labels:
    app: skipper
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: skipper
    spec:
      containers:
      - name: skipper
        image: springcloud/spring-cloud-skipper-server:1.0.0.BUILD-SNAPSHOT
        imagePullPolicy: Always
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 1.0
            memory: 1024Mi
          requests:
            cpu: 0.5
            memory: 640Mi
        env:
        - name: SPRING_APPLICATION_JSON
          value: "{\"spring.cloud.skipper.server.enable.local.platform\" : false, \"spring.cloud.skipper.server.platform.kubernetes.accounts.minikube.environmentVariables\" : \"SPRING_RABBITMQ_HOST=${RABBITMQ_SERVICE_HOST},SPRING_RABBITMQ_PORT=${RABBITMQ_SERVICE_PORT}\",\"spring.cloud.skipper.server.platform.kubernetes.accounts.minikube.memory\" : \"1024Mi\",\"spring.cloud.skipper.server.platform.kubernetes.accounts.minikube.createDeployment\" : true}"
```
> **Note:** Skipper includes the concept of platforms, so it is important to define the "accounts" based on the project preferences. In the above YAML file, the account is named `minikube` and maps to the Kubernetes platform settings under the `spring.cloud.skipper.server.platform.kubernetes.accounts.minikube.*` keys.
> **Note:** If you’d like to change the version of the Skipper server, you can do so by updating the `springcloud/spring-cloud-skipper-server` image tag in the deployment YAML above.
> **Note:** If you’d like to orchestrate stream processing pipelines with Apache Kafka as the messaging middleware, you must change the `environmentVariables` entry in `SPRING_APPLICATION_JSON` to the Kafka binder settings instead:
>
> `"{\"spring.cloud.skipper.server.platform.kubernetes.accounts.minikube.environmentVariables\" : \"SPRING_CLOUD_STREAM_KAFKA_BINDER_BROKERS=${KAFKA_SERVICE_HOST}:${KAFKA_SERVICE_PORT}, SPRING_CLOUD_STREAM_KAFKA_BINDER_ZK_NODES=${KAFKA_ZK_SERVICE_HOST}:${KAFKA_ZK_SERVICE_PORT}\"}"`
The resource for the Skipper service is shown below:
```
apiVersion: v1
kind: Service
metadata:
  name: skipper
  labels:
    app: skipper
spec:
  # If you are running k8s on a local dev box or using minikube, you can use type NodePort instead
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 7577 # port used by the Skipper server (i.e., 7577)
  selector:
    app: skipper
```
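If you are on Minikube and want to act on the comment in that file, you can switch the service type before creating it. A one-liner sketch (GNU `sed` syntax; on macOS use `sed -i ''`):

```
# Change the Skipper service from LoadBalancer to NodePort for local clusters
sed -i 's/type: LoadBalancer/type: NodePort/' src/kubernetes/skipper/skipper-svc.yaml
```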
Run the following commands to start Skipper as the companion server for Spring Cloud Data Flow:
$ kubectl create -f src/kubernetes/skipper/skipper-deployment.yaml
$ kubectl create -f src/kubernetes/skipper/skipper-svc.yaml
You can use the command `kubectl get all -l app=skipper` to verify that the deployment, pod, and service resources are running. Use the command `kubectl delete all -l app=skipper` to clean up afterwards.
Use the `kubectl get svc skipper` command to locate the EXTERNAL-IP address assigned to `skipper`; we will use that later to connect from the shell.
```
$ kubectl get svc
NAME      CLUSTER-IP      EXTERNAL-IP       PORT(S)   AGE
skipper   10.103.246.83   130.211.203.247   80/TCP    4m
```

So the URL you need to use in this case is 130.211.203.247.
If you are using Minikube then you don’t have an external load balancer, and the EXTERNAL-IP will show as `<pending>`. You need to use the NodePort assigned to the `skipper` service. Use this command to look up the URL to use:
```
$ minikube service --url skipper
http://192.168.99.100:32060
```