2. Spark Client Task

This task launches a Spark application by submitting it for local execution. It is appropriate for a local deployment where any local file references can be resolved, and it is not intended for any type of cluster deployment.

2.1 Options

The spark-client task has the following options:

spark.app-args
The arguments for the Spark application. (String[], default: [])
spark.app-class
The main class for the Spark application. (String, default: <none>)
spark.app-jar
The path to a bundled jar that includes your application and its dependencies, excluding any Spark dependencies. (String, default: <none>)
spark.app-name
The name to use for the Spark application submission. (String, default: <none>)
spark.executor-memory
The memory setting to be used for each executor. (String, default: 1024M)
spark.master
The master setting to be used (local, local[N] or local[*]). (String, default: local)
spark.resource-archives
A comma-separated list of archive files to be included with the application submission. (String, default: <none>)
spark.resource-files
A comma-separated list of files to be included with the application submission. (String, default: <none>)
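
These options map to configuration properties of the task application, so they can also be supplied in a properties file instead of on the command line. A minimal sketch, assuming a standard Spring Boot style application.properties and using purely illustrative values:

spark.app-name=spark-pi
spark.app-class=org.apache.spark.examples.JavaSparkPi
spark.app-jar=/shared/drive/spark-pi-test.jar
spark.app-args=10
spark.master=local[2]
spark.executor-memory=2048M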

2.2 Building with Maven

$ ./mvnw clean install -PgenerateApps
$ cd apps/spark-client-task
$ ./mvnw clean package
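
After a successful build, the executable jar is produced under the module's target directory; its exact name depends on the project version. For example:

$ ls target/spark-client-task-*.jar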

2.3 Example

Run the spark-client-task app using the following command and parameters (this example uses org.apache.spark.examples.JavaSparkPi as the value of the --spark.app-class parameter):

java -jar spark-client-task-{version}.jar --spark.app-class=org.apache.spark.examples.JavaSparkPi \
  --spark.app-jar=/shared/drive/spark-pi-test.jar \
  --spark.app-args=10

Then review the log output to make sure the app completed.
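
For reference, the jar passed via --spark.app-jar contains the Spark application itself. Below is a minimal sketch of such an application, modelled on Spark's bundled JavaSparkPi example rather than reproducing its exact source; the class name, argument handling, and constants are illustrative.

import java.util.ArrayList;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public final class JavaSparkPi {

    public static void main(String[] args) {
        // The master is supplied by the spark-client task (spark.master),
        // so the application does not need to set it here.
        SparkConf conf = new SparkConf().setAppName("JavaSparkPi");
        JavaSparkContext jsc = new JavaSparkContext(conf);

        // The single application argument (10 in the example above) sets the
        // number of slices, i.e. the parallelism of the estimate.
        int slices = (args.length == 1) ? Integer.parseInt(args[0]) : 2;
        int n = 100000 * slices;

        List<Integer> data = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            data.add(i);
        }

        JavaRDD<Integer> dataSet = jsc.parallelize(data, slices);

        // Monte Carlo estimate of Pi: count random points inside the unit circle.
        int count = dataSet.map(i -> {
            double x = Math.random() * 2 - 1;
            double y = Math.random() * 2 - 1;
            return (x * x + y * y <= 1) ? 1 : 0;
        }).reduce((a, b) -> a + b);

        System.out.println("Pi is roughly " + 4.0 * count / n);

        jsc.stop();
    }
}

A successful run should end with the application printing its result (here the "Pi is roughly ..." line) before the task completes.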