10. Using the runner classes

For each type of Hadoop interaction, whether vanilla Map/Reduce, Hive or Pig, Spring for Apache Hadoop provides a runner: a dedicated class used for declarative (or programmatic) interaction. The table below lists the existing runner classes by type, together with their name and namespace element.

Table 10.1. Available _Runner_s

| Type | Name | Namespace element | Description |
|---|---|---|---|
| Map/Reduce Job | JobRunner | job-runner | Runner for Map/Reduce jobs, whether vanilla M/R or streaming. |
| Hadoop `Tool` | ToolRunner | tool-runner | Runner for Hadoop `Tool`s (whether stand-alone or as jars). |
| Hadoop `jar`s | JarRunner | jar-runner | Runner for Hadoop jars. |
| Hive queries and scripts | HiveRunner | hive-runner | Runner for executing Hive queries or scripts. |
| Pig queries and scripts | PigRunner | pig-runner | Runner for executing Pig scripts. |
| JSR-223/JVM scripts | HdfsScriptRunner | script | Runner for executing JVM 'scripting' languages (implementing the JSR-223 API). |
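
As a rough illustration of the declarative style, the fragment below sketches how a `job-runner` element might be declared in a Spring application context. It is a minimal sketch, not a definitive configuration: the bean ids `wordcount-runner` and `wordcount-job` are hypothetical, the job bean is assumed to be defined elsewhere in the context, and the `job-ref` and `run-at-startup` attributes should be checked against the schema of the Spring for Apache Hadoop version in use.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:hdp="http://www.springframework.org/schema/hadoop"
       xsi:schemaLocation="
         http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
         http://www.springframework.org/schema/hadoop http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">

    <!-- Assumption: a Map/Reduce job bean with the hypothetical id
         "wordcount-job" is defined elsewhere in this context. -->

    <!-- Declares a JobRunner through the job-runner namespace element.
         job-ref points at the job to execute; run-at-startup asks the
         runner to execute when the application context starts. -->
    <hdp:job-runner id="wordcount-runner"
                    job-ref="wordcount-job"
                    run-at-startup="true"/>

</beans>
```

The other runners in the table are declared in the same manner through their own namespace elements, although the type-specific attributes differ.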


While most of the configuration depends on the underlying type, the runners share common attributes and behaviour, so they can be used in a predictable, consistent way. Below is a list of common features: