8. Using the runner classes

For each Hadoop interaction type, whether it is vanilla Map/Reduce, Cascading, Hive or Pig, Spring for Apache Hadoop provides a runner: a dedicated class used for declarative (or programmatic) interaction. The table below lists the existing runner class for each type, along with its name and namespace element.

Table 8.1. Available Runners

Type                      Name              Namespace element  Description
Map/Reduce Job            JobRunner         job-runner         Runner for Map/Reduce jobs, whether vanilla M/R or streaming.
Hadoop Tool               ToolRunner        tool-runner        Runner for Hadoop Tools (whether stand-alone or as jars).
Hadoop jars               JarRunner         jar-runner         Runner for Hadoop jars.
Hive queries and scripts  HiveRunner        hive-runner        Runner for executing Hive queries or scripts.
Pig queries and scripts   PigRunner         pig-runner         Runner for executing Pig scripts.
Cascading Cascades        CascadeRunner     -                  Runner for executing Cascading Cascades.
JSR-223/JVM scripts       HdfsScriptRunner  script             Runner for executing JVM 'scripting' languages (implementing the JSR-223 API).
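
To make the declarative style concrete, the fragment below sketches how the job-runner element from the table might be declared in a Spring application context. It is a minimal illustration rather than a complete configuration: the bean ids "myjob" and "myjob-runner" are hypothetical, the referenced job is assumed to be defined elsewhere, and the job-ref and run-at-startup attribute names are assumptions that should be checked against the namespace schema of the version in use.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:hdp="http://www.springframework.org/schema/hadoop"
       xsi:schemaLocation="
         http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
         http://www.springframework.org/schema/hadoop http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">

    <!-- "myjob" is a hypothetical Map/Reduce job bean, assumed to be
         declared elsewhere in this context. -->

    <!-- job-runner is the namespace element listed for JobRunner.
         job-ref and run-at-startup are assumed attribute names:
         the first points at the job to run, the second asks the
         runner to execute when the context starts. -->
    <hdp:job-runner id="myjob-runner" job-ref="myjob" run-at-startup="true"/>

</beans>
```

The other runners with a namespace element (tool-runner, jar-runner, hive-runner, pig-runner) would be declared in the same manner, each with its own type-specific configuration.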

While most of the configuration depends on the underlying type, the runners share common attributes and behaviour, so one can use them in a predictable, consistent way. Below is a list of common features: