Spring for Apache Hadoop

org.springframework.data.hadoop.mapreduce
Class JobExecutor

java.lang.Object
  extended by org.springframework.data.hadoop.mapreduce.JobExecutor
All Implemented Interfaces:
org.springframework.beans.factory.Aware, org.springframework.beans.factory.BeanFactoryAware, org.springframework.beans.factory.DisposableBean, org.springframework.beans.factory.InitializingBean
Direct Known Subclasses:
JobRunner, JobTasklet

public abstract class JobExecutor
extends java.lang.Object
implements org.springframework.beans.factory.InitializingBean, org.springframework.beans.factory.DisposableBean, org.springframework.beans.factory.BeanFactoryAware

Common class shared for executing Hadoop Jobs.

Author:
Costin Leau, Thomas Risberg

Nested Class Summary
protected static interface JobExecutor.JobListener
 
Field Summary
protected  org.apache.commons.logging.Log log
 
Constructor Summary
JobExecutor()
 
Method Summary
 void afterPropertiesSet()
 void destroy()
protected  java.util.Collection<org.apache.hadoop.mapreduce.Job> findJobs()
 boolean isKillJobsAtShutdown()
          Indicates whether the configured jobs should be 'killed' when the application shuts down or not.
 boolean isVerbose()
          Indicates whether the job execution is verbose (the default) or not.
 boolean isWaitForCompletion()
          Indicates whether the 'runner' should wait for the job to complete (the default) or not.
 void setBeanFactory(org.springframework.beans.factory.BeanFactory beanFactory)
 void setExecutor(java.util.concurrent.Executor executor)
          Sets the TaskExecutor used for executing the Hadoop job.
 void setJob(org.apache.hadoop.mapreduce.Job job)
          Sets the job to execute.
 void setJobNames(java.lang.String... jobName)
          Sets the jobs to execute by (bean) name.
 void setJobs(java.util.Collection<org.apache.hadoop.mapreduce.Job> jobs)
          Sets the jobs to execute.
 void setKillJobAtShutdown(boolean killJobsAtShutdown)
          Sets whether the configured jobs should be 'killed' when the application shuts down (the default) or not.
 void setVerbose(boolean verbose)
          Sets whether the job execution should be verbose (the default) or not.
 void setWaitForCompletion(boolean waitForJob)
          Sets whether the 'runner' should wait for the job to complete after submission (the default) or not.
protected  java.util.Collection<org.apache.hadoop.mapreduce.Job> startJobs()
protected  java.util.Collection<org.apache.hadoop.mapreduce.Job> startJobs(JobExecutor.JobListener listener)
protected  java.util.Collection<org.apache.hadoop.mapreduce.Job> stopJobs()
          Stops the running jobs.
protected  java.util.Collection<org.apache.hadoop.mapreduce.Job> stopJobs(JobExecutor.JobListener listener)
          Stops the running jobs.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

log

protected org.apache.commons.logging.Log log
Constructor Detail

JobExecutor

public JobExecutor()
Method Detail

afterPropertiesSet

public void afterPropertiesSet()
                        throws java.lang.Exception
Specified by:
afterPropertiesSet in interface org.springframework.beans.factory.InitializingBean
Throws:
java.lang.Exception

destroy

public void destroy()
             throws java.lang.Exception
Specified by:
destroy in interface org.springframework.beans.factory.DisposableBean
Throws:
java.lang.Exception

stopJobs

protected java.util.Collection<org.apache.hadoop.mapreduce.Job> stopJobs()
Stops the running jobs.

Returns:
the collection of stopped jobs.
Throws:
java.lang.Exception

stopJobs

protected java.util.Collection<org.apache.hadoop.mapreduce.Job> stopJobs(JobExecutor.JobListener listener)
Stops the running jobs.

Parameters:
listener - job listener
Returns:
the collection of stopped jobs.
Throws:
java.lang.Exception

startJobs

protected java.util.Collection<org.apache.hadoop.mapreduce.Job> startJobs()

startJobs

protected java.util.Collection<org.apache.hadoop.mapreduce.Job> startJobs(JobExecutor.JobListener listener)

findJobs

protected java.util.Collection<org.apache.hadoop.mapreduce.Job> findJobs()

setJob

public void setJob(org.apache.hadoop.mapreduce.Job job)
Sets the job to execute.

Parameters:
job - The job to execute.

setJobs

public void setJobs(java.util.Collection<org.apache.hadoop.mapreduce.Job> jobs)
Sets the jobs to execute.

Parameters:
jobs - The jobs to execute.

setJobNames

public void setJobNames(java.lang.String... jobName)
Sets the jobs to execute by (bean) name. This is the default method used by the hdp namespace to allow lazy initialization and potential scoping to kick in.

Parameters:
jobName - The names of the jobs to execute.
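As a sketch of how jobs are typically wired by name, the hdp XML namespace lets a runner reference a job bean lazily; the job definition, paths, and class names below are illustrative, not part of this API:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:hdp="http://www.springframework.org/schema/hadoop"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans.xsd
           http://www.springframework.org/schema/hadoop
           http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">

    <!-- illustrative job definition; resolved lazily by bean name -->
    <hdp:job id="word-count"
             input-path="/input" output-path="/output"
             mapper="org.example.WordCountMapper"
             reducer="org.example.WordCountReducer"/>

    <!-- the runner references the job by name, so lazy initialization
         and scoping can kick in before the job bean is resolved -->
    <hdp:job-runner id="runner" job-ref="word-count" run-at-startup="true"/>
</beans>
```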

isWaitForCompletion

public boolean isWaitForCompletion()
Indicates whether the 'runner' should wait for the job to complete (the default).

Returns:
whether to wait for the job to complete or not.

setWaitForCompletion

public void setWaitForCompletion(boolean waitForJob)
Sets whether the 'runner' should wait for the job to complete after submission (the default) or not.

Parameters:
waitForJob - whether to wait for the job to complete or not.

isVerbose

public boolean isVerbose()
Indicates whether the job execution is verbose (the default) or not.

Returns:
whether the job execution is verbose or not.

setVerbose

public void setVerbose(boolean verbose)
Sets whether the job execution should be verbose (the default) or not.

Parameters:
verbose - whether the job execution is verbose or not.

setBeanFactory

public void setBeanFactory(org.springframework.beans.factory.BeanFactory beanFactory)
                    throws org.springframework.beans.BeansException
Specified by:
setBeanFactory in interface org.springframework.beans.factory.BeanFactoryAware
Throws:
org.springframework.beans.BeansException

setExecutor

public void setExecutor(java.util.concurrent.Executor executor)
Sets the TaskExecutor used for executing the Hadoop job. By default, a SyncTaskExecutor is used, meaning the job runs on the calling thread. While this replicates the Hadoop behavior, it prevents running jobs from being killed if the application shuts down. For finer-grained control, a dedicated Executor is recommended.

Parameters:
executor - the task executor used to execute the Hadoop job.
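The trade-off described above can be illustrated with plain java.util.concurrent types, independent of Hadoop: a direct executor (analogous to the default SyncTaskExecutor) runs the work on the calling thread, while a dedicated executor leaves the caller free, for instance to stop jobs at shutdown. A minimal sketch:

```java
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ExecutorChoiceDemo {
    public static void main(String[] args) throws InterruptedException {
        Runnable job = () ->
            System.out.println("job running on " + Thread.currentThread().getName());

        // Analogous to the default SyncTaskExecutor: the job runs on the
        // calling thread, so the caller blocks until the work finishes.
        Executor sync = Runnable::run;
        sync.execute(job);   // prints "job running on main"

        // A dedicated executor runs the job on its own thread, leaving the
        // calling thread free (e.g. to kill still-running jobs at shutdown).
        ExecutorService dedicated = Executors.newSingleThreadExecutor();
        dedicated.execute(job);
        dedicated.shutdown();
        dedicated.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```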

isKillJobsAtShutdown

public boolean isKillJobsAtShutdown()
Indicates whether the configured jobs should be 'killed' when the application shuts down or not.

Returns:
whether or not to kill the configured jobs at shutdown

setKillJobAtShutdown

public void setKillJobAtShutdown(boolean killJobsAtShutdown)
Sets whether the configured jobs should be 'killed' when the application shuts down (the default) or not. For long-running or fire-and-forget jobs that live beyond the starting application, set this to false. Note that if setWaitForCompletion(boolean) is true, this flag is treated as true, since otherwise the application cannot shut down (it would keep waiting for the job).

Parameters:
killJobsAtShutdown - whether or not to kill configured jobs when the application shuts down
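For a fire-and-forget setup, the completion and shutdown flags can be combined; this is a sketch assuming the hdp namespace exposes attributes mirroring the setters on this class (wait-for-completion, kill-job-at-shutdown), with an illustrative job reference:

```xml
<!-- sketch: submit the job and let it outlive the application;
     attribute and bean names are assumptions, mirroring the setters above -->
<hdp:job-runner id="fire-and-forget"
                job-ref="long-running-job"
                wait-for-completion="false"
                kill-job-at-shutdown="false"
                run-at-startup="true"/>
```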
