Spring for Apache Hadoop

org.springframework.data.hadoop.mapreduce
Class StreamJobFactoryBean

java.lang.Object
  extended by org.springframework.data.hadoop.mapreduce.StreamJobFactoryBean
All Implemented Interfaces:
org.springframework.beans.factory.Aware, org.springframework.beans.factory.BeanNameAware, org.springframework.beans.factory.FactoryBean<org.apache.hadoop.mapreduce.Job>, org.springframework.beans.factory.InitializingBean

public class StreamJobFactoryBean
extends java.lang.Object
implements org.springframework.beans.factory.InitializingBean, org.springframework.beans.factory.FactoryBean<org.apache.hadoop.mapreduce.Job>, org.springframework.beans.factory.BeanNameAware

Factory bean for creating streaming jobs. Unlike JobFactoryBean, which is Java-specific, this factory is suited to streaming scenarios (such as invoking Ruby/Python scripts or command-line scripts).

Author:
Costin Leau

Constructor Summary
StreamJobFactoryBean()
           
 
Method Summary
 void afterPropertiesSet()
           
 org.apache.hadoop.mapreduce.Job getObject()
           
 java.lang.Class<?> getObjectType()
           
 boolean isSingleton()
           
 void setArchives(org.springframework.core.io.Resource... archives)
          Sets the archives to be unarchived on the map reduce cluster.
 void setBeanName(java.lang.String name)
           
 void setCmdEnv(java.util.Properties cmdEnv)
          Sets the environment for the commands to be executed.
 void setCombiner(java.lang.String combiner)
          Sets the job combiner.
 void setConfiguration(org.apache.hadoop.conf.Configuration configuration)
          Sets the Hadoop configuration to use.
 void setFiles(org.springframework.core.io.Resource... files)
          Sets the files to be copied to the map reduce cluster.
 void setInputFormat(java.lang.String inputFormat)
          Sets the job input format.
 void setInputPath(java.lang.String... input)
          Sets the job input paths.
 void setLibs(org.springframework.core.io.Resource... libJars)
          Sets the jar files to include in the classpath.
 void setMapper(java.lang.String mapper)
          Sets the job mapper.
 void setNumberReducers(java.lang.Integer numReduceTasks)
          Sets the number of reducer tasks for the job.
 void setOutputFormat(java.lang.String outputFormat)
          Sets the job output format.
 void setOutputPath(java.lang.String output)
          Sets the job output path.
 void setPartitioner(java.lang.String partitioner)
          Sets the job partitioner.
 void setProperties(java.util.Properties properties)
          Sets the configuration properties to use.
 void setReducer(java.lang.String reducer)
          Sets the job reducer.
 void setUser(java.lang.String user)
          Sets the user impersonation (optional) for running this job.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

StreamJobFactoryBean

public StreamJobFactoryBean()

Method Detail

setBeanName

public void setBeanName(java.lang.String name)
Specified by:
setBeanName in interface org.springframework.beans.factory.BeanNameAware

getObject

public org.apache.hadoop.mapreduce.Job getObject()
                                          throws java.lang.Exception
Specified by:
getObject in interface org.springframework.beans.factory.FactoryBean<org.apache.hadoop.mapreduce.Job>
Throws:
java.lang.Exception

getObjectType

public java.lang.Class<?> getObjectType()
Specified by:
getObjectType in interface org.springframework.beans.factory.FactoryBean<org.apache.hadoop.mapreduce.Job>

isSingleton

public boolean isSingleton()
Specified by:
isSingleton in interface org.springframework.beans.factory.FactoryBean<org.apache.hadoop.mapreduce.Job>

afterPropertiesSet

public void afterPropertiesSet()
                        throws java.lang.Exception
Specified by:
afterPropertiesSet in interface org.springframework.beans.factory.InitializingBean
Throws:
java.lang.Exception

setInputPath

public void setInputPath(java.lang.String... input)
Sets the job input paths.

Parameters:
input - The input to set.

setOutputPath

public void setOutputPath(java.lang.String output)
Sets the job output path.

Parameters:
output - The output to set.

setMapper

public void setMapper(java.lang.String mapper)
Sets the job mapper.

Parameters:
mapper - The mapper to set.

setReducer

public void setReducer(java.lang.String reducer)
Sets the job reducer.

Parameters:
reducer - The reducer to set.

setCombiner

public void setCombiner(java.lang.String combiner)
Sets the job combiner.

Parameters:
combiner - The combiner to set.

setInputFormat

public void setInputFormat(java.lang.String inputFormat)
Sets the job input format.

Parameters:
inputFormat - The inputFormat to set.

setOutputFormat

public void setOutputFormat(java.lang.String outputFormat)
Sets the job output format.

Parameters:
outputFormat - The outputFormat to set.

setPartitioner

public void setPartitioner(java.lang.String partitioner)
Sets the job partitioner.

Parameters:
partitioner - The partitioner to set.

setConfiguration

public void setConfiguration(org.apache.hadoop.conf.Configuration configuration)
Sets the Hadoop configuration to use.

Parameters:
configuration - The configuration to set.

setCmdEnv

public void setCmdEnv(java.util.Properties cmdEnv)
Sets the environment for the commands to be executed.

Parameters:
cmdEnv - The environment properties to set for the executed commands.
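
As a minimal sketch of preparing the argument for setCmdEnv, the class below builds a java.util.Properties object using only the JDK. The class name CmdEnvExample, the helper buildCmdEnv, and the variable names and values are assumptions made for illustration; they do not come from the library. The call on the factory itself is shown only as a comment, since it requires Spring for Apache Hadoop on the classpath.

```java
import java.util.Properties;

final class CmdEnvExample {

    // Builds the environment entries intended for the streaming commands.
    // Both variable names and values are illustrative assumptions.
    static Properties buildCmdEnv() {
        Properties env = new Properties();
        env.setProperty("LC_ALL", "C");
        env.setProperty("EXAMPLE_MODE", "batch"); // hypothetical variable
        return env;
    }

    public static void main(String[] args) {
        Properties env = buildCmdEnv();
        // With the library available, the object would be passed to the factory:
        //   factory.setCmdEnv(env);
        for (String name : env.stringPropertyNames()) {
            System.out.println(name + "=" + env.getProperty(name));
        }
    }
}
```

The same Properties type is accepted by setProperties, so a similar helper could be used for configuration properties.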

setNumberReducers

public void setNumberReducers(java.lang.Integer numReduceTasks)
Sets the number of reducer tasks for the job.

Parameters:
numReduceTasks - The numReduceTasks to set.

setProperties

public void setProperties(java.util.Properties properties)
Sets the configuration properties to use.

Parameters:
properties - The properties to set.

setLibs

public void setLibs(org.springframework.core.io.Resource... libJars)
Sets the jar files to include in the classpath. Note that a pattern can be used (e.g. mydir/*.jar), which the Spring container will automatically resolve.

Parameters:
libJars - The jar files to include in the classpath.
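
To illustrate which file names a pattern such as mydir/*.jar selects, the sketch below uses the JDK's own glob matcher from java.nio.file. This is a stand-in chosen so the example runs without any dependencies; it is not the resolver the Spring container uses, and the class name JarPatternExample and the sample paths are assumptions. For a simple pattern like this one, the point is that * stays within a single directory level.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

final class JarPatternExample {

    // Returns true when the candidate path matches the glob pattern.
    // Uses the JDK glob syntax, where '*' does not cross directory boundaries.
    static boolean matches(String pattern, String candidate) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
        return matcher.matches(Paths.get(candidate));
    }

    public static void main(String[] args) {
        String pattern = "mydir/*.jar";
        // Sample paths are hypothetical; none of them needs to exist on disk.
        String[] candidates = {"mydir/udf.jar", "mydir/notes.txt", "mydir/nested/udf.jar"};
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + matches(pattern, candidate));
        }
    }
}
```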

setFiles

public void setFiles(org.springframework.core.io.Resource... files)
Sets the files to be copied to the map reduce cluster. Note that a pattern can be used (e.g. mydir/*.txt), which the Spring container will automatically resolve.

Parameters:
files - The files to copy.

setArchives

public void setArchives(org.springframework.core.io.Resource... archives)
Sets the archives to be unarchived on the map reduce cluster. Note that a pattern can be used (e.g. mydir/*.zip), which the Spring container will automatically resolve.

Parameters:
archives - The archives to unarchive on the compute machines.

setUser

public void setUser(java.lang.String user)
Sets the user to impersonate (optional) when running this job. Should be used when running against a Kerberos-secured Hadoop cluster.

Parameters:
user - user/group information
