Spring for Apache Hadoop

org.springframework.yarn.batch.partition
Class HdfsSplitBatchPartitionHandler

java.lang.Object
  extended by org.springframework.yarn.batch.partition.AbstractBatchPartitionHandler
      extended by org.springframework.yarn.batch.partition.HdfsSplitBatchPartitionHandler
All Implemented Interfaces:
org.springframework.batch.core.partition.PartitionHandler

public class HdfsSplitBatchPartitionHandler
extends AbstractBatchPartitionHandler

Implementation of Spring Batch PartitionHandler that partitions based on the number of input files in HDFS.
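The idea of partitioning by input-file count can be illustrated with a minimal, library-free sketch. It is not the handler's actual implementation: the class name `FilePartitionSketch`, the method `partitionByFile`, the `partitionN` naming, and the `fileName` key are all assumptions made for illustration, and plain maps stand in for Spring Batch step executions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in; not part of Spring for Apache Hadoop.
class FilePartitionSketch {

    // Builds one partition entry per input file. Each entry is keyed by an
    // assumed "partitionN" name and carries the path of the file to process.
    static Map<String, Map<String, String>> partitionByFile(List<String> inputFiles) {
        Map<String, Map<String, String>> partitions = new LinkedHashMap<>();
        int index = 0;
        for (String file : inputFiles) {
            Map<String, String> context = new LinkedHashMap<>();
            context.put("fileName", file); // assumed key name
            partitions.put("partition" + index, context);
            index++;
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<String> files = new ArrayList<>();
        files.add("/data/in/part-0000.txt"); // example paths, not real data
        files.add("/data/in/part-0001.txt");
        // Two input files yield two partitions.
        System.out.println(partitionByFile(files).keySet());
    }
}
```

In this sketch the number of partitions always equals the number of input files, which is the property the class description refers to.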

Author:
Janne Valkealahti

Constructor Summary
HdfsSplitBatchPartitionHandler(AbstractBatchAppmaster batchAppmaster)
          Instantiates a new HDFS split batch partition handler.
HdfsSplitBatchPartitionHandler(AbstractBatchAppmaster batchAppmaster, org.apache.hadoop.conf.Configuration configuration)
          Instantiates a new HDFS split batch partition handler.
 
Method Summary
protected  java.util.Map<org.springframework.batch.core.StepExecution,ContainerRequestHint> createResourceRequestData(java.util.Set<org.springframework.batch.core.StepExecution> stepExecutions)
          Subclasses may override this method to assign a specific ContainerRequestHint to a StepExecution.
protected  java.util.Set<org.springframework.batch.core.StepExecution> createStepExecutionSplits(org.springframework.batch.core.partition.StepExecutionSplitter stepSplitter, org.springframework.batch.core.StepExecution stepExecution)
           
 org.apache.hadoop.conf.Configuration getConfiguration()
          Gets the Yarn configuration.
 void setConfiguration(org.apache.hadoop.conf.Configuration configuration)
          Sets the Yarn configuration.
 
Methods inherited from class org.springframework.yarn.batch.partition.AbstractBatchPartitionHandler
getContainerResolver, getStepName, handle, setContainerResolver, setStepName, waitCompleteState
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HdfsSplitBatchPartitionHandler

public HdfsSplitBatchPartitionHandler(AbstractBatchAppmaster batchAppmaster)
Instantiates a new HDFS split batch partition handler.

Parameters:
batchAppmaster - the batch appmaster

HdfsSplitBatchPartitionHandler

public HdfsSplitBatchPartitionHandler(AbstractBatchAppmaster batchAppmaster,
                                      org.apache.hadoop.conf.Configuration configuration)
Instantiates a new HDFS split batch partition handler.

Parameters:
batchAppmaster - the batch appmaster
configuration - the Yarn configuration

Method Detail

getConfiguration

public org.apache.hadoop.conf.Configuration getConfiguration()
Gets the Yarn configuration.

Returns:
the Yarn configuration

setConfiguration

public void setConfiguration(org.apache.hadoop.conf.Configuration configuration)
Sets the Yarn configuration.

Parameters:
configuration - the new Yarn configuration

createStepExecutionSplits

protected java.util.Set<org.springframework.batch.core.StepExecution> createStepExecutionSplits(org.springframework.batch.core.partition.StepExecutionSplitter stepSplitter,
                                                                                                org.springframework.batch.core.StepExecution stepExecution)
                                                                                         throws java.lang.Exception
Specified by:
createStepExecutionSplits in class AbstractBatchPartitionHandler
Throws:
java.lang.Exception

createResourceRequestData

protected java.util.Map<org.springframework.batch.core.StepExecution,ContainerRequestHint> createResourceRequestData(java.util.Set<org.springframework.batch.core.StepExecution> stepExecutions)
                                                                                                              throws java.lang.Exception
Description copied from class: AbstractBatchPartitionHandler
Subclasses may override this method to assign a specific ContainerRequestHint to a StepExecution. This is needed when a step should be executed on a specific host or rack for data locality. The default implementation returns an empty map.

Overrides:
createResourceRequestData in class AbstractBatchPartitionHandler
Parameters:
stepExecutions - Set of step executions
Returns:
Mapping between step executions and container request data
Throws:
java.lang.Exception - if an error occurred
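The data-locality note on createResourceRequestData can be sketched without any Spring or Hadoop dependency. Everything below is hypothetical: `LocalityHintSketch`, its nested `Hint` class (a simplified stand-in for ContainerRequestHint that holds only preferred hosts), and the two input maps are illustrative assumptions, not the library's API or this class's actual logic.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in; not part of Spring for Apache Hadoop.
class LocalityHintSketch {

    // Simplified stand-in for a container request hint: preferred hosts only.
    static final class Hint {
        final List<String> hosts;

        Hint(List<String> hosts) {
            this.hosts = hosts;
        }
    }

    // Assigns each step the hosts known to hold its input file. Steps whose
    // file has no known location get no entry, so they carry no hint.
    static Map<String, Hint> hintsFor(Map<String, String> stepToFile,
                                      Map<String, List<String>> fileToHosts) {
        Map<String, Hint> hints = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : stepToFile.entrySet()) {
            List<String> hosts = fileToHosts.get(entry.getValue());
            if (hosts != null && !hosts.isEmpty()) {
                hints.put(entry.getKey(), new Hint(hosts));
            }
        }
        return hints;
    }

    public static void main(String[] args) {
        Map<String, String> stepToFile = new LinkedHashMap<>();
        stepToFile.put("step:partition0", "/data/in/part-0000.txt"); // example values
        stepToFile.put("step:partition1", "/data/in/part-0001.txt");

        Map<String, List<String>> fileToHosts = new LinkedHashMap<>();
        fileToHosts.put("/data/in/part-0000.txt", Arrays.asList("host1", "host2"));

        // Only the step whose file location is known receives a hint.
        System.out.println(hintsFor(stepToFile, fileToHosts).keySet());
    }
}
```

Leaving a step out of the map, rather than attaching an empty hint, mirrors the documented default of returning an empty map when no placement preference exists.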
