Spring for Apache Hadoop

org.springframework.data.hadoop.fs
Class DistCp

java.lang.Object
  extended by org.springframework.data.hadoop.fs.DistCp

public class DistCp
extends java.lang.Object

Exposes the Hadoop command-line distcp as an embeddable API. Because DistCp accepts many options, callers can either pass them in command-line style (as one or more Strings) through copy(String...), or specify individual arguments through the other copy methods.
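
The relationship between the two styles can be sketched in plain Java. The helper below is hypothetical and not part of Spring for Apache Hadoop: it assumes that a null Boolean means "option not specified" and uses the standard distcp flags (-i, -overwrite, -update, -delete). The host names are placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Spring for Apache Hadoop: it only illustrates
// how individual options could be flattened into command-style arguments.
class DistCpArgumentSketch {

    // Assumption: a null Boolean means "option not specified".
    static String[] toArguments(Boolean ignoreFailures, Boolean overwrite,
                                Boolean update, Boolean delete, String... uris) {
        List<String> args = new ArrayList<>();
        if (Boolean.TRUE.equals(ignoreFailures)) args.add("-i");
        if (Boolean.TRUE.equals(overwrite)) args.add("-overwrite");
        if (Boolean.TRUE.equals(update)) args.add("-update");
        if (Boolean.TRUE.equals(delete)) args.add("-delete");
        // Sources and destination follow the option flags.
        args.addAll(Arrays.asList(uris));
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        String[] args = toArguments(null, false, true, true,
                "hdfs://nn1:8020/source", "hdfs://nn2:8020/target");
        // prints "-update -delete hdfs://nn1:8020/source hdfs://nn2:8020/target"
        System.out.println(String.join(" ", args));
    }
}
```

With a configured DistCp instance, an array like this could then be handed to the command-style overload, copy(String...).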

Author:
Costin Leau, Thomas Risberg

Nested Class Summary
static class DistCp.Preserve
          Enumeration for the possible attributes that can be preserved by a copy operation.
 
Constructor Summary
DistCp(org.apache.hadoop.conf.Configuration configuration)
          Constructs a new DistCp instance.
DistCp(org.apache.hadoop.conf.Configuration configuration, java.lang.String user)
          Constructs a new DistCp instance that impersonates the given user.
 
Method Summary
 void copy(java.lang.Boolean preserveReplication, java.lang.Boolean preserveBlockSize, java.lang.Boolean preserveUser, java.lang.Boolean preserveGroup, java.lang.Boolean preservePermission, java.lang.Boolean ignoreFailures, java.lang.Boolean skipCrc, java.lang.String logDir, java.lang.Integer mappers, java.lang.Boolean overwrite, java.lang.Boolean update, java.lang.Boolean delete, java.lang.Long fileLimit, java.lang.Long sizeLimit, java.lang.String fileList, java.lang.String... uris)
          Copies the given resources using the given parameters.
 void copy(java.util.EnumSet<DistCp.Preserve> preserve, java.lang.Boolean ignoreFailures, java.lang.Boolean overwrite, java.lang.Boolean update, java.lang.Boolean delete, java.lang.String... uris)
          Copies the given resources using the given parameters.
 void copy(java.util.EnumSet<DistCp.Preserve> preserve, java.lang.Boolean ignoreFailures, java.lang.Boolean skipCrc, java.lang.String logDir, java.lang.Integer mappers, java.lang.Boolean overwrite, java.lang.Boolean update, java.lang.Boolean delete, java.lang.Long fileLimit, java.lang.Long sizeLimit, java.lang.String fileList, java.lang.String... uris)
          Copies the given resources using the given parameters.
 void copy(java.lang.String... arguments)
          Runs a distributed copy using command-line style arguments (specified as Strings).
 void copy(java.lang.String arg1, java.lang.String arg2)
          Basic copy operation, between a source and a destination using the defaults.
 void copy(java.lang.String arg1, java.lang.String arg2, java.lang.String arg3)
          Basic copy operation, between a source and a destination using the defaults.
 void setUser(java.lang.String user)
          Sets the user impersonation (optional) for creating this utility.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DistCp

public DistCp(org.apache.hadoop.conf.Configuration configuration)
Constructs a new DistCp instance.

Parameters:
configuration - Hadoop configuration to use.

DistCp

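public DistCp(org.apache.hadoop.conf.Configuration configuration,
              java.lang.String user)
Constructs a new DistCp instance that impersonates the given user.

Parameters:
configuration - Hadoop configuration to use.
user - user/group information

Method Detail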

copy

public void copy(java.util.EnumSet<DistCp.Preserve> preserve,
                 java.lang.Boolean ignoreFailures,
                 java.lang.Boolean overwrite,
                 java.lang.Boolean update,
                 java.lang.Boolean delete,
                 java.lang.String... uris)
Copies the given resources using the given parameters.

Parameters:
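preserve - attributes to preserve during the copy
ignoreFailures - whether to ignore failures during the copy
overwrite - whether to overwrite existing files at the destination
update - whether to update the destination instead of copying everything (distcp -update)
delete - whether to delete files that exist at the destination but not at the source
uris - source and destination URIs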
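
As an illustration of how a set of preserved attributes relates to the command-line form, the sketch below turns an EnumSet into the distcp -p[rbugp] flag. The Preserve enum here is a local stand-in: its constant names are assumptions and may not match DistCp.Preserve, and the toFlag helper is hypothetical.

```java
import java.util.EnumSet;

class PreserveFlagSketch {

    // Stand-in for DistCp.Preserve; the constant names here are assumptions.
    enum Preserve {
        REPLICATION('r'), BLOCKSIZE('b'), USER('u'), GROUP('g'), PERMISSION('p');

        final char letter;

        Preserve(char letter) {
            this.letter = letter;
        }
    }

    // Returns a flag such as "-prb", or an empty string when nothing is preserved.
    static String toFlag(EnumSet<Preserve> preserve) {
        if (preserve == null || preserve.isEmpty()) {
            return "";
        }
        StringBuilder flag = new StringBuilder("-p");
        // An EnumSet iterates in declaration order, so letters come out as r, b, u, g, p.
        for (Preserve attribute : preserve) {
            flag.append(attribute.letter);
        }
        return flag.toString();
    }

    public static void main(String[] args) {
        System.out.println(toFlag(EnumSet.of(Preserve.USER, Preserve.REPLICATION))); // prints "-pru"
        System.out.println(toFlag(EnumSet.allOf(Preserve.class)));                   // prints "-prbugp"
    }
}
```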

copy

public void copy(java.util.EnumSet<DistCp.Preserve> preserve,
                 java.lang.Boolean ignoreFailures,
                 java.lang.Boolean skipCrc,
                 java.lang.String logDir,
                 java.lang.Integer mappers,
                 java.lang.Boolean overwrite,
                 java.lang.Boolean update,
                 java.lang.Boolean delete,
                 java.lang.Long fileLimit,
                 java.lang.Long sizeLimit,
                 java.lang.String fileList,
                 java.lang.String... uris)
Copies the given resources using the given parameters.

Parameters:
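preserve - attributes to preserve during the copy
ignoreFailures - whether to ignore failures during the copy
skipCrc - whether to skip CRC checks between source and destination
logDir - directory to write logs to
mappers - maximum number of simultaneous copies
overwrite - whether to overwrite existing files at the destination
update - whether to update the destination instead of copying everything (distcp -update)
delete - whether to delete files that exist at the destination but not at the source
fileLimit - limit on the total number of files to copy
sizeLimit - limit on the total size to copy, in bytes
fileList - URI of a file listing the sources
uris - source and destination URIs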

copy

public void copy(java.lang.Boolean preserveReplication,
                 java.lang.Boolean preserveBlockSize,
                 java.lang.Boolean preserveUser,
                 java.lang.Boolean preserveGroup,
                 java.lang.Boolean preservePermission,
                 java.lang.Boolean ignoreFailures,
                 java.lang.Boolean skipCrc,
                 java.lang.String logDir,
                 java.lang.Integer mappers,
                 java.lang.Boolean overwrite,
                 java.lang.Boolean update,
                 java.lang.Boolean delete,
                 java.lang.Long fileLimit,
                 java.lang.Long sizeLimit,
                 java.lang.String fileList,
                 java.lang.String... uris)
Copies the given resources using the given parameters.

Parameters:
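preserveReplication - whether to preserve the replication number
preserveBlockSize - whether to preserve the block size
preserveUser - whether to preserve the user
preserveGroup - whether to preserve the group
preservePermission - whether to preserve the permission
ignoreFailures - whether to ignore failures during the copy
skipCrc - whether to skip CRC checks between source and destination
logDir - directory to write logs to
mappers - maximum number of simultaneous copies
overwrite - whether to overwrite existing files at the destination
update - whether to update the destination instead of copying everything (distcp -update)
delete - whether to delete files that exist at the destination but not at the source
fileLimit - limit on the total number of files to copy
sizeLimit - limit on the total size to copy, in bytes
fileList - URI of a file listing the sources
uris - source and destination URIs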
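
The five preserve* flags of this overload appear to cover the same attributes as the EnumSet<DistCp.Preserve> parameter of the other overloads. A minimal sketch of that correspondence follows, under stated assumptions: the Preserve enum is a local stand-in whose constant names may differ from DistCp.Preserve, the fromFlags helper is hypothetical, and null is treated like false.

```java
import java.util.EnumSet;

class PreserveSetSketch {

    // Stand-in for DistCp.Preserve; the constant names here are assumptions.
    enum Preserve { REPLICATION, BLOCKSIZE, USER, GROUP, PERMISSION }

    // Folds the five individual flags into one set. Assumption: null counts as false.
    static EnumSet<Preserve> fromFlags(Boolean replication, Boolean blockSize,
                                       Boolean user, Boolean group, Boolean permission) {
        EnumSet<Preserve> set = EnumSet.noneOf(Preserve.class);
        if (Boolean.TRUE.equals(replication)) set.add(Preserve.REPLICATION);
        if (Boolean.TRUE.equals(blockSize)) set.add(Preserve.BLOCKSIZE);
        if (Boolean.TRUE.equals(user)) set.add(Preserve.USER);
        if (Boolean.TRUE.equals(group)) set.add(Preserve.GROUP);
        if (Boolean.TRUE.equals(permission)) set.add(Preserve.PERMISSION);
        return set;
    }

    public static void main(String[] args) {
        // prints "[USER, GROUP]"
        System.out.println(fromFlags(null, false, true, true, null));
    }
}
```

The set-based form keeps call sites shorter, while the Boolean form may be easier to populate from configuration properties.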

copy

public void copy(java.lang.String arg1,
                 java.lang.String arg2)
Basic copy operation, between a source and a destination using the defaults.

Parameters:
arg1 -
arg2 -

copy

public void copy(java.lang.String arg1,
                 java.lang.String arg2,
                 java.lang.String arg3)
Basic copy operation, between a source and a destination using the defaults.

Parameters:
arg1 -
arg2 -
arg3 -

copy

public void copy(java.lang.String... arguments)
Runs a distributed copy using command-line style arguments (specified as Strings).

Parameters:
arguments - copy arguments

setUser

public void setUser(java.lang.String user)
Sets the user impersonation (optional) for creating this utility. Should be used when running against a Kerberos-secured Hadoop cluster.

Parameters:
user - user/group information
