25. SFTP Adapters

Spring Integration provides support for file transfer operations via SFTP.

25.1 Introduction

The Secure File Transfer Protocol (SFTP) is a network protocol which allows you to transfer files between two computers on the Internet over any reliable stream.

The SFTP protocol requires a secure channel, such as SSH, as well as visibility to a client's identity throughout the SFTP session.

Spring Integration supports sending and receiving files over SFTP by providing three client-side endpoints: Inbound Channel Adapter, Outbound Channel Adapter, and Outbound Gateway. It also provides convenient namespace configuration to define these client components.

xmlns:int-sftp="http://www.springframework.org/schema/integration/sftp"
xsi:schemaLocation="http://www.springframework.org/schema/integration/sftp
	http://www.springframework.org/schema/integration/sftp/spring-integration-sftp.xsd"

25.2 SFTP Session Factory

Before configuring SFTP adapters, you must configure an SFTP Session Factory. You can configure the SFTP Session Factory via a regular bean definition:

<beans:bean id="sftpSessionFactory"
    class="org.springframework.integration.sftp.session.DefaultSftpSessionFactory">
    <beans:property name="host" value="localhost"/>
    <beans:property name="privateKey" value="classpath:META-INF/keys/sftpTest"/>
    <beans:property name="privateKeyPassphrase" value="springIntegration"/>
    <beans:property name="port" value="22"/>
    <beans:property name="user" value="kermit"/>
</beans:bean>

Every time an adapter requests a session object from its SessionFactory, a new SFTP session is created. Under the covers, the SFTP Session Factory relies on the JSch library to provide the SFTP capabilities.

However, Spring Integration also supports the caching of SFTP sessions. See Section 25.3, “SFTP Session Caching” for more information.

Note
If you experience connectivity problems and would like to trace Session creation and see which Sessions are polled, set the logger to TRACE level (e.g., log4j.category.org.springframework.integration.file=TRACE). See also Section 25.7, “SFTP/JSCH Logging”.

Now all you need to do is inject this SFTP Session Factory into your adapters.

Note
A more practical way to provide values for the SFTP Session Factory would be via Spring's property placeholder support.
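
As a minimal sketch of the property placeholder approach, the session factory shown earlier could be declared as follows. The file name sftp.properties and the sftp.* keys are hypothetical names chosen for illustration, and the context namespace is assumed to be declared on the enclosing beans element.

```xml
<!-- Loads key/value pairs from a properties file on the classpath (hypothetical file name) -->
<context:property-placeholder location="classpath:sftp.properties"/>

<bean id="sftpSessionFactory"
    class="org.springframework.integration.sftp.session.DefaultSftpSessionFactory">
    <!-- Each ${...} placeholder is resolved from sftp.properties at startup (keys are hypothetical) -->
    <property name="host" value="${sftp.host}"/>
    <property name="port" value="${sftp.port}"/>
    <property name="user" value="${sftp.user}"/>
    <property name="privateKey" value="${sftp.privateKey}"/>
    <property name="privateKeyPassphrase" value="${sftp.privateKeyPassphrase}"/>
</bean>
```

Keeping host names and credentials in a separate file allows the same bean definition to be reused across environments.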

25.2.1 Configuration Properties

Below you will find all properties that are exposed by the DefaultSftpSessionFactory.

clientVersion

Allows you to set the client version property. Its default depends on the underlying JSch version, but it will look like: SSH-2.0-JSCH-0.1.45

enableDaemonThread

If true, all threads will be daemon threads. If set to false, normal non-daemon threads will be used instead. This property will be set on the underlying JSch Session. There, this property will default to false, if not explicitly set.

host

The host name or address of the server you want to connect to. Mandatory.

hostKeyAlias

Sets the host key alias, used when comparing the host key to the known hosts list.

knownHosts

Specifies the filename that will be used to create a host key repository. The resulting file has the same format as OpenSSH's known_hosts file.

password

The password to authenticate against the remote host. If a password is not provided, then the privateKey property is mandatory.

port

The port over which the SFTP connection shall be established. If not specified, this value defaults to 22. If specified, this property must be a positive number.

privateKey

Allows you to set a Resource, which represents the location of the private key used for authenticating against the remote host. If the privateKey is not provided, then the password property is mandatory.

privateKeyPassphrase

The password for the private key. Optional.

proxy

Allows for specifying a JSch-based Proxy. If set, then the proxy object is used to create the connection to the remote host.

serverAliveCountMax

Specifies the number of server-alive messages, which will be sent without any reply from the server before disconnecting. If not set, this property defaults to 1.

serverAliveInterval

Sets the timeout interval (milliseconds) before a server alive message is sent, in case no message is received from the server.

sessionConfig

Using Properties, you can set additional configuration settings on the underlying JSch Session.

socketFactory

Allows you to pass in a SocketFactory. The socket factory is used to create a socket to the target host. When a proxy is used, the socket factory is passed to the proxy. By default plain TCP sockets are used.

timeout

The timeout property is used as the socket timeout parameter, as well as the default connection timeout. Defaults to 0, which means that no timeout will occur.

user

The remote user to use. Mandatory.
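
To illustrate how several of these properties combine, here is a sketch of a session factory that authenticates with a password instead of a private key. The host name, password, file path and numeric values are placeholders chosen for this example, and the timeout value is assumed to be in milliseconds.

```xml
<bean id="sftpSessionFactory"
    class="org.springframework.integration.sftp.session.DefaultSftpSessionFactory">
    <!-- Mandatory properties: host and user (placeholder values) -->
    <property name="host" value="sftp.example.org"/>
    <property name="user" value="kermit"/>
    <!-- Password authentication, so the privateKey property is not required -->
    <property name="password" value="secret"/>
    <!-- File used to create the host key repository, in OpenSSH known_hosts format -->
    <property name="knownHosts" value="/home/kermit/.ssh/known_hosts"/>
    <!-- Socket and connection timeout (assumed milliseconds); the default of 0 means no timeout -->
    <property name="timeout" value="30000"/>
    <!-- Send a server-alive message after 10000 ms without server traffic -->
    <property name="serverAliveInterval" value="10000"/>
    <!-- Disconnect after 3 unanswered server-alive messages (the default is 1) -->
    <property name="serverAliveCountMax" value="3"/>
</bean>
```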

25.3 SFTP Session Caching

As of version 2.1, session management for remote file adapters (e.g., FTP, SFTP) is more flexible. In previous versions, sessions were cached automatically by default. A cache-sessions attribute was available for disabling the automatic caching, but that solution did not provide a way to configure other session caching attributes. For example, one of the requested features was a limit on the number of sessions created, since a remote server may impose a limit on the number of client connections. To support that requirement and other configuration options, we decided to promote explicit definition of the CachingSessionFactory instance, which provides the sessionCacheSize and sessionWaitTimeout properties.

As its name suggests, the sessionCacheSize property controls how many active sessions are maintained in the cache (the default is unbounded). If the sessionCacheSize threshold has been reached, any attempt to acquire another session will block until either one of the cached sessions becomes available or the wait time for a session expires (the default wait time is Integer.MAX_VALUE). The sessionWaitTimeout property enables configuration of that value.

If you want your Sessions to be cached, simply configure your default Session Factory as described above and then wrap it in an instance of CachingSessionFactory where you may provide those additional properties.

<bean id="sftpSessionFactory"
    class="org.springframework.integration.sftp.session.DefaultSftpSessionFactory">
    <property name="host" value="localhost"/>
</bean>

<bean id="cachingSessionFactory"
    class="org.springframework.integration.file.remote.session.CachingSessionFactory">
    <constructor-arg ref="sftpSessionFactory"/>
    <property name="sessionCacheSize" value="10"/>
    <property name="sessionWaitTimeout" value="1000"/>
</bean>

In the above example you see a CachingSessionFactory created with the sessionCacheSize set to 10 and the sessionWaitTimeout set to 1 second (its value is in milliseconds).
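
For the cache to take effect, an adapter has to reference the caching factory rather than the wrapped one. The following sketch shows an inbound channel adapter wired this way (the element itself is described in Section 25.4). The adapter id, channel name and directories are illustrative only.

```xml
<!-- session-factory points at the CachingSessionFactory bean, not at sftpSessionFactory -->
<int-sftp:inbound-channel-adapter id="cachedSftpAdapter"
    session-factory="cachingSessionFactory"
    channel="requestChannel"
    remote-directory="/foo/bar"
    local-directory="file:target/foo">
    <int:poller fixed-rate="1000"/>
</int-sftp:inbound-channel-adapter>
```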

25.4 SFTP Inbound Channel Adapter

The SFTP Inbound Channel Adapter is a special listener that will connect to the server and listen for the remote directory events (e.g., new file created) at which point it will initiate a file transfer.

<int-sftp:inbound-channel-adapter id="sftpAdapterAutoCreate"
  			session-factory="sftpSessionFactory"
			channel="requestChannel"
			filename-pattern="*.txt"
			remote-directory="/foo/bar"
			local-directory="file:target/foo"
			auto-create-local-directory="true"
			local-filename-generator-expression="#this.toUpperCase() + '.a'"
			delete-remote-files="false">
		<int:poller fixed-rate="1000"/>
</int-sftp:inbound-channel-adapter>

As the configuration above shows, you configure the SFTP Inbound Channel Adapter via the inbound-channel-adapter element, providing values for attributes such as local-directory (where files are transferred to), remote-directory (the remote source directory that files are transferred from), and session-factory (a reference to the bean we configured earlier).

By default the transferred file will carry the same name as the original file. If you want to override this behavior you can set the local-filename-generator-expression attribute which allows you to provide a SpEL Expression to generate the name of the local file. Unlike outbound gateways and adapters where the root object of the SpEL Evaluation Context is a Message, this inbound adapter does not yet have the Message at the time of evaluation since that's what it ultimately generates with the transferred file as its payload. So, the root object of the SpEL Evaluation Context is the original name of the remote file (String).

Sometimes file filtering based on the simple pattern specified via filename-pattern attribute might not be sufficient. If this is the case, you can use the filename-regex attribute to specify a Regular Expression (e.g. filename-regex=".*\.test$"). And of course if you need complete control you can use the filter attribute to provide a reference to a custom implementation of the org.springframework.integration.file.filters.FileListFilter - a strategy interface for filtering a list of files.
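
As a sketch of the regular expression variant, the adapter below reuses the expression from the text and replaces filename-pattern with filename-regex. The adapter id, channel and directories are illustrative names only.

```xml
<!-- Only remote files whose names end in ".test" are transferred -->
<int-sftp:inbound-channel-adapter id="sftpRegexAdapter"
    session-factory="sftpSessionFactory"
    channel="requestChannel"
    filename-regex=".*\.test$"
    remote-directory="/foo/bar"
    local-directory="file:target/foo">
    <int:poller fixed-rate="1000"/>
</int-sftp:inbound-channel-adapter>
```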

Please refer to the schema for more detail on these attributes.

It is also important to understand that SFTP Inbound Channel Adapter is a Polling Consumer and therefore you must configure a poller (either a global default or a local sub-element). Once the file has been transferred to a local directory, a Message with java.io.File as its payload type will be generated and sent to the channel identified by the channel attribute.
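
The earlier examples use a local poller sub-element. As a sketch of the other option, a global default poller can be declared once and the adapter can then omit its own. The adapter id, channel and directories below are illustrative only.

```xml
<!-- Global default poller, used by polling endpoints that do not declare their own -->
<int:poller default="true" fixed-rate="1000"/>

<!-- No <int:poller/> sub-element here, so the global default applies -->
<int-sftp:inbound-channel-adapter id="sftpAdapterDefaultPoller"
    session-factory="sftpSessionFactory"
    channel="requestChannel"
    remote-directory="/foo/bar"
    local-directory="file:target/foo"/>
```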

More on File Filtering and Large Files

Sometimes a file that just appeared in the monitored (remote) directory is not complete. Typically such a file will be written with some temporary extension (e.g., foo.txt.writing) and then renamed after the writing process completes. As a user in most cases you are only interested in files that are complete and would like to filter only those files. To handle these scenarios, use filtering support provided via the filename-pattern, filename-regex and filter attributes. If you need a custom filter implementation simply include a reference in your adapter via the filter attribute.

<int-sftp:inbound-channel-adapter id="sftpInboundAdapter"
			channel="receiveChannel"
			session-factory="sftpSessionFactory"
			filter="customFilter"
			local-directory="file:/local-test-dir"
			remote-directory="/remote-test-dir">
		<int:poller fixed-rate="1000" max-messages-per-poll="10" task-executor="executor"/>
</int-sftp:inbound-channel-adapter>

<bean id="customFilter" class="org.foo.CustomFilter"/>

25.5 SFTP Outbound Channel Adapter

The SFTP Outbound Channel Adapter is a special MessageHandler that will connect to the remote directory and initiate a file transfer for every file it receives as the payload of an incoming Message. It also supports several representations of the file, so you are not limited to the File object. Similar to the FTP outbound adapter, the SFTP Outbound Channel Adapter supports the following payloads: 1) java.io.File - the actual file object; 2) byte[] - byte array that represents the file contents; 3) java.lang.String - text that represents the file contents.

<int-sftp:outbound-channel-adapter id="sftpOutboundAdapter"
				session-factory="sftpSessionFactory"
				channel="inputChannel"
				charset="UTF-8"
				remote-directory="foo/bar"
				remote-filename-generator-expression="payload.getName() + '-foo'"/>

As you can see from the configuration above you can configure the SFTP Outbound Channel Adapter via the outbound-channel-adapter element. Please refer to the schema for more detail on these attributes.

SpEL and the SFTP Outbound Adapter

As with many other components in Spring Integration, you can benefit from the Spring Expression Language (SpEL) support when configuring an SFTP Outbound Channel Adapter by specifying two attributes: remote-directory-expression and remote-filename-generator-expression (see above). The expression evaluation context will have the Message as its root object, thus allowing you to provide expressions which can dynamically compute the file name or the existing directory path based on the data in the Message (either from 'payload' or 'headers'). In the example above we are defining the remote-filename-generator-expression attribute with an expression value that computes the file name based on its original name while also appending a suffix: '-foo'.
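
The earlier example only shows remote-filename-generator-expression. As a sketch of the directory counterpart, the adapter below computes the remote directory from a message header. The header name targetDirectory is hypothetical, and payload.getName() assumes a java.io.File payload.

```xml
<!-- The remote directory is read from a (hypothetical) 'targetDirectory' header of each Message -->
<int-sftp:outbound-channel-adapter id="sftpDynamicOutboundAdapter"
    session-factory="sftpSessionFactory"
    channel="inputChannel"
    remote-directory-expression="headers['targetDirectory']"
    remote-filename-generator-expression="payload.getName() + '-foo'"/>
```

Because the Message is the root object of the evaluation context, both expressions can refer to 'payload' and 'headers' directly.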

Avoiding Partially Written Files

One of the common problems, when dealing with file transfers, is the possibility of processing a partial file - a file might appear in the file system before its transfer is actually complete.

To deal with this issue, Spring Integration SFTP adapters use a very common algorithm where files are transferred under a temporary name and then renamed once they are fully transferred.

By default, every file that is in the process of being transferred appears in the file system with the additional suffix .writing; this can be changed using the temporary-file-suffix attribute.

However, there may be situations where you don't want to use this technique (for example, if the server does not permit renaming files). For situations like this, you can disable this feature by setting use-temporary-file-name to false (default is true). When this attribute is false, the file is written with its final name and the consuming application will need some other mechanism to detect that the file is completely uploaded before accessing it.
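
The two attributes mentioned above could be used as in the following sketch, which shows two alternative adapter definitions. The adapter ids, channel names and the .tmp suffix are illustrative choices, not defaults.

```xml
<!-- Alternative 1: keep the temporary-name technique but replace the default .writing suffix -->
<int-sftp:outbound-channel-adapter id="sftpCustomSuffixAdapter"
    session-factory="sftpSessionFactory"
    channel="suffixInputChannel"
    remote-directory="foo/bar"
    temporary-file-suffix=".tmp"/>

<!-- Alternative 2: write each file directly under its final name, with no rename step -->
<int-sftp:outbound-channel-adapter id="sftpNoTempNameAdapter"
    session-factory="sftpSessionFactory"
    channel="directInputChannel"
    remote-directory="foo/bar"
    use-temporary-file-name="false"/>
```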

25.6 SFTP Outbound Gateway

The SFTP Outbound Gateway provides a limited set of commands to interact with a remote SFTP server.

Commands supported are:

  • ls (list files)
  • get (retrieve file)
  • mget (retrieve file(s))
  • rm (remove file(s))

ls

ls lists remote file(s) and supports the following options:

  • -1 - just retrieve a list of filenames, default is to retrieve a list of FileInfo objects.
  • -a - include all files (including those starting with '.')
  • -f - do not sort the list
  • -dirs - include directories (excluded by default)
  • -links - include symbolic links (excluded by default)

In addition, filename filtering is provided, in the same manner as the inbound-channel-adapter.

The message payload resulting from an ls operation is a list of file names, or a list of FileInfo objects. These objects provide information such as modified time, permissions etc.

The remote directory that the ls command acted on is provided in the file_remoteDirectory header.

get

get retrieves a remote file and supports the following option:

  • -P - preserve the timestamp of the remote file

The message payload resulting from a get operation is a File object representing the retrieved file.

The remote directory is provided in the file_remoteDirectory header, and the filename is provided in the file_remoteFile header.

mget

mget retrieves multiple remote files based on a pattern and supports the following option:

  • -x - Throw an exception if no files match the pattern (otherwise an empty list is returned)

The message payload resulting from an mget operation is a List<File> object - a List of File objects, each representing a retrieved file.

The remote directory is provided in the file_remoteDirectory header, and the pattern for the filenames is provided in the file_remoteFile header.

rm

The rm command has no options.

The message payload resulting from an rm operation is Boolean.TRUE if the remove was successful, Boolean.FALSE otherwise. The remote directory is provided in the file_remoteDirectory header, and the filename is provided in the file_remoteFile header.

In each case, the PATH that these commands act on is provided by the 'expression' property of the gateway. For the mget command, the expression might evaluate to '*', meaning retrieve all files, or 'somedirectory/*' etc.

Here is an example of a gateway configured for an ls command...

<int-sftp:outbound-gateway id="gateway1"
		session-factory="sftpSessionFactory"
		request-channel="inbound1"
		command="ls"
		command-options="-1"
		expression="payload"
		reply-channel="toSplitter"/>

The payload of the message sent to the toSplitter channel is a list of String objects containing the filename of each file. If the command-options was omitted, it would be a list of FileInfo objects. Options are provided space-delimited, e.g. command-options="-1 -dirs -links".
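
The other commands follow the same shape. As a sketch, the gateways below are configured for mget and rm, using only the attributes shown in the ls example. The ids and channel names are illustrative, and where retrieved files are stored locally is not covered here.

```xml
<!-- mget: the payload is a pattern such as 'somedirectory/*'; -x throws an exception if nothing matches -->
<int-sftp:outbound-gateway id="gateway2"
    session-factory="sftpSessionFactory"
    request-channel="inbound2"
    command="mget"
    command-options="-x"
    expression="payload"
    reply-channel="retrievedFiles"/>

<!-- rm: the payload is the path of the remote file to remove; rm has no options -->
<int-sftp:outbound-gateway id="gateway3"
    session-factory="sftpSessionFactory"
    request-channel="inbound3"
    command="rm"
    expression="payload"
    reply-channel="removeResults"/>
```

With this configuration, the reply payload on the retrievedFiles channel would be a List of File objects and the reply on removeResults would be a Boolean, as described above.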

25.7 SFTP/JSCH Logging

Since we use the JSch library (http://www.jcraft.com/jsch/) to provide SFTP support, at times you may require more information from the JSch API itself, especially if something is not working properly (e.g., authentication exceptions). Unfortunately, JSch does not use commons-logging but instead relies on custom implementations of its com.jcraft.jsch.Logger interface. As of Spring Integration 2.0.1, we have implemented this interface, so all you need to do to enable JSch logging is to configure your logger the way you usually do. For example, here is a valid configuration of a logger using Log4J:

log4j.category.com.jcraft.jsch=DEBUG