Inbound Channel Adapters: Controlling Remote File Fetching
You should consider two properties when configuring inbound channel adapters.
max-messages-per-poll, as with all pollers, can be used to limit the number of messages emitted on each poll (if more than the configured value are ready).
max-fetch-size (since version 5.0) can limit the number of files retrieved from the remote server at a time.
The following scenarios assume the starting state is an empty local directory:
- max-messages-per-poll=2 and max-fetch-size=1: The adapter fetches one file, emits it, fetches the next file, and emits it. Then it sleeps until the next poll.
- max-messages-per-poll=2 and max-fetch-size=2: The adapter fetches both files and then emits each one.
- max-messages-per-poll=2 and max-fetch-size=4: The adapter fetches up to four files (if available) and emits the first two (if there are at least two). The next two files are emitted on the next poll.
- max-messages-per-poll=2 and max-fetch-size not specified: The adapter fetches all remote files and emits the first two (if there are at least two). The subsequent files are emitted on subsequent polls (two at a time). When all are consumed, the remote fetch is attempted again to pick up any new files.
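The scenarios above can be reproduced with a small toy model. The following is only a sketch of the described behavior, not the adapter's implementation: the class `FetchPollSimulation`, its methods, and the convention that a negative `maxFetchSize` means "not specified" are all hypothetical and use no Spring Integration classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of the poll/fetch interaction; all names here are hypothetical.
class FetchPollSimulation {

    private final Deque<String> remote = new ArrayDeque<>(); // files on the server
    private final Deque<String> local = new ArrayDeque<>();  // fetched, not yet emitted
    private final int maxMessagesPerPoll;
    private final int maxFetchSize; // negative stands for "not specified" in this sketch
    private int fetchCalls;

    FetchPollSimulation(List<String> remoteFiles, int maxMessagesPerPoll, int maxFetchSize) {
        this.remote.addAll(remoteFiles);
        this.maxMessagesPerPoll = maxMessagesPerPoll;
        this.maxFetchSize = maxFetchSize;
    }

    // One poll cycle: emits at most maxMessagesPerPoll files and
    // fetches from the remote only when the local backlog is empty.
    List<String> poll() {
        List<String> emitted = new ArrayList<>();
        while (emitted.size() < maxMessagesPerPoll) {
            if (local.isEmpty()) {
                fetch();
            }
            String next = local.poll();
            if (next == null) {
                break; // nothing left locally or remotely
            }
            emitted.add(next);
        }
        return emitted;
    }

    // Moves up to maxFetchSize files (or all of them) from remote to local.
    private void fetch() {
        fetchCalls++;
        int limit = maxFetchSize < 0 ? remote.size() : Math.min(maxFetchSize, remote.size());
        for (int i = 0; i < limit; i++) {
            local.add(remote.poll());
        }
    }

    int fetchCalls() {
        return fetchCalls;
    }

    int localBacklog() {
        return local.size();
    }

    public static void main(String[] args) {
        FetchPollSimulation sim = new FetchPollSimulation(List.of("a", "b", "c", "d", "e"), 2, 4);
        for (int i = 1; i <= 3; i++) {
            System.out.println("poll " + i + " emitted " + sim.poll());
        }
    }
}
```

With five remote files, `maxMessagesPerPoll=2`, and `maxFetchSize=4`, the first poll in this model fetches four files and emits two, the second poll emits the remaining two without contacting the remote, and the third poll fetches and emits the last file.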
When you deploy multiple instances of an application, we recommend setting a small max-fetch-size to avoid one instance “grabbing” all the files and starving other instances.
Another use for max-fetch-size is when you want to stop fetching remote files but continue to process files that have already been fetched.
Setting the maxFetchSize property on the MessageSource (programmatically, via JMX, or via a control bus) effectively stops the adapter from fetching more files but lets the poller continue to emit messages for files that have previously been fetched.
If the poller is active when the property is changed, the change takes effect on the next poll.
Starting with version 5.1, the synchronizer can be provided with a Comparator<?>.
This is useful when restricting the number of files fetched with maxFetchSize.
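To illustrate why ordering matters once the fetch is limited, here is a minimal sketch that sorts a listing and then keeps only the first `maxFetchSize` entries. It assumes the comparator is applied before the slice; `FetchOrderSketch`, `RemoteFile`, and `selectForFetch` are hypothetical names and not part of the framework.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Illustration only: shows how an ordering decides which files a limited fetch picks.
class FetchOrderSketch {

    // Hypothetical stand-in for a remote file entry.
    static final class RemoteFile {
        private final String name;
        private final long modified;

        RemoteFile(String name, long modified) {
            this.name = name;
            this.modified = modified;
        }

        String name() {
            return name;
        }

        long modified() {
            return modified;
        }
    }

    // Sorts the listing with the given comparator, then keeps at most maxFetchSize names.
    static List<String> selectForFetch(List<RemoteFile> listing,
            Comparator<RemoteFile> order, int maxFetchSize) {
        return listing.stream()
                .sorted(order)
                .limit(maxFetchSize)
                .map(RemoteFile::name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<RemoteFile> listing = List.of(
                new RemoteFile("c.txt", 300L),
                new RemoteFile("a.txt", 100L),
                new RemoteFile("b.txt", 200L));
        // Oldest first, limited to two files: prints [a.txt, b.txt]
        System.out.println(selectForFetch(listing,
                Comparator.comparingLong(RemoteFile::modified), 2));
    }
}
```

Without an explicit ordering, which files fall inside the limit depends on the order in which the server happens to return them.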
Starting with version 6.4, the AbstractRemoteFileStreamingMessageSource provides a convenient clearFetchedCache() API to remove cached references to remote files that have not been processed yet.
Such references stay in the cache when the polling configuration does not allow all of them to be processed in one cycle, and the target SessionFactory might change between polling cycles, for example via RotatingServerAdvice.
Starting with version 7.0, the AbstractInboundFileSynchronizer caches the filtered Session.list(remoteDirectory) result that remains after maxFetchSize slicing is applied.
The logic of the AbstractInboundFileSynchronizer.transferFilesFromRemoteToLocal() method is as follows:
- If maxFetchSize > 0, a lock is acquired against the remoteDirectory to avoid race conditions between threads while the cache is being worked on. The performance degradation is minimal, since all later synchronizations deal only with the in-memory cached leftover.
- If there is no cache entry for the remoteDirectory, Session.list(remoteDirectory) is called and all returned remote files are filtered.
- The filtered result is then sliced to the maxFetchSize.
- These file entries are then transferred to the local directory.
- The rest of the filtered remote files are cached for later synchronizations.
- If there is a cache entry for the remoteDirectory, that list is sliced to the maxFetchSize and iterated for the transfer to the local directory.
- If one of the transfers fails, the filter is reset from the failed remote file. The cache is also evicted; therefore, the next synchronization starts from a clean state.
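The steps above can be approximated with the following simplified sketch. It is not the framework's code: `CachedListingSketch` and its collaborators are hypothetical, a single `synchronized` method stands in for the per-directory lock, only `maxFetchSize > 0` is modelled, and the filter is stateless, so the filter reset on failure is not shown.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Simplified model of "list, filter, slice, transfer, cache the leftover".
class CachedListingSketch {

    private final Map<String, List<String>> cache = new HashMap<>();
    private final Function<String, List<String>> lister; // stands in for Session.list(remoteDirectory)
    private final Predicate<String> filter;              // stands in for the file list filter
    private final Consumer<String> copier;               // stands in for the copy to the local directory
    private int listCalls;

    CachedListingSketch(Function<String, List<String>> lister,
            Predicate<String> filter, Consumer<String> copier) {
        this.lister = lister;
        this.filter = filter;
        this.copier = copier;
    }

    // Returns the names transferred in this synchronization.
    synchronized List<String> transfer(String remoteDirectory, int maxFetchSize) {
        if (maxFetchSize <= 0) {
            throw new IllegalArgumentException("this sketch only models maxFetchSize > 0");
        }
        // Taking the entry out up front means a failed copy below leaves the cache evicted.
        List<String> candidates = cache.remove(remoteDirectory);
        if (candidates == null) {
            listCalls++;
            candidates = lister.apply(remoteDirectory).stream()
                    .filter(filter)
                    .collect(Collectors.toList());
        }
        int end = Math.min(maxFetchSize, candidates.size());
        List<String> slice = new ArrayList<>(candidates.subList(0, end));
        List<String> leftover = new ArrayList<>(candidates.subList(end, candidates.size()));
        slice.forEach(copier); // an exception here skips the cache update
        if (!leftover.isEmpty()) {
            cache.put(remoteDirectory, leftover);
        }
        return slice;
    }

    int listCalls() {
        return listCalls;
    }
}
```

In this model the remote listing is requested again only once the cached leftover has been drained or a transfer has failed. A real filter that remembers already-seen files would additionally keep previously transferred files out of the next listing.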
Also see the general SFTP Inbound Channel Adapter chapter for information about FileListFilter configuration.