11. Feed Adapter

Spring Integration provides support for Syndication via Feed Adapters

11.1 Introduction

Web syndication is a form of publishing material such as news stories, press releases, blog posts, and other items typically available on a website but also made available in a feed format such as RSS or ATOM.

Spring integration provides support for Web Syndication via its 'feed' adapter and provides convenient namespace-based configuration for it. To configure the 'feed' namespace, include the following elements within the headers of your XML configuration file:


11.2 Feed Inbound Channel Adapter

The only adapter that is really needed to provide support for retrieving feeds is an inbound channel adapter. This allows you to subscribe to a particular URL. Below is an example configuration:

<int-feed:inbound-channel-adapter id="feedAdapter" 
	<int:poller fixed-rate="10000" max-messages-per-poll="100" />

In the above configuration, we are subscribing to a URL identified by the url attribute.

As news items are retrieved they will be converted to Messages and sent to a channel identified by the channel attribute. The payload of each message will be a com.sun.syndication.feed.synd.SyndEntry instance. That encapsulates various data about a news item (content, dates, authors, etc.).

You can also see that the Inbound Feed Channel Adapter is a Polling Consumer. That means you have to provide a poller configuration. However, one important thing you must understand with regard to Feeds is that its inner-workings are slightly different then most other poling consumers. When an Inbound Feed adapter is started, it does the first poll and receives a com.sun.syndication.feed.synd.SyndEntryFeed instance. That is an object that contains multiple SyndEntry objects. Each entry is stored in the local entry queue and is released based on the value in the max-messages-per-poll attribute such that each Message will contain a single entry. If during retrieval of the entries from the entry queue the queue had become empty, the adapter will attempt to update the Feed thereby populating the queue with more entries (SyndEntry instances) if available. Otherwise the next attempt to poll for a feed will be determined by the trigger of the poller (e.g., every 10 seconds in the above configuration).

Duplicate Entries

Polling for a Feed might result in entries that have already been processed ("I already read that news item, why are you showing it to me again?"). Spring Integration provides a convenient mechanism to eliminate the need to worry about duplicate entries. Each feed entry will have a published date field. Every time a new Message is generated and sent, Spring Integration will store the value of the latest published date in an instance of the org.springframework.integration.store.MetadataStore strategy. The MetadataStore interface is designed to store various types of generic meta-data (e.g., published date of the last feed entry that has been processed) to help components such as this Feed adapter deal with duplicates.

The default rule for locating this metadata store is as follows: Spring Integration will look for a bean of type org.springframework.integration.store.MetadataStore in the ApplicationContext. If one is found then it will be used, otherwise it will create a new instance of SimpleMetadataStore which is an in-memory implementation that will only persist metadata within the lifecycle of the currently running Application Context. This means that upon restart you may end up with duplicate entries. If you need to persist metadata between Application Context restarts, you may use the PropertiesPersistingMetadataStore which is backed by a properties file and a properties-persister. Alternatively, you could provide your own implementation of the MetadataStore interface (e.g. JdbcMetadataStore) and configure it as bean in the Application Context.

<bean id="metadataStore"