12. Apache Kafka Binder

12. Apache Kafka Binder
Prev	Part II. Binder Implementations	Next

12.1 Usage

For using the Apache Kafka binder, you just need to add it to your Spring Cloud Stream application, using the following Maven coordinates:

<dependency>
  <groupId>org.springframework.cloud</groupId>
  <artifactId>spring-cloud-stream-binder-kafka</artifactId>
</dependency>

Alternatively, you can also use the Spring Cloud Stream Kafka Starter.

<dependency>
  <groupId>org.springframework.cloud</groupId>
  <artifactId>spring-cloud-starter-stream-kafka</artifactId>
</dependency>

12.2 Apache Kafka Binder Overview

A simplified diagram of how the Apache Kafka binder operates can be seen below.

Figure 12.1. Kafka Binder

The Apache Kafka Binder implementation maps each destination to an Apache Kafka topic. The consumer group maps directly to the same Apache Kafka concept. Partitioning also maps directly to Apache Kafka partitions as well.

12.3 Configuration Options

This section contains the configuration options used by the Apache Kafka binder.

For common configuration options and properties pertaining to binder, refer to the core docs.

12.3.1 Kafka Binder Properties

spring.cloud.stream.kafka.binder.brokers

A list of brokers to which the Kafka binder will connect.

Default: localhost.

spring.cloud.stream.kafka.binder.defaultBrokerPort

brokers allows hosts specified with or without port information (e.g., host1,host2:port2). This sets the default port when no port is configured in the broker list.

Default: 9092.

spring.cloud.stream.kafka.binder.zkNodes

A list of ZooKeeper nodes to which the Kafka binder can connect.

Default: localhost.

spring.cloud.stream.kafka.binder.defaultZkPort

zkNodes allows hosts specified with or without port information (e.g., host1,host2:port2). This sets the default port when no port is configured in the node list.

Default: 2181.

spring.cloud.stream.kafka.binder.configuration

Key/Value map of client properties (both producers and consumer) passed to all clients created by the binder. Due to the fact that these properties will be used by both producers and consumers, usage should be restricted to common properties, especially security settings.

Default: Empty map.

spring.cloud.stream.kafka.binder.headers

The list of custom headers that will be transported by the binder.

Default: empty.

spring.cloud.stream.kafka.binder.offsetUpdateTimeWindow

The frequency, in milliseconds, with which offsets are saved. Ignored if 0.

Default: 10000.

spring.cloud.stream.kafka.binder.offsetUpdateCount

The frequency, in number of updates, which which consumed offsets are persisted. Ignored if 0. Mutually exclusive with offsetUpdateTimeWindow.

Default: 0.

spring.cloud.stream.kafka.binder.requiredAcks

The number of required acks on the broker.

Default: 1.

spring.cloud.stream.kafka.binder.minPartitionCount

Effective only if autoCreateTopics or autoAddPartitions is set. The global minimum number of partitions that the binder will configure on topics on which it produces/consumes data. It can be superseded by the partitionCount setting of the producer or by the value of instanceCount * concurrency settings of the producer (if either is larger).

Default: 1.

spring.cloud.stream.kafka.binder.replicationFactor

The replication factor of auto-created topics if autoCreateTopics is active.

Default: 1.

spring.cloud.stream.kafka.binder.autoCreateTopics

If set to true, the binder will create new topics automatically. If set to false, the binder will rely on the topics being already configured. In the latter case, if the topics do not exist, the binder will fail to start. Of note, this setting is independent of the auto.topic.create.enable setting of the broker and it does not influence it: if the server is set to auto-create topics, they may be created as part of the metadata retrieval request, with default broker settings.

Default: true.

spring.cloud.stream.kafka.binder.autoAddPartitions

If set to true, the binder will create add new partitions if required. If set to false, the binder will rely on the partition size of the topic being already configured. If the partition count of the target topic is smaller than the expected value, the binder will fail to start.

Default: false.

spring.cloud.stream.kafka.binder.socketBufferSize

Size (in bytes) of the socket buffer to be used by the Kafka consumers.

Default: 2097152.

12.3.2 Kafka Consumer Properties

The following properties are available for Kafka consumers only and must be prefixed with spring.cloud.stream.kafka.bindings.<channelName>.consumer..

autoRebalanceEnabled

When true, topic partitions will be automatically rebalanced between the members of a consumer group. When false, each consumer will be assigned a fixed set of partitions based on spring.cloud.stream.instanceCount and spring.cloud.stream.instanceIndex. This requires both spring.cloud.stream.instanceCount and spring.cloud.stream.instanceIndex properties to be set appropriately on each launched instance. The property spring.cloud.stream.instanceCount must typically be greater than 1 in this case.

Default: true.

autoCommitOffset

Whether to autocommit offsets when a message has been processed. If set to false, an Acknowledgment header will be available in the message headers for late acknowledgment.

Default: true.

autoCommitOnError

Effective only if autoCommitOffset is set to true. If set to false it suppresses auto-commits for messages that result in errors, and will commit only for successful messages, allows a stream to automatically replay from the last successfully processed message, in case of persistent failures. If set to true, it will always auto-commit (if auto-commit is enabled). If not set (default), it effectively has the same value as enableDlq, auto-committing erroneous messages if they are sent to a DLQ, and not committing them otherwise.

Default: not set.

recoveryInterval

The interval between connection recovery attempts, in milliseconds.

Default: 5000.

resetOffsets

Whether to reset offsets on the consumer to the value provided by startOffset.

Default: false.

startOffset

The starting offset for new groups, or when resetOffsets is true. Allowed values: earliest, latest.

Default: null (equivalent to earliest).

enableDlq

When set to true, it will send enable DLQ behavior for the consumer. Messages that result in errors will be forwarded to a topic named error.<destination>.<group>. This provides an alternative option to the more common Kafka replay scenario for the case when the number of errors is relatively small and replaying the entire original topic may be too cumbersome.

Default: false.

configuration

Map with a key/value pair containing generic Kafka consumer properties.

Default: Empty map.

12.3.3 Kafka Producer Properties

The following properties are available for Kafka producers only and must be prefixed with spring.cloud.stream.kafka.bindings.<channelName>.producer..

bufferSize

Upper limit, in bytes, of how much data the Kafka producer will attempt to batch before sending.

Default: 16384.

sync

Whether the producer is synchronous.

Default: false.

batchTimeout

How long the producer will wait before sending in order to allow more messages to accumulate in the same batch. (Normally the producer does not wait at all, and simply sends all the messages that accumulated while the previous send was in progress.) A non-zero value may increase throughput at the expense of latency.

Default: 0.

configuration

Map with a key/value pair containing generic Kafka producer properties.

Default: Empty map.

Note

The Kafka binder will use the partitionCount setting of the producer as a hint to create a topic with the given partition count (in conjunction with the minPartitionCount, the maximum of the two being the value being used). Exercise caution when configuring both minPartitionCount for a binder and partitionCount for an application, as the larger value will be used. If a topic already exists with a smaller partition count and autoAddPartitions is disabled (the default), then the binder will fail to start. If a topic already exists with a smaller partition count and autoAddPartitions is enabled, new partitions will be added. If a topic already exists with a larger number of partitions than the maximum of (minPartitionCount and partitionCount), the existing partition count will be used.

12.3.4 Usage examples

In this section, we illustrate the use of the above properties for specific scenarios.

Example: security configuration

Apache Kafka 0.9 supports secure connections between client and brokers. To take advantage of this feature, follow the guidelines in the Apache Kafka Documentation as well as the Kafka 0.9 security guidelines from the Confluent documentation. Use the spring.cloud.stream.kafka.binder.configuration option to set security properties for all clients created by the binder.

For example, for setting security.protocol to SASL_SSL, set:

spring.cloud.stream.kafka.binder.configuration.security.protocol=SASL_SSL

All the other security properties can be set in a similar manner.

When using Kerberos, follow the instructions in the reference documentation for creating and referencing the JAAS configuration. At the time of this release, the JAAS, and (optionally) krb5 file locations must be set for Spring Cloud Stream applications by using system properties. Here is an example of launching a Spring Cloud Stream application with SASL and Kerberos.

 java -Djava.security.auth.login.config=/path.to/kafka_client_jaas.conf -jar log.jar \\
   --spring.cloud.stream.kafka.binder.brokers=secure.server:9092 \\
   --spring.cloud.stream.kafka.binder.zkNodes=secure.zookeeper:2181 \\
   --spring.cloud.stream.bindings.input.destination=stream.ticktock \\
   --spring.cloud.stream.kafka.binder.clientConfiguration.security.protocol=SASL_PLAINTEXT

	Note
	Exercise caution when using the `autoCreateTopics` and `autoAddPartitions` if using Kerberos. Usually applications may use principals that do not have administrative rights in Kafka and Zookeeper, and relying on Spring Cloud Stream to create/modify topics may fail. In secure environments, we strongly recommend creating topics and managing ACLs administratively using Kafka tooling.

Using the binder with Apache Kafka 0.10

The binder also supports connecting to Kafka 0.10 brokers. In order to support this, when you create the project that contains your application, include spring-cloud-starter-stream-kafka as you normally would do for 0.9 based applications. Then add these dependencies at the top of the <dependencies> section in the pom.xml file to override the Apache Kafka, Spring Kafka, and Spring Integration Kafka with 0.10-compatible versions as in the following example:

<dependency>
  <groupId>org.springframework.kafka</groupId>
  <artifactId>spring-kafka</artifactId>
  <version>1.1.0.M1</version>
</dependency>
<dependency>
  <groupId>org.springframework.integration</groupId>
  <artifactId>spring-integration-kafka</artifactId>
  <version>2.0.1.RELEASE</version>
</dependency>
<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka_2.11</artifactId>
  <version>0.10.0.0</version>
  <exclusions>
    <exclusion>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
    </exclusion>
  </exclusions>
</dependency>

	Note
	The versions above are provided only for the sake of the example. For best results, we recommend using the most recent 0.10-compatible versions of the projects.