Confluent Platform 4.0.0 Release Notes ¶

This is a major release of the Confluent Platform that provides Confluent users with Apache Kafka 1.0.0, the latest stable version of Kafka.

Confluent Platform users are encouraged to upgrade to CP 4.0.0. The technical details of this release are summarized below.

Table of Contents

The 4.0.0 release of Confluent Control Center is backward compatible and fully tested with releases of Confluent Platform 3.2.0 and higher and Apache Kafka releases 0.10.2.0 and higher. We encourage you to install and use Control Center before upgrading Apache Kafka so you can use Control Center to monitor broker health and application availability when performing the rolling upgrade of the system.

Control Center: Improvements ¶

Control Center can now monitor significantly larger clusters. Our internal tests show that we can now successfully monitor Kafka clusters with 1 million partitions, up from 30,000 partitions in previous releases.
Control Center now shows a highly visible notification bar when the Kafka cluster it is monitoring is not available.
Bug fixes including smaller metric messages, improved stability, handling misconfigured clients and misformated events and large number of UI improvements.

Schema Registry ACL Plug-In ¶

We have added a security plugin module to Schema Registry that provides authentication and authorization of Schema Registry operations. Please refer to Schema Registry Security Plugins for more details.

JMS Client ¶

This release includes a substantial upgrade to the JMS Client:

JMS Message Headers and JMS Message Properties are now stored in Kafka message headers. This feature was added to the Kafka message format in version 0.11.0 (Confluent Platform version 3.3.0). The JMS Client now requires brokers of this version or higher.
Messages are no longer encoded in Avro format and the dependency on Schema Registry has been removed.
The group.id configuration property is removed and replaced by the optional consumer.group.id property. By default, the group.id of the underlying Kafka consumer instance is now set automatically depending on the destination type.
Session CLIENT_ACKNOWLEDGE mode is implemented.
KafkaQueueConnection and KafkaTopicConnection classes and associated functionality is implemented.
Various bug fixes and more complete coverage of the JMS Specification.

Open Source Features ¶

JBOD Disk Failure Handling ¶

Apache Kafka can now tolerate disk failures better. With KIP-112, Kafka brokers handle JBOD disk failures more gracefully. A single disk failure in a JBOD broker will not bring the entire broker down; rather, the broker will continue serving any log files that remain on functioning disks.

Metrics Improvements ¶

Operating Apache Kafka at scale requires that the system remains observable, and to make that easier, several new metrics have been added to improve health monitoring. See KIP-164, KIP-168, KIP-187 and KIP-188 for details.

Kafka Connect ¶

Kafka Connect includes a number of features and improvements. First, Kafka Connect workers now have metrics for connectors, tasks, rebalances, and worker information. The metrics can be accessed through JMX and with custom metrics reporters, just like with Kafka producers and consumers. Second, the plugin.path now supports relative symlinks. Finally, Connect will now use a separate Converter for every task, which allows for more flexibility and performance tuning.

Kafka Connect also includes a number of bug fixes. Connect classloader isolation now correctly finds JDBC drivers when they are included in a plugin. Also, topic creation now works correctly when Connect uses pre-0.11 Kafka clusters by delegating to auto-topic creation. Transformations now properly handle tombstone events with non-null keys and null values. And, Connect will commit the original topic-partition offsets even when transformation mutate the topic-partitions.

Elasticsearch Connector ¶

The Elasticsearch connector now has the ability to drop invalid messages when needed.

HDFS Connector ¶

The HDFS connector aligns its partitioners and formatters with the S3 connector. In particular, the timestamp partitioners are more capable and can extract the timestamp from the records rather than just using wallclock time to extract a timestamp for the records. Also, the JSON formatter can now be used with the HDFS connector.

Existing connector configurations that use the older partitioners and formatters are still valid and will continue to work. However, the older partitioners and formatters are deprecated and will be removed in a future release, so users are encouraged to begin evaluating the new formatters and partitioners and converting existing connector configurations.

Several fixes are also included and a number of dependent libraries were upgraded. All configurations that specify a class now have recommenders, making it much easier to configure HDFS connectors using C3.

S3 Connector ¶

The S3 connector can now use S3 proxies, optionally write records using a provided ACL header, can write records at the root of the S3 bucket, and will retry part uploads when needed. A new ByteArrayFormat formatter makes it very easy to store the binary form of the records in S3. All configurations that specify a class now have recommenders, making it much easier to configure S3 connectors using C3. And finally, the connector has several other minor fixes and logging improvements.

JDBC Source Connector ¶

The JDBC source connector has a number of small improvements. First, the connector now retries to establish JDBC connections to the database when needed. Second, the JDBC source connector is now able to get the current timestamp when using DB2 UDB on AS/400.

Schema Registry ¶

Schema Registry now supports master election via the Kafka Coordinator protocol. This enables Schema Registry to be deployed without connectivity to Zookeeper. Please refer to Schema Registry Design for more details.

Schema Registry now supports https for forwarding write requests from slave to master node.

KafkaAvroSerializers can now be configured to disable schema registration during serialization.

Confluent CLI ¶

Adds support for setting ACLs for services, and properly sets the plugin.path when the Connect distributed worker configuration uses a relative path.

Kafka Streams: Extended Punctuation Semantics ¶

Applications that use the Processor API in Kafka Streams can now customize their punctuation methods based on either stream-time or wall-clock-time, to make the punctuation either data-driven or wall-clock time driven. For more details about this change, please read the Kafka Streams Developer Guide.

Kafka Streams: Deserialization Error Handling ¶

Managing records that are corrupted or malformatted is a common scenario in stream processing. You can now customize the respective error handling logic of your applications. For example, your applications can now decide wether to skip over corrupted records. In previous versions, corrupted records triggered deserialization exceptions that would always cause application instances to fail-fast and terminate. Please read the Streams FAQ section on handling corrupted records for more details.

Kafka Streams: State Restoration Listener ¶

You can use the new StateRestoreListener interface to monitor the state restoration progress during application initialization and re-initialization. For each state store, you can receive notifications when the restoration process has started, ended, and when a list of changelog records has been applied. Custom state store engines can also inject their own processing logic during the restoration process. For more details about this change, please read KIP-167.

Apache Kafka 1.0.0-cp1 ¶

Security Failure Handling ¶

Authentication error handling has been improved to provide feedback to clients. Previously, some authentication error conditions were indistinguishable from broker failures and were not logged in a clear way. Authentication failures are now handled as non-retriable errors that are logged and propagated to clients.

Java 9 Support ¶

Apache Kafka now supports Java 9, enabling significantly faster TLS and CRC32C implementations. Over-the-wire encryption will be faster now, which will keep Kafka fast and compute costs low when encryption is enabled.

Note that Java 9 support has been enabled only for Apache Kafka and other components of Confluent Platform do not support Java 9 yet.

Idempotent Producer Performance ¶

Throughput of idempotent producers (which are also used for exactly-once processing) have been significantly improved by relaxing the restriction on max.in.flight.requests.per.connection being limited to one. Now this can be as large as five, relaxing the constraint on maximum throughput.

More Bug Fixes and Improvements ¶

29 new features / config changes has been added along with over 250 bug fixes and performance improvements. For a full list of changes in this release of Apache Kafka, see the 1.0.0 Release Notes.

Previous releases ¶

Confluent Platform 3.3.1 Release Notes ¶

This is a bugfix release of the Confluent Platform that provides Confluent users with Apache Kafka 0.11.0.1, the latest stable version of Kafka and additional bug fixes.

Confluent Platform users are encouraged to upgrade to CP 3.3.1 as it includes important bug fixes. The technical details of this release are summarized below.

Table of Contents

Contents

Confluent Platform 4.0.0 Release Notes

local:

depth:	2

Highlights ¶

Enterprise Features ¶

Control Center ¶

Fixed an issue that could cause long request times if a cluster had brokers with noncontiguous ids
Additional logging during startup
Fixed an issue that could cause Control Center to show blank graphs when data was present
Set and enforce TimestampType for internal Control Center topics at startup
Fixed an issue that caused us to link to an invalid consumer group on Streams Monitoring
Fixed an issue where Broker Monitoring could check for data last update time on wrong cluster
Allow overriding ConfluentMetricsReporter default producer settings
Allow using a file for confluent.license config
Fixed an issue that resulted in Offline and Under Replicated Partitions data to be swapped
Fixed an issue that could cause Control Center to shutdown if it lost connection to Kafka

Open Source Features ¶

Elasticsearch Sink Connector ¶

Added a feature flag to control how string-keyed map entries within record values are serialized into JSON. Prior to the 3.3.0, each map entry was written to JSON using an expanded pair of fields: one for the entry’s key, and one for the entry’s value. The 3.3.0 release included a change that used a single JSON field for each string-keyed entry, but this was not backward compatible. This latest behavior is still the default, but a new feature flag is included so that the older behavior can be used.

When the Elasticsearch connector is not able to communicate with the Elasticsearch backend service, the connector now does not fail after 3 attempts but instead waits until communication can be restored. The connector uses expontential backoff with jitter (randomization) to reduce the chance of a thundering herd problem when many tasks all using the same backend lose communication all at once. This improves the overall stability of the connector.

HDFS Sink Connector ¶

Several bugs were addressed, and new recommenders were added for properties that specify a class. This improves the experience in C3.

S3 Sink Connector ¶

Addressed bugs and added new recommenders for properties that specify a class, improving the experience in C3. Also, configurations can now specify URLs and regions together. Finally, the connector’s Avro format now correctly writes primitive values.

JDBC Source Connector ¶

The connector’s recommender for table names is faster and more efficient, will cache results for a short period of time, and will properly close the JDBC connection.

librdkafka 0.11.1 ¶

These changes also apply to librdkafka-based clients, such as confluent-kafka-go, confluent-kafka-python and confluent-kafka-dotnet.

#1489: Fix OpenSSL instability on Windows - the bug resulted in SSL connections being torn down for no apparent reason.
#1384: Fix Consumer performance regression in v0.11.0.
#1386: Fix Consumer Message Null Key/Value regression in v0.11.0.

Full v0.11.1 release notes.

The default Producer batch linger time linger.ms was changed from 1000ms to 0ms in the previous librdkafka release (v0.11.0), improving latency at the cost of performance. If performance is a concern the suggestion is to set linger.ms to 50ms.

Kafka Streams: Improved Operability due to Rebalancing Optimizations ¶

Kafka Streams rebalance latency has been further improved by delaying the state restoration process after task initialization has been done. The rebalance callback completes much faster now. In prior versions, the slower rebalance callback was often causing the broker-side group coordinator to believe that application instances failed, which resulted in unnecessary rebalances of the application. The operability of Kafka Streams applications has been greatly improved by this change:

Applications start, stop, and restart faster (notably during rolling restarts and rolling upgrades).
Failover behavior of applications is faster and more efficient.
Elastic scalability of applications is faster and more efficient, i.e., when scaling out (adding capacity) or scaling in (removing capacity) during live operations.
Less network traffic between and less load on application instances (client side) and Kafka clusters (server side) when rebalancing happens.

Apache Kafka 0.11.0.1-cp1 ¶

KAFKA-6116: Major performance issue due to excessive logging during leader election
KAFKA-6026: Fix for indefinite wait in KafkaFutureImpl
KAFKA-6042: Avoid deadlock between two groups with delayed operations
KAFKA-6003: Accept appends on replicas and when rebuilding the log unconditionally
KAFKA-5970: Use ReentrantLock for delayed operation lock to avoid blocking
KAFKA-5152: Delay the restoration process after the rebalance in Kafka Streams
KAFKA-6087: Scanning plugin.path needs to support relative symlinks
KAFKA-5567: With transformations that mutate the topic-partition committing offsets should to refer to the original topic-partition
KAFKA-6134: High memory usage on controller during partition reassignment
KAFKA-5986: Streams State Restoration never completes when logging is disabled
KAFKA-5634: Do not allow segment deletion beyond high watermark
KAFKA-6030: Fix Integer overflow in cleanable ratio computation
KAFKA-5747: Producer snapshot loading should cover schema errors
KAFKA-5752: Update index files correctly during async delete
KAFKA-5610: WriteTxnMarker handler should return UNKNOWN_TOPIC_OR_PARTITION if replica is not available
KAFKA-5431: cleanSegments should not set length for cleanable segment files
KAFKA-5611: AbstractCoordinator should handle wakeup raised from onJoinComplete
KAFKA-5417: Fix Selector’s handling of some failures during prepare
KAFKA-5700: Producer should not drop header information when splitting batches
KAFKA-5630: Consumer should block on corrupt records and keep throwing an exception
KAFKA-5512: Awake the heartbeat thread when timetoNextHeartbeat is equal to 0
KAFKA-5659: Fix error handling, efficiency issue in AdminClient#describeConfigs
KAFKA-5658: Fix AdminClient request timeout handling bug resulting in continual BrokerNotAvailableExceptions
KAFKA-5167: Fix the LockException issue in Kafka Streams task assignment process during rebalance
KAFKA-5797: StoreChangelogReader should be resilient to broker-side metadata not available
KAFKA-5717: Fix null values in state store put operations as delete markers in Kafka Streams
KAFKA-5659: Fix error handling, efficiency issue in AdminClient#describeConfigs
KAFKA-5818: Fix Kafka Streams finite state machine transition diagram
KAFKA-5704: Auto topic creation causes failure with older clusters
KAFKA-5731: Connect WorkerSinkTask out of order offset commit can lead to inconsistent state
KAFKA-5756: Synchronization issue on flush

Confluent Platform 3.3.0 Release Notes ¶

This is a major release of the Confluent Platform that provides Confluent users with Apache Kafka 0.11.0.0, the latest stable version of Kafka. In addition, this release includes a Confluent CLI, Rest Proxy ACLs Plug-in, Control Center interceptors for our non-Java clients (Python, Go and C#). This release also includes major improvements to Confluent Control Center.

The technical details of this release are summarized below.

Highlights ¶

Enterprise Features ¶

REST Proxy ACL Plug-In ¶

We have added a security plugin module to REST Proxy that allows credentials of authenticated incoming requests to be propagated to Kafka and authorization enforced via Kafka ACL’s.

Control Center Interceptors for Non-Java Clients ¶

We have added a new plugin architecture and metrics interceptor to librdkafka. Confluent Control Center is now able to track delivery of messages produced and consumed by all Confluent supported client libraries (Java, C, C++, Python, Go and .NET).

Control Center ¶

The ‘Streams Monitoring’ capabilities were updated to signifiantly increase the number of streaming applications which can be concurrently monitored. The smallest viewable time bucket in the Streams Monitoring charting UI also changed from 15 seconds to 1 minute and the UI itself was made faster and more responsive.
Time to restore cached local state after an unclean shutdown of Control Center or when relocating it to a new server greatly reduced.
Reduced the disk space consumed by old data accumulated over long periods of time.
Alert UI should render more information about triggered alerts.
Alert triggers can be created directly though the UI on the System Health panel.

Metrics Publisher ¶

The metrics publisher used by Control Center and Automatic Data Balancing has been updated to use the Kafka protocol instead of ZooKeeper for creating the _confluent-metrics topic. As such, confluent.metrics.reporter.zookeeper.connect is now unused.

Open Source Features ¶

Exactly-Once Semantics for Apache Kafka ¶

Prior to this release, Apache Kafka’s best delivery guarantee was at-least-once, in-order delivery per partition. In particular producer retries during broker failures could cause duplicates. Additionally, Kafka Streams applications might have reprocessed inputs in some failure scenarios.

With this release of Apache Kafka, the delivery guarantees are strengthened with the addition of the following features:

Idempotent producer: Exactly-once, in-order, delivery per partition. This can be enabled by setting enable.idempotence=true in the producer config.
Transactions: Atomic writes across multiple topics and partitions. Consumers can be configured to read only committed transactional messages, but may not necessarily read all messages of a given transaction atomically.
Exactly once stream processing: Kafka Streams applications are now able to perform read-process-write operations atomically, thus solving the problem of reprocessing inputs in failure scenarios. This can be enabled by setting processing.mode=exactly_once in the streams configuration.

For more details, have a look at the the KIP.

Confluent CLI ¶

A unified Confluent Command Line Interface (CLI) is introduced in order to simplify service startup and management for app developers. The new confluent command allows developers to iterate more quickly when implementing their apps and interact in a more robust way with all the services of the Confluent ecosystem. At the same time, the flexibility and fine-tuning capabilities offered to anyone who is deploying Confluent services in production remain intact.

Non-Java Clients ¶

We have added support for the new Kafka message formats to librdkafka and Confluent’s other non-Java clients (Python, .NET and Go) making them transparently compatible with the EOS (Exactly Once Semantics) supporting Java client.

The default value of the api.version.request configuration property has changed. For more information, refer to the upgrade guide.

Kafka Streams: Confluent serializers/deserializers (serdes) for Avro ¶

We now provide out-of-the-box serdes for both generic and specific Avro, which integrate with Confluent Schema Registry. These serdes make it much easier to read and/or write data in Avro format in your applications.

Kafka Streams: Capacity Planning Guide available ¶

Our new Capacity Planning Guide helps you to size your applications correctly. It covers how much memory, CPU, or network resources you need for your applications, how many client machines/VMs/containers you should run for your applications, and much more.

Apache Kafka 0.11.0 ¶

Admin client ¶

We have introduced a Java Admin client with support for managing and inspecting topics, brokers, configurations and ACLs.

Record headers ¶

It is now possible to produce and consume records containing headers. A header is a key/value pair and multiple values per key are supported. The key is a string and the value is an array of bytes.

Leader epoch in replication protocol ¶

The replication protocol has been enhanced so that multiple hard failures (e.g. DC power outage) or quick leadership changes after log truncation will no longer lead to data loss or log divergence in rare cases.

Single-threaded controller ¶

The first step in our plan to make the Kafka Controller more robust under certain failure scenarios. An immediate benefit is eliminating the possibility of deadlocks such as KAFKA-4595. And upcoming improvements will be easier to implement.

Classpath Isolation for Apache Kafka Connectors ¶

Mixing different connectors in the same cluster has been a bit challenging. There was always a risk that these connectors bring with them different versions of the same dependency, leading to classpath errors. In this release, we’ve added classpath isolation feature to Kafka’s Connect API. This feature is disabled by default to reduce the upgrade risk and is enabled by specifying plugin.path in the connect worker configuration file and listing the directories that contain the Connector plugin Jars.

Request rate quotas ¶

Byte rate quotas were introduced in 0.9.0.0. Request rate quotas complement byte rate quotas by throttling clients that are sending an excessive number of small requests.

Kafka Streams: rebalancing improvements ¶

Rebalancing enables elasticity and fault-tolerance of Kafka Streams applications. Previously, rebalancing was a rather costly operation. In this release, we reduced the cost of rebalancing operations and also reduced the likelihood of rebalancing operations to happen in the first place. The net effect is that applications are now faster to start up, to restart, to shutdown, to failover, and to elastically scale out and scale in (i.e., adding and removing application instances). Also, there is now less network traffic and less load on both application instances (client side) and Kafka brokers (server side) when rebalancing does happen.

More bug fixes and improvements ¶

Over 400 bug fixes and performance improvements. For a full list of changes in this release of Apache Kafka, see the 0.11.0.0 Release Notes.

Confluent Platform 3.2.2 Release Notes ¶

This is a bugfix release of the Confluent Platform that provides Confluent users with Apache Kafka 0.10.2.1, the latest stable version of Kafka and additional bug fixes.

Confluent Platform users are encouraged to upgrade to CP 3.2.2 as it includes important bug fixes. The technical details of this release are summarized below.

Highlights ¶

Apache Kafka 0.10.2.1-cp2 ¶

KAFKA-5316: Log cleaning can increase message size and cause cleaner to crash with buffer overflow
KAFKA-5232: Kafka broker fails to start if a topic containing dot in its name is marked for delete but hasn’t been deleted during previous uptime
KAFKA-5118: Improve message for Kafka failed startup with non-Kafka data in data.dirs
KAFKA-5150: LZ4 decompression is 4-5x slower than Snappy on small batches / messages
KAFKA-5395: Distributed Herder Deadlocks on Shutdown
KAFKA-5230: Recommended values for Connect transformations contain the wrong class name
KAFKA-4965: set internal.leave.group.on.close to false in KafkaStreams
KAFKA-5167: streams task gets stuck after re-balance due to LockException
KAFKA-5205: CachingSessionStore doesn’t use the default keySerde
KAFKA-5206: RocksDBSessionStore doesn’t use default aggSerde
KAFKA-5345: Some socket connections not closed after restart of Kafka Streams
KAFKA-5241: GlobalKTable does not checkpoint offsets after restoring state

Confluent Control Center ¶

Added shutdown script.
Force faster compaction and cleanup on changelog topics.
Several Streams improvements to reduce state-store restoration time after unclean shutdown.
Reduce the volume of data to be restored to state-stores on startup.
Fixed possible concurrency problem in setting RocksDB configuration options.
Fixed possible deadlock on exception after streams startup.

Replicator ¶

Fix concurrent modification of source topic creation/expansion collection
Validate record timestamp value instead of record timestamp type, providing compatibility with data generated by older clients

S3 Connector ¶

PR-18 - CC-497: Add timestamp based partitioners.
PR-19 - CC-500: Provide exactly-once time-based partitioning in S3

Confluent Platform 3.2.1 Release Notes ¶

This is a bugfix release of the Confluent Platform that provides Confluent users with Apache Kafka 0.10.2.1, the latest stable version of Kafka.

The technical details of this release are summarized below.

Highlights ¶

Apache Kafka 0.10.2.1 ¶

For a full list of changes in this release of Apache Kafka, see the 0.10.2.1 Release Notes.

Fixes ¶

KAFKA-4943: SCRAM secret’s should be better protected with Zookeeper ACLs
KAFKA-4864: Kafka Secure Migrator tool doesn’t secure all the nodes
KAFKA-4959: Remove controller concurrent access to non-threadsafe NetworkClient, Selector, and SSLEngine
KAFKA-4861: log.message.timestamp.type=LogAppendTime breaks Kafka based consumers
KAFKA-4788: Broker level configuration ‘log.segment.bytes’ not used when ‘segment.bytes’ not configured per topic.
KAFKA-4901: Make ProduceRequest thread-safe
KAFKA-4631: Refresh consumer metadata more frequently for unknown subscribed topics
KAFKA-4791: Kafka Streams - unable to add state stores when using wildcard topics on the source
KAFKA-4848: Stream thread getting into deadlock state while trying to get rocksdb lock in retryWithBackoff
KAFKA-4851: SessionStore.fetch(key) is a performance bottleneck
KAFKA-4863: Querying window store may return unwanted keys
KAFKA-4916: Add streams tests with brokers failing
KAFKA-4919: Document that stores must not be closed when Processors are closed
KAFKA-5003: StreamThread should catch InvalidTopicException
KAFKA-5038: Running multiple kafka streams instances causes one or more instance to get into file contention
KAFKA-5040: Increase number of Streams producer retries from the default of 0
KAFKA-4837: Config validation in Connector plugins need to compare against both canonical and simple class names
KAFKA-4878: Kafka Connect does not log connector configuration errors

S3 Connector ¶

Fixes ¶

PR-33 - Separate JSON records using line separator instead of single white space.
PR-34 - Reduce the default s3.part.size to 25MB to avoid OOM exceptions with the current default java heap size settings for Connect.

Confluent Platform 3.2.0 Release Notes ¶

This is a minor release of the Confluent Platform that provides Confluent users with Apache Kafka 0.10.2.0, the latest stable version of Kafka. In addition, this release includes Confluent’s S3 Connector, Rest Proxy Security, JMS client and a .Net client. This release also includes major feature upgrades to Confluent Control Center and critical bug fixes to Auto Data Balancing and Replicator.

Confluent Platform users are encouraged to upgrade to Confluent Platform 3.2.0. The technical details of this release are summarized below.

Highlights ¶

Enterprise Features ¶

JMS Client ¶

Confluent Enterprise now includes a JMS-compatible client for Apache Kafka. This Kafka client implements the JMS 1.1 standard API, using Apache Kafka brokers as the backend. This is useful if you have legacy applications using JMS, and you would like to replace the existing JMS message broker with Apache Kafka. By replacing the legacy JMS message broker with Apache Kafka, existing applications can integrate with your modern streaming platform without a major rewrite of the application.

Confluent Control Center ¶

This release adds significant new monitoring and alerting features for clusters, brokers and topics to Confluent Control Center. Leveraging the experience Confluent has with dozens of the world’s largest Kafka installations, Control Center encodes operational best practices by distilling over 150 different available metrics from a running cluster into a manageable set of KPIs and supporting details to provide both broker-centric and topic-centric insights into platform health.

Automatic Data Balancing ¶

Automatic Data Balancing capability was added to Confluent Enterprise in release 3.1.0. In release 3.2.0 we improved the feature by making it safer to run unsupervised. In this release, a rebalance will not cause a broker to run out of disk space if the broker has a single log directory (this feature will be expanded to support multiple log directories in a future release). In addition, brokers with auto-generated broker ids are now supported.

Open Source Features ¶

Kafka Streams: Backwards Compatible to Older Kafka Clusters ¶

This (and future versions) of Kafka’s Streams API are now backwards-compatible with older Kafka clusters. This means you can upgrade your applications to use the latest version of the Streams API without having to first upgrade your Kafka clusters. Also, as has been the case before, the Streams API is still forwards-compatible, i.e. older versions of the Streams API can talk to newer Kafka clusters, too. This gives you full flexibility when planning and performing upgrades of applications and clusters across teams in your organization.

Kafka Streams: Compatibility Matrix ¶

	Kafka Broker (columns)
Streams API (rows)	3.0.x / 0.10.0.x	3.1.x / 0.10.1.x	3.2.x / 0.10.2.x
3.0.x / 0.10.0.x	compatible	compatible	compatible
3.1.x / 0.10.1.x		compatible	compatible
3.2.x / 0.10.2.x		compatible	compatible

See the Upgrade Guide of Kafka Streams for more information.

Kafka Streams: Session Windows ¶

You can now group data records into sessions, which is an essential feature for user behavior analysis in particular. Sessions represent a period of activity separated by a defined gap of inactivity. Session-based analyses can range from simple metrics (e.g. count of user visits to a website) to more complex metrics (e.g. customer conversion funnel and event flows). See Windowing in the Developer Guide of Kafka Streams for details.

Kafka Streams: Global KTables ¶

In addition to the existing KTable interface, which represents partitioned changelog streams, you can now leverage the GlobalKTable interface, which represents fully replicated changelog streams. Here, when a Kafka topic is read into a global table, each KafkaStreams instance (with the same application.id) will get a full copy of the data, rather than just a subset of partitions. Global tables allow you to perform “star joins”, to join against foreign keys, and they make many use cases feasible where you must work on heavily skewed data and thus suffer from hot partitions. Also, global tables are often more efficient than their KTable counterpart when you need to perform multiple joins in succession. See Creating source streams from Kafka in the Developer Guide of Kafka Streams for details.

Kafka Streams: ZooKeeper Dependency Removed ¶

Applications that use the Streams API no longer need to talk to ZooKeeper, which they previously did for Kafka topic management. This results in a simpler as well as a more secure Kafka infrastructure because you can now lock down network access to your ZooKeeper ensembles. The StreamsConfig.ZOOKEEPER_CONNECT_CONFIG aka zookeeper.connect application setting is now deprecated and ignored in 0.10.2.

Kafka Connect: Single Message Transform ¶

New feature of Kafka’s Connect API. Single Message Transformations allows you to modify events before they make it into Kafka, or on the way out. You can now mask personally identifiable information, perform data-based event routing, and normalize data formats. You can modify events without any coding by adding our built-in transformations into the connector configuration. A flexible API lets you implement your own transformations where needed.

S3 Connector ¶

Exactly-once connector that streams events from Kafka to S3 buckets. This connector supports Avro and JSON data formats and multiple partitioners, so your data will be available in S3 and easy to integrate with EMR, Redshift and other AWS services.

.Net Client ¶

We’re introducing a fully supported, up to date client for .NET languages, including C#. The client is reliable, high-performance, secured, compatible with all versions of Kafka and supports the full Kafka protocol.

REST Proxy ¶

In this release we’ve added a new version of the REST APIs (v2). The v2 APIs let you use the new (0.9.0.0) Apache Kafka consumer protocol features. This includes:

Option to configure a secure connection (SSL, SASL_SSL or SASL_PLAINTEXT) between REST Proxy and Apache Kafka
No need to choose between high level consumer and simple consumer - simply create a consumer and subscribe to topics as part of a group or assign specific partitions to the consumer directly.
Get the list of topics and partitions a consumer subcribed to
Choose between automatic or manual commit of offsets
Seek to specific offsets in partitions

Apache Kafka 0.10.2 ¶

Over 200 bug fixes and performance improvements. For a full list of changes in this release of Apache Kafka, see the 0.10.2 Release Notes.

Confluent Platform 3.1.2 Release Notes ¶

This is a minor release of the Confluent Platform that provides Confluent users with Apache Kafka 0.10.1.1, the latest stable version of Kafka.

Confluent Platform users are encouraged to upgrade to Confluent Platform 3.1.2 as it includes important bug fixes. The technical details of this release are summarized below.

Highlights ¶

Apache Kafka 0.10.1.1 ¶

For a full list of changes in this release of Apache Kafka, see the 0.10.1.1 Release Notes.

Fixes ¶

KAFKA-1464: Add a throttling option to the Kafka replication tool
KAFKA-4313: ISRs may thrash when replication quota is enabled
KAFKA-3994: Deadlock between consumer heartbeat expiration and offset commit.
KAFKA-4497: log cleaner breaks on timeindex
KAFKA-4529: tombstone may be removed earlier than it should
KAFKA-4469: Consumer throughput regression caused by inefficient list removal and copy
KAFKA-4472: offsetRetentionMs miscalculated in GroupCoordinator
KAFKA-4362: Consumer can fail after reassignment of the offsets topic partition
KAFKA-4384: ReplicaFetcherThread stopped after ReplicaFetcherThread received a corrupted message
KAFKA-4205: NullPointerException in fetchOffsetsBefore

Confluent Platform 3.1.1 Release Notes ¶

This is a minor release of the Confluent Platform that provides Confluent users with Apache Kafka 0.10.1.0, the latest stable version of Kafka. In addition, this release includes Confluent’s multi-datacenter replication and automatic data balancing tool.

Confluent Platform users are encouraged to upgrade to Confluent Platform 3.1.1 as it includes both new major functionality as well as important bug fixes. The technical details of this release are summarized below.

Highlights ¶

Enterprise Features ¶

Automatic Data Balancing ¶

Confluent Enterprise now includes Auto Data Balancing. As clusters grow, topics and partitions grow in different rates, brokers are added and removed and over time this leads to unbalanced workload across datacenter resources. Some brokers are not doing much at all, while others are heavily taxed with large or many partitions, slowing down message delivery. When executed, this feature monitors your cluster for number of brokers, size of partitions, number of partitions and number of leaders within the cluster. It allows you to shift data to create an even workload across your cluster, while throttling rebalance traffic to minimize impact on production workloads while rebalancing.

Multi-Datacenter Replication ¶

Confluent Enterprise now makes it easier than ever to maintain multiple Kafka clusters in multiple data centers. Managing replication of data and topic configuration between data centers enables use-cases such as:

Active-active geo-localized deployments: allows users to access a near-by data center to optimize their architecture for low latency and high performance
Centralized analytics: Aggregate data from multiple Kafka clusters into one location for organization-wide analytics
Cloud migration: Use Kafka to synchronize data between on-prem applications and cloud deployments

The new Multi-Datacenter Replication feature allows configuring and managing replication for all these scenarios from either Confluent Control Center or command-line tools.

Confluent Control Center ¶

Control Center added the capability to define alerts on the latency and completeness statistics of data streams, which can be delivered by email or queried from a centralized alerting system. In this release the stream monitoring features have also been extended to monitor topics from across multiple Kafka clusters and access to the Control Center can be protected via integration with enterprise authentication systems.

Open Source Features ¶

Kafka Streams: Interactive Queries ¶

Interactive queries let you get more from streaming than just the processing of data. This feature allows you to treat the stream processing layer as a lightweight embedded database and, more concretely, to directly query the latest state of your stream processing application, without needing to materialize that state to external databases or external storage first.

As a result, interactive queries simplify the architecture of many use cases and lead to more application-centric architectures. For example, you often no longer need to operate and interface with a separate database cluster – or a separate infrastructure team in your company that runs that cluster – to share data between a Kafka Streams application (say, an event-driven microservice) and downstream applications, regardless of whether these applications use Kafka Streams or not; they may even be applications that do not run on the JVM, e.g. implemented in Python, C/C++, or JavaScript.

Kafka Streams: Application Reset Tool ¶

The Application Reset Tool allows you to quickly reset a Kafka Streams application in order to reprocess its input data from scratch – think: an application “reset” button.

Kafka Streams: Improved memory management ¶

Kafka Streams applications now benefit from record caches. Notably, these caches are used to compact output records (similar to Kafka’s log compaction) so that fewer updates for the same record key are being sent downstream. These new caches are enabled by default and typically result in reduced load on your Kafka Streams application, your Kafka cluster, and/or downstream applications and systems such as external databases. However, these caches can also be disabled, if needed, to restore the Confluent Platform 3.0.x behavior of your applications.

Kafka Connect ¶

In this release, we’ve added two new connectors to Confluent Open Source:

Elasticsearch Sink - High performance connector to stream data from Kafka to Elasticsearch. This connector supports automatic generation of Elasticsearch mappings, all Elasticsearch datatypes and exactly-once delivery
JDBC Sink - Stream data from Kafka to any relational database. This connector supports inserts and upserts, schema evolution and exactly-once delivery.

Go Client ¶

We’re introducing a fully supported, up to date client for Go. The client is reliable, high-performance, secured, compatible with all versions of Kafka and supports the full Kafka protocol.

Apache Kafka 0.10.1 ¶

Here is a quick overview of the notable Kafka-related changes in the release:

New Feature ¶

KAFKA-1464: Add a throttling option to the Kafka replication tool
KAFKA-3176: Allow console consumer to consume from particular partitions when new consumer is used.
KAFKA-3492: support quota based on authenticated user name
KAFKA-3776: Unify store and downstream caching in streams
KAFKA-3858: Add functions to print stream topologies
KAFKA-3909: Queryable state for Kafka Streams
KAFKA-4015: Change cleanup.policy config to accept a list of valid policies
KAFKA-4093: Cluster id

Improvement ¶

KAFKA-724: Allow automatic socket.send.buffer from operating system
KAFKA-1981: Make log compaction point configurable
KAFKA-2063: Bound fetch response size (KIP-74)
KAFKA-2547: Make DynamicConfigManager to use the ZkNodeChangeNotificationListener introduced as part of KAFKA-2211
KAFKA-2800: Update outdated dependencies
KAFKA-3158: ConsumerGroupCommand should tell whether group is actually dead
KAFKA-3281: Improve message of stop scripts when no processes are running
KAFKA-3282: Change tools to use new consumer if zookeeper is not specified
KAFKA-3283: Remove beta from new consumer documentation
KAFKA-3479: Add new consumer metrics documentation
KAFKA-3595: Add capability to specify replication compact option for stream store
KAFKA-3644: Use Boolean protocol type for StopReplicaRequest delete_partitions
KAFKA-3680: Make Java client classloading more flexible
KAFKA-3683: Add file descriptor recommendation to ops guide
KAFKA-3697: Clean-up website documentation when it comes to clients
KAFKA-3699: Update protocol page on website to explain how KIP-35 should be used
KAFKA-3711: Allow configuration of MetricsReporter subclasses
KAFKA-3732: Add an auto accept option to kafka-acls.sh
KAFKA-3748: Add consumer-property to console tools consumer (similar to –producer-property)
KAFKA-3753: Add approximateNumEntries() to the StateStore interface for metrics reporting
KAFKA-3760: Set broker state as running after publishing to ZooKeeper
KAFKA-3762: Log.loadSegments() should log the message in exception
KAFKA-3765: Code style issues in Kafka
KAFKA-3768: Replace all pattern match on boolean value by if/elase block.
KAFKA-3771: Improving Kafka code
KAFKA-3775: Throttle maximum number of tasks assigned to a single KafkaStreams
KAFKA-3824: Docs indicate auto.commit breaks at least once delivery but that is incorrect
KAFKA-3842: Add Helper Functions Into TestUtils
KAFKA-3844: Sort configuration items in log
KAFKA-3845: Support per-connector converters
KAFKA-3846: Connect record types should include timestamps
KAFKA-3847: Connect tasks should not share a producer
KAFKA-3849: Add explanation on why polling every second in MirrorMaker is required
KAFKA-3888: Allow consumer to send heartbeats in background thread (KIP-62)
KAFKA-3920: Add Schema source connector to Kafka Connect
KAFKA-3922: Add a copy-constructor to AbstractStream
KAFKA-3936: Validate user parameters as early as possible
KAFKA-3942: Change IntegrationTestUtils.purgeLocalStreamsState to use java.io.tmpdir
KAFKA-3954: Consumer should use internal topics information returned by the broker
KAFKA-3997: Halting because log truncation is not allowed and suspicious logging
KAFKA-4012: KerberosShortNamer should implement toString()
KAFKA-4013: SaslServerCallbackHandler should include cause for exception
KAFKA-4016: Kafka Streams join benchmark
KAFKA-4044: log actual socket send/receive buffer size after connecting in Selector
KAFKA-4050: Allow configuration of the PRNG used for SSL
KAFKA-4052: Allow passing properties file to ProducerPerformance
KAFKA-4053: Refactor TopicCommand to remove redundant if/else statements
KAFKA-4062: Require –print-data-log if –offsets-decoder is enabled for DumpLogOffsets
KAFKA-4063: Add support for infinite endpoints for range queries in Kafka Streams KV stores
KAFKA-4070: Implement a useful Struct.toString()
KAFKA-4112: Remove alpha quality label from Kafka Streams in docs
KAFKA-4151: Update public docs for KIP-78
KAFKA-4165: Add 0.10.0.1 as a source for compatibility tests where relevant
KAFKA-4177: Replication Throttling: Remove ThrottledReplicationRateLimit from Server Config
KAFKA-4244: Update our website look & feel

Bug ¶

KAFKA-1196: java.lang.IllegalArgumentException Buffer.limit on FetchResponse.scala + 33
KAFKA-2311: Consumer’s ensureNotClosed method not thread safe
KAFKA-2684: Add force option to TopicCommand & ConfigCommand to suppress console prompts
KAFKA-2894: WorkerSinkTask doesn’t handle rewinding offsets on rebalance
KAFKA-2932: Adjust importance level of Kafka Connect configs
KAFKA-2935: Remove vestigial CLUSTER_CONFIG in WorkerConfig
KAFKA-2941: Docs for key/value converter in Kafka connect are unclear
KAFKA-2948: Kafka producer does not cope well with topic deletions
KAFKA-2971: KAFKA - Not obeying log4j settings, DailyRollingFileAppender not rolling files
KAFKA-3054: Connect Herder fail forever if sent a wrong connector config or task config
KAFKA-3111: java.lang.ArithmeticException: / by zero in ConsumerPerformance
KAFKA-3218: Kafka-0.9.0.0 does not work as OSGi module
KAFKA-3396: Unauthorized topics are returned to the user
KAFKA-3400: Topic stop working / can’t describe topic
KAFKA-3500: KafkaOffsetBackingStore set method needs to handle null
KAFKA-3501: Console consumer process hangs on SIGINT
KAFKA-3525: max.reserved.broker.id off-by-one error
KAFKA-3561: Auto create through topic for KStream aggregation and join
KAFKA-3562: Null Pointer Exception Found when delete topic and Using New Producer
KAFKA-3645: NPE in ConsumerGroupCommand and ConsumerOffsetChecker when running in a secure env
KAFKA-3650: AWS test script fails to install vagrant
KAFKA-3682: ArrayIndexOutOfBoundsException thrown by SkimpyOffsetMap.get() when full
KAFKA-3691: Confusing logging during metadata update timeout
KAFKA-3710: MemoryOffsetBackingStore creates a non-daemon thread that prevents clean shutdown
KAFKA-3716: Check against negative timestamps
KAFKA-3719: Pattern regex org.apache.kafka.common.utils.Utils.HOST_PORT_PATTERN is too narrow
KAFKA-3723: Cannot change size of schema cache for JSON converter
KAFKA-3735: RocksDB objects needs to be disposed after usage
KAFKA-3740: Enable configuration of RocksDBStores
KAFKA-3742: Can’t run connect-distributed.sh with -daemon flag
KAFKA-3761: Controller has RunningAsBroker instead of RunningAsController state
KAFKA-3767: Failed Kafka Connect’s unit test with Unknown license.
KAFKA-3769: KStream job spending 60% of time writing metrics
KAFKA-3781: Errors.exceptionName() can throw NPE
KAFKA-3782: Transient failure with kafkatest.tests.connect.connect_distributed_test.ConnectDistributedTest.test_bounce.clean=True
KAFKA-3786: Avoid unused property from parent configs causing WARN entries
KAFKA-3794: Add Stream / Table prefix in print functions
KAFKA-3807: OffsetValidationTest - transient failure on test_broker_rolling_bounce
KAFKA-3809: Auto-generate documentation for topic-level configuration
KAFKA-3810: replication of internal topics should not be limited by replica.fetch.max.bytes
KAFKA-3812: State store locking is incorrect
KAFKA-3830: getTGT() debug logging exposes confidential information
KAFKA-3840: OS auto tuning for socket buffer size in clients not allowed through configuration
KAFKA-3850: WorkerSinkTask should retry commits if woken up during rebalance or shutdown
KAFKA-3852: Clarify how to handle message format upgrade without killing performance
KAFKA-3854: Subsequent regex subscription calls fail
KAFKA-3864: Kafka Connect Struct.get returning defaultValue from Struct not the field schema
KAFKA-3894: LogCleaner thread crashes if not even one segment can fit in the offset map
KAFKA-3896: Unstable test KStreamRepartitionJoinTest.shouldCorrectlyRepartitionOnJoinOperations
KAFKA-3915: LogCleaner IO buffers do not account for potential size difference due to message format change
KAFKA-3916: Connection from controller to broker disconnects
KAFKA-3929: Add prefix for underlying clients configs in StreamConfig
KAFKA-3930: IPv6 address can’t used as ObjectName
KAFKA-3934: Start scripts enable GC by default with no way to disable
KAFKA-3937: Kafka Clients Leak Native Memory For Longer Than Needed With Compressed Messages
KAFKA-3938: Fix consumer session timeout issue in Kafka Streams
KAFKA-3945: kafka-console-producer.sh does not accept request-required-acks=all
KAFKA-3946: Protocol guide should say that Produce request acks can only be 0, 1, or -1
KAFKA-3949: Consumer topic subscription change may be ignored if a rebalance is in progress
KAFKA-3952: VerifyConsumerRebalance cannot succeed when checking partition owner
KAFKA-3963: Missing messages from the controller to brokers
KAFKA-3965: Mirror maker sync send fails will lose messages
KAFKA-4002: task.open() should be invoked in case that 0 partitions is assigned to task.
KAFKA-4019: LogCleaner should grow read/write buffer to max message size for the topic
KAFKA-4023: Add thread id as prefix in Kafka Streams thread logging
KAFKA-4031: Check DirectBuffer’s cleaner to be not null before using
KAFKA-4032: Uncaught exceptions when autocreating topics
KAFKA-4033: KIP-70: Revise Partition Assignment Semantics on New Consumer’s Subscription Change
KAFKA-4034: Consumer need not lookup coordinator when using manual assignment
KAFKA-4035: AclCommand should allow Describe operation on groups
KAFKA-4037: Transient failure in ConnectRestApiTest
KAFKA-4042: DistributedHerder thread can die because of connector & task lifecycle exceptions
KAFKA-4051: Strange behavior during rebalance when turning the OS clock back
KAFKA-4056: Kafka logs values of sensitive configs like passwords
KAFKA-4066: NullPointerException in Kafka consumer due to unsafe access to findCoordinatorFuture
KAFKA-4073: MirrorMaker should handle mirroring messages w/o timestamp better
KAFKA-4077: Backdate validity of certificates in system tests to cope with clock skew
KAFKA-4082: Support Gradle 3.0
KAFKA-4098: NetworkClient should not intercept all metadata requests on disconnect
KAFKA-4099: Change the time based log rolling to only based on the message timestamp.
KAFKA-4100: Connect Struct schemas built using SchemaBuilder with no fields cause NPE in Struct constructor
KAFKA-4103: DumpLogSegments cannot print data from offsets topic
KAFKA-4104: Queryable state metadata is sometimes invalid
KAFKA-4105: Queryable state tests for concurrency and rebalancing
KAFKA-4118: StreamsSmokeTest.test_streams started failing since 18 August build
KAFKA-4123: Queryable State returning null for key before all stores in instance have been initialized
KAFKA-4129: Processor throw exception when getting channel remote address after closing the channel
KAFKA-4130:
KAFKA-4131: Multiple Regex KStream-Consumers cause Null pointer exception in addRawRecords in RecordQueue class
KAFKA-4135: Inconsistent javadoc for KafkaConsumer.poll behavior when there is no subscription
KAFKA-4153: Incorrect KStream-KStream join behavior with asymmetric time window
KAFKA-4158: Reset quota to default value if quota override is deleted for a given clientId
KAFKA-4160: Consumer onPartitionsRevoked should not be invoked while holding the coordinator lock
KAFKA-4162: Typo in Kafka Connect document
KAFKA-4163: NPE in StreamsMetadataState during re-balance operations
KAFKA-4172: Fix masked test error in KafkaConsumerTest.testSubscriptionChangesWithAutoCommitEnabled
KAFKA-4173: SchemaProjector should successfully project when source schema field is missing and target schema field is optional
KAFKA-4175: Can’t have StandbyTasks in KafkaStreams where NUM_STREAM_THREADS_CONFIG > 1
KAFKA-4176: Node stopped receiving heartbeat responses once another node started within the same group
KAFKA-4183: Logical converters in JsonConverter don’t properly handle null values
KAFKA-4193: FetcherTest fails intermittently
KAFKA-4197: Make ReassignPartitionsTest System Test move data
KAFKA-4200: Minor issue with throttle argument in kafka-reassign-partitions.sh
KAFKA-4214: kafka-reassign-partitions fails all the time when brokers are bounced during reassignment
KAFKA-4216: Replication Quotas: Control Leader & Follower Throttled Replicas Separately
KAFKA-4222: Transient failure in QueryableStateIntegrationTest.queryOnRebalance
KAFKA-4223: RocksDBStore should close all open iterators on close
KAFKA-4225: Replication Quotas: Control Leader & Follower Limit Separately
KAFKA-4227: AdminManager is not shutdown when KafkaServer is shutdown
KAFKA-4233: StateDirectory fails to create directory if any parent directory does not exist
KAFKA-4234: Consumer should not commit offsets in unsubscribe()
KAFKA-4235: Fix the closing order in Sender.initiateClose().
KAFKA-4241: StreamsConfig doesn’t pass through custom consumer and producer properties to ConsumerConfig and ProducerConfig
KAFKA-4248: Consumer can return data from old regex subscription in poll()
KAFKA-4251: Test driver not launching in Vagrant 1.8.6
KAFKA-4252: Missing ProducerRequestPurgatory
KAFKA-4253: Fix Kafka Stream thread shutting down process ordering
KAFKA-4257: Inconsistencies in 0.10.1 upgrade docs
KAFKA-4262: Intermittent unit test failure ReassignPartitionsClusterTest.shouldExecuteThrottledReassignment
KAFKA-4265: Intermittent test failure ReplicationQuotasTest.shouldBootstrapTwoBrokersWithFollowerThrottle
KAFKA-4267: Quota initialization for <user, clientId> uses incorrect ZK path
KAFKA-4274: KafkaConsumer.offsetsForTimes() hangs and times out on an empty map
KAFKA-4283: records deleted from CachingKeyValueStore still appear in range and all queries
KAFKA-4290: High CPU caused by timeout overflow in WorkerCoordinator
KAFKA-3590: KafkaConsumer fails with “Messages are rejected since there are fewer in-sync replicas than required.” when polling
KAFKA-3774: GetOffsetShell tool reports ‘Missing required argument “[time]”’ despite the default
KAFKA-4008: Module “tools” should not be dependent on “core”

Task ¶

KAFKA-3163: KIP-33 - Add a time based log index to Kafka
KAFKA-3838: Bump zkclient and Zookeeper versions
KAFKA-4079: Document quota configuration changes from KIP-55
KAFKA-4148: KIP-79 add ListOffsetRequest v1 and search by timestamp interface to consumer.
KAFKA-4192: Update upgrade documentation for 0.10.1.0 to mention inter.broker.protocol.version

Wish ¶

KAFKA-3905: Check for null in KafkaConsumer#{subscribe, assign}</li>

Test ¶

KAFKA-3374: Failure in security rolling upgrade phase 2 system test
KAFKA-3799: Turn on endpoint validation in SSL system tests
KAFKA-3863: Add system test for connector failure/restart
KAFKA-3985: Transient system test failure ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol
KAFKA-4055: Add system tests for secure quotas
KAFKA-4145: Avoid redundant integration testing in ProducerSendTests
KAFKA-4213: Add system tests for replication throttling (KIP-73)

Sub-task ¶

KAFKA-2720: Periodic purging groups in the coordinator
KAFKA-2945: CreateTopic - protocol and server side implementation
KAFKA-2946: DeleteTopic - protocol and server side implementation
KAFKA-3290: WorkerSourceTask testCommit transient failure
KAFKA-3443: Support regex topics in addSource() and stream()
KAFKA-3576: Unify KStream and KTable API
KAFKA-3660: Log exception message in ControllerBrokerRequestBatch
KAFKA-3678: Fix stream integration test timeouts
KAFKA-3708: Rethink exception handling in KafkaStreams
KAFKA-3777: Extract the existing LRU cache out of RocksDBStore
KAFKA-3778: Avoiding using range queries of RocksDBWindowStore on KStream windowed aggregations
KAFKA-3780: Add new config cache.max.bytes.buffering to the streams configuration
KAFKA-3865: Transient failures in org.apache.kafka.connect.runtime.WorkerSourceTaskTest.testSlowTaskStart
KAFKA-3870: Expose state store names to DSL
KAFKA-3872: OOM while running Kafka Streams integration tests
KAFKA-3874: Transient test failure: org.apache.kafka.streams.integration.JoinIntegrationTest.shouldCountClicksPerRegion
KAFKA-3911: Enforce KTable materialization
KAFKA-3912: Query local state stores
KAFKA-3914: Global discovery of state stores
KAFKA-3926: Transient failures in org.apache.kafka.streams.integration.RegexSourceIntegrationTest
KAFKA-3973: Investigate feasibility of caching bytes vs. records
KAFKA-3974: LRU cache should store bytes/object and not records
KAFKA-4038: Transient failure in DeleteTopicsRequestTest.testErrorDeleteTopicRequests
KAFKA-4045: Investigate feasibility of hooking into RocksDb’s cache
KAFKA-4049: Transient failure in RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted
KAFKA-4058: Failure in org.apache.kafka.streams.integration.ResetIntegrationTest.testReprocessingFromScratchAfterReset
KAFKA-4069: Forward records in context of cache flushing/eviction
KAFKA-4147: Transient failure in ConsumerCoordinatorTest.testAutoCommitDynamicAssignment
KAFKA-4167: Add cache metrics
KAFKA-4194: Add more tests for KIP-79

Confluent Platform 3.1.0 Release Notes ¶

This is a minor release of the Confluent Platform that provides Confluent users with Apache Kafka 0.10.1.0, the latest stable version of Kafka. In addition, this release includes Confluent’s multi-datacenter replication and automatic data balancing tool.

Confluent Platform users are encouraged to upgrade to CP 3.1.0 as it includes both new major functionality as well as important bug fixes. The technical details of this release are summarized below.

Table of Contents

Contents

Confluent Platform 4.0.0 Release Notes

local:

Highlights ¶

Enterprise Features ¶

Automatic Data Balancing ¶

Confluent Enterprise now includes Auto Data Balancing. As clusters grow, topics and partitions grow in different rates, brokers are added and removed and over time this leads to unbalanced workload across datacenter resources. Some brokers are not doing much at all, while others are heavily taxed with large or many partitions, slowing down message delivery. When executed, this feature monitors your cluster for number of brokers, size of partitions, number of partitions and number of leaders within the cluster. It allows you to shift data to create an even workload across your cluster, while throttling rebalance traffic to minimize impact on production workloads while rebalancing.

Multi-Datacenter Replication ¶

Confluent Enterprise now makes it easier than ever to maintain multiple Kafka clusters in multiple data centers. Managing replication of data and topic configuration between data centers enables use-cases such as:

Active-active geo-localized deployments: allows users to access a near-by data center to optimize their architecture for low latency and high performance
Centralized analytics: Aggregate data from multiple Kafka clusters into one location for organization-wide analytics
Cloud migration: Use Kafka to synchronize data between on-prem applications and cloud deployments

The new Multi-Datacenter Replication feature allows configuring and managing replication for all these scenarios from either Confluent Control Center or command-line tools.

Confluent Control Center ¶

Control Center added the capability to define alerts on the latency and completeness statistics of data streams, which can be delivered by email or queried from a centralized alerting system. In this release the stream monitoring features have also been extended to monitor topics from across multiple Kafka clusters and access to the Control Center can be protected via integration with enterprise authentication systems.

Open Source Features ¶

Kafka Streams ¶

Interactive Queries ¶

Interactive queries let you get more from streaming than just the processing of data. This feature allows you to treat the stream processing layer as a lightweight embedded database and, more concretely, to directly query the latest state of your stream processing application, without needing to materialize that state to external databases or external storage first.

As a result, interactive queries simplify the architecture of many use cases and lead to more application-centric architectures. For example, you often no longer need to operate and interface with a separate database cluster – or a separate infrastructure team in your company that runs that cluster – to share data between a Kafka Streams application (say, an event-driven microservice) and downstream applications, regardless of whether these applications use Kafka Streams or not; they may even be applications that do not run on the JVM, e.g. implemented in Python, C/C++, or JavaScript.

Application Reset Tool ¶

The Application Reset Tool allows you to quickly reset a Kafka Streams application in order to reprocess its input data from scratch – think: an application “reset” button.

Improved memory management ¶

Kafka Streams applications now benefit from record caches. Notably, these caches are used to compact output records (similar to Kafka’s log compaction) so that fewer updates for the same record key are being sent downstream. These new caches are enabled by default and typically result in reduced load on your Kafka Streams application, your Kafka cluster, and/or downstream applications and systems such as external databases. However, these caches can also be disabled, if needed, to restore the CP 3.0.x behavior of your applications.

Kafka Connect ¶

In this release, we’ve added two new connectors to Confluent Open Source:

Elasticsearch Sink - High performance connector to stream data from Kafka to Elasticsearch. This connector supports automatic generation of Elasticsearch mappings, all Elasticsearch datatypes and exactly-once delivery
JDBC Sink - Stream data from Kafka to any relational database. This connector supports inserts and upserts, schema evolution and exactly-once delivery.

Go Client ¶

We’re introducing a fully supported, up to date client for Go. The client is reliable, high-performance, secured, compatible with all versions of Kafka and supports the full Kafka protocol.

Apache Kafka 0.10.1 ¶

For a full list of changes in this release of Apache Kafka, see the 0.10.1 Release Notes.

Confluent Platform 3.0.1 Release Notes ¶

Confluent Platform 3.0.1 contains a number of bug fixes included in the Kafka 0.10.0.1 release. Details of the changes to Kafka in this patch release are found in the Kafka Release Notes. Details of the changes to other components of the Confluent Platform are listed in the respective changelogs such as Kafka REST Proxy changelog.

Highlights ¶

Confluent Control Center ¶

Control Center added performance improvements to reduce running overhead, including reducing the number of Kafka topics necessary and optimizing webclient fetches from the server. We also added the ability to delete a running connector and support for connecting to SSL/SASL secured Kafka clusters.

Kafka Streams ¶

New Feature: Application Reset Tool ¶

A common situation when implementing stream processing applications in practice is to tell an application to reprocess its data from scratch. This may be required for a number of reasons, including but not limited to: during development and testing, when addressing bugs in production, when doing A/B testing of algorithms and campaigns, when giving demos to customers, and so on.

Previously, resetting an application was a manual task that was cumbersome and error-prone. Kafka now includes an Application Reset Tool for Kafka Streams through which you can quickly reset an application so that it will reprocess its data from scratch – think: an application “reset button”.

Apache Kafka 0.10.0.1 ¶

Here is a quick overview of the notable Kafka-related changes in the release:

New Feature ¶

KAFKA-3538: Abstract the creation/retrieval of Producer for stream sinks for unit testing

Improvement ¶

KAFKA-3479: Add new consumer metrics documentation
KAFKA-3667: Improve Section 7.2 Encryption and Authentication using SSL to include proper hostname verification configuration
KAFKA-3683: Add file descriptor recommendation to ops guide
KAFKA-3699: Update protocol page on website to explain how KIP-35 should be used
KAFKA-3725: Update documentation with regards to XFS
KAFKA-3747: Close RecordBatch.records when append to batch fails
KAFKA-3785: Fetcher spending unnecessary time during metrics recording
KAFKA-3836: RocksDBStore.get() should not pass nulls to Deserializers
KAFKA-3880: Disallow Join Windows with size zero
KAFKA-3922: Add a copy-constructor to AbstractStream
KAFKA-4034: Consumer need not lookup coordinator when using manual assignment

Bug ¶

KAFKA-3185: Allow users to cleanup internal data
KAFKA-3258: BrokerTopicMetrics of deleted topics are never deleted
KAFKA-3393: Update site docs and javadoc based on max.block.ms changes
KAFKA-3500: KafkaOffsetBackingStore set method needs to handle null
KAFKA-3718: propagate all KafkaConfig __consumer_offsets configs to OffsetConfig instantiation
KAFKA-3728: EndToEndAuthorizationTest offsets_topic misconfigured
KAFKA-3781: Errors.exceptionName() can throw NPE
KAFKA-3782: Transient failure with kafkatest.tests.connect.connect_distributed_test.ConnectDistributedTest.test_bounce.clean=True
KAFKA-3783: Race condition on last ACL removal for a resource fails with a ZkBadVersionException
KAFKA-3784: TimeWindows#windowsFor misidentifies some windows if TimeWindows#advanceBy is used
KAFKA-3787: Preserve message timestamp in mirror mkaer
KAFKA-3789: Upgrade Snappy to fix snappy decompression errors
KAFKA-3802: log mtimes reset on broker restart
KAFKA-3805: Running multiple instances of a Streams app on the same machine results in Java_org_rocksdb_RocksDB_write error
KAFKA-3817: KTableRepartitionMap should handle null inputs
KAFKA-3830: getTGT() debug logging exposes confidential information
KAFKA-3850: WorkerSinkTask should retry commits if woken up during rebalance or shutdown
KAFKA-3851: Add references to important installation/upgrade notes to release notes
KAFKA-3852: Clarify how to handle message format upgrade without killing performance
KAFKA-3854: Subsequent regex subscription calls fail
KAFKA-3855: Guard race conditions in TopologyBuilder
KAFKA-3864: Kafka Connect Struct.get returning defaultValue from Struct not the field schema
KAFKA-3879: KafkaConsumer with auto commit enabled gets stuck when killed after broker is dead
KAFKA-3887: StreamBounceTest.test_bounce and StreamSmokeTest.test_streams failing
KAFKA-3890: Kafka Streams: task assignment is not maintained on cluster restart or rolling restart
KAFKA-3898: KStream.leftJoin(...) is missing a Serde for thisVal. This can cause failures after mapValues etc
KAFKA-3902: Optimize KTable.filter() to reduce unnecessary traffic
KAFKA-3915: LogCleaner IO buffers do not account for potential size difference due to message format change
KAFKA-3924: Data loss due to halting when LEO is larger than leader's LEO
KAFKA-3933: Kafka OOM During Log Recovery Due to Leaked Native Memory
KAFKA-3935: ConnectDistributedTest.test_restart_failed_task.connector_type=sink system test failing
KAFKA-3941: Avoid applying eviction listener in InMemoryKeyValueLoggedStore
KAFKA-3950: kafka mirror maker tool is not respecting whitelist option
KAFKA-3952: VerifyConsumerRebalance cannot succeed when checking partition owner
KAFKA-3960: Committed offset not set after first assign
KAFKA-3977: KafkaConsumer swallows exceptions raised from message deserializers
KAFKA-3983: It would be helpful if SocketServer's Acceptors logged both the SocketChannel port and the processor ID upon registra
KAFKA-3996: ByteBufferMessageSet.writeTo() should be non-blocking
KAFKA-4008: Module "tools" should not be dependent on "core"
KAFKA-4018: Streams causing older slf4j-log4j library to be packaged along with newer version
KAFKA-4073: MirrorMaker should handle mirroring messages w/o timestamp better
PR-1735 - Add application id prefix for copartitionGroups in TopologyBuilder

Test ¶

KAFKA-3863: Add system test for connector failure/restart

Sub-task ¶

KAFKA-3660: Log exception message in ControllerBrokerRequestBatch
KAFKA-3865: Transient failures in org.apache.kafka.connect.runtime.WorkerSourceTaskTest.testSlowTaskStart
KAFKA-3931: kafka.api.PlaintextConsumerTest.testPatternUnsubscription transient failure

Confluent Platform 3.0.0 Release Notes ¶

This is a major release of the Confluent Platform that provides Confluent users with Apache Kafka 0.10.0.0, the latest stable version of Kafka. In addition, this release includes the new Confluent Control Center application as well as the new Kafka Streams library that ships with Apache Kafka 0.10.0.0.

Confluent Platform users are encouraged to upgrade to Confluent Platform 3.0.0 as it includes both new major functionality as well as important bug fixes. The technical details of this release are summarized below.

Highlights ¶

We have added several new features to the Confluent Platform in Confluent Platform 3.0, to provide a more complete, easier to use, and higher performacne Stream Processing Platform:

Kafka Streams ¶

We’re very excited to introduce Kafka Streams. Kafka Streams is included in Apache Kafka 0.10.0.0. Kafka Streams is a library that turns Apache Kafka into a full featured, modern stream processing system. Kafka Streams includes a high level language for describing common stream operations (such as joining, filtering, and aggregating records), allowing developers to quickly develop powerful streaming applications. Kafka Streams applications can easily be deployed on many different systems— they can run on YARN, be deployed on Mesos, run in Docker containers, or just embedded into exisiting Java applications.

Further information is available in the Kafka Streams documentation. If you want to give it a quick spin, head straight to the Kafka Streams Quickstart.

Confluent Control Center ¶

Control Center is a web-based management and monitoring tool for Apache Kafka. In version 3.0.0, Control Center allows you to configure, edit, and manage connectors in Kafka Connect. It also includes Stream Monitoring: a system for measuring and monitoring your data streams end to end, from producer to consumer. To get started with Control Center, see Installing and Upgrading Control Center.

A term license for Confluent Control Center is available for Confluent Platform Enterprise Subscribers, but any user may download and try Confluent Control Center for free for 30 days.

Apache Kafka 0.10.0.0 ¶

Apache Kafka 0.10.0.0 is a major release of Apache Kafka and includes a number of new features and enhancements. Highlights include

Kafka Streams. As described above, Kafka Streams adds a simple but powerful streaming library to Apache Kafka.
Relative Offsets in Compressed Messages. In older versions of Kafka, recompression occurred when a broker received a batch of messages from ther producer. In 0.10.0.0, we have changed from using absolute offsets to relative offsets to avoid the recompression, reducing latency and reducing load on Kafka brokers. KAFKA-2511
Rack Awareness. Kafka can now run with a rack awareness feature that isolates replicas so they are guaranteed to span multiple racks or availability zones. This allows all of Kafka’s durability guarantees to be applied to these larger architectural units, significantly increasing availability. KAFKA-1215
Timestamps in Mesages. Messages are now tagged with timestamps at the time they are produced, allowing a number of future features including looking up message by time and measuring timing. KAFKA-2511
Kafka Consumer Max Records. In 0.9.0.0, developers had little control over the number of mesages returned when calling poll() for the new consumer. This feature introduces a new parameter max.poll.records that allows developers to limit the number of messages returned. KAKFA-3007
Client-Side Interceptors. We have introduced a new plugin architecture that allows developers to easily add “plugins” to Kafka clients. This allows developers to easily deploy additional code to inspect or modify Kafka messages. KAFKA-3162
Standardize Client Sequences. This features changed the arguments to some methods in the new consumer to work nore consistently with Java Collections. KAFKA-3006
List Connectors REST API. You can now query a distributed Kafka Connect cluster to discover the available connector classes. KAFKA-3316
Admin API changes. Some changes were made in the metadata request/response, improving performance in some situations. KAFKA-1694
Protocol Version Improvements. Kafka brokers now support a request that returns all supported protocol API versions. (This will make it easier for future Kafka clients to support multiple broker versions with a single client.) KAFKA-3307
SASL Improvements. Kafka 0.9.0.0 introduced new security features to Kafka, including support for Kerberos through SASL. In 0.10.0.0, Kafka now includes support for more SASL features, including external authentication servers, supporting multiple types of SASL authentication on one server, and other improvements. KAFKA-3149
Connect Status/Control APIs. In Kafka 0.10.0.0, we have continued to improve Kafka Connect. Previously, users had to monitor logs to view the status of connectors and their tasks, but we now support a status API for easier monitoring. We’ve also added control APIs, which allow you to pause a connector’s message processing in order to perform maintenance, and to manually restart tasks which have failed. KAFKA-3093, KAFKA-2370, KAFKA-3506
Allow cross origin HTTP requests on all HTTP methods. In Kafka 0.9.0.0, Kafka Connect only supported requests from the same domain; this enhancement removes that restriction. KAFKA-3578
Kafka LZ4 framing. Kafka’s implementation of LZ4 did not follow the standard LZ4 specification, creating problems for third party clients that wanted to leverage existing libraries. Kafka now conforms to the standard. KAFKA-3160

Note

Upgrading a Kafka Connect running in distributed mode from 0.9 versions of Kafka to 0.10 versions requires making a configuration change before the upgrade. See the Kafka Connect Upgrade Notes for more details.

For a complete list of features added and bugs fixed, see the Apache Kafka Release Notes.

Deprecating Camus ¶

Camus in Confluent Platform is deprecated in Confluent Platform 3.0 and may be removed in a release after Confluent Platform 3.1. To export data from Kafka to HDFS and Hive, we recommend Kafka Connect with the Confluent HDFS connector as an alternative.

Other Notable Changes ¶

We have also added some additional features to the Confluent Platform in Confluent Platform 3.0:

Preview release of Python Client. We’re introducing a fully supported, up to date client for Python. Over time, we will keep this client up to date with the latest Java clients, including support for new broker versions and Kafka features. Try it out and send us feedback, through the Confluent Platform Mailing List.
Security for Schema Registry. The Schema Registry now supports SSL both at its REST layer (via HTTPS) and in its communication with Kafka. The REST layer is the public, user-facing component, and the “communication with Kafka” is the backend communication with Kafka where schemas are stored.
Security for Kafka REST Proxy. The REST Proxy now supports REST calls over HTTPS. The REST Proxy does not currently support Kafka security.
We’ve removed the “beta” designation from the new Java consumer and encourage users to begin migration away from the old consumers (note that it is required to make use of Kafka security extensions).
The old Scala producer has been deprecated. Users should migrate to the Java producer as soon as possible.

Confluent Platform 2.0.1 Release Notes ¶

Confluent Platform 2.0.1 contains a number of bug fixes included in the Kafka 0.9.0.1 release. Details of the changes to Kafka in this patch release are found in the Kafka Release Notes. Details of the changes to other components of the Confluent Platform are listed in the respective changelogs such as Kafka REST Proxy changelog.

Here is a quick overview of the notable Kafka-related bug fixes in the release, grouped by the affected functionality:

New Java consumer ¶

KAFKA-2978: Topic partition is not sometimes consumed after rebalancing of consumer group
KAFKA-3179: Kafka consumer delivers message whose offset is earlier than sought offset.
KAFKA-3157: Mirror maker doesn’t commit offset with new consumer when there is no more messages
KAFKA-3170: Default value of fetch_min_bytes in new consumer is 1024 while doc says it is 1

Compatibility ¶

KAFKA-2695: Handle null string/bytes protocol primitives
KAFKA-3100: Broker.createBroker should work if json is version > 2, but still compatible
KAFKA-3012: Avoid reserved.broker.max.id collisions on upgrade

Security ¶

KAFKA-3198: Ticket Renewal Thread exits prematurely due to inverted comparison
KAFKA-3152: kafka-acl doesn’t allow space in principal name
KAFKA-3169: Kafka broker throws OutOfMemory error with invalid SASL packet
KAFKA-2878: Kafka broker throws OutOfMemory exception with invalid join group request
KAFKA-3166: Disable SSL client authentication for SASL_SSL security protocol

Performance/memory usage ¶

KAFKA-3003: The fetch.wait.max.ms is not honored when new log segment rolled for low volume topics
KAFKA-3159: Kafka consumer 0.9.0.0 client poll is very CPU intensive under certain conditions
KAFKA-2988: Change default configuration of the log cleaner
KAFKA-2973: Fix leak of child sensors on remove

Topic deletion ¶

KAFKA-2937: Topics marked for delete in Zookeeper may become undeletable

Confluent Platform 2.0.0 Release Notes ¶

The Confluent Platform 2.0.0 release includes a range of new features over the previous release Confluent Platform 1.0.x.

Security ¶

This release includes three key security features built directly within Kafka itself. First we now authenticate users using either Kerberos or TLS client certificates, so we now know who is making each request to Kafka. Second we have added a unix-like permissions system (ACLs) to control which users can access which data. Third, we support encryption on the wire using TLS to protect sensitive data on an untrusted network.

For more information on security features and how to enable them, see Security.

Kafka Connect ¶

Kafka Connect facilitates large-scale, real-time data import and export for Kafka. It abstracts away common problems that each such data integration tool needs to solve to be viable for 24x7 production environments: fault tolerance, partitioning, offset management and delivery semantics, operations, and monitoring. It offers the capability to run a pool of processes that host a large number of Kafka connectors while handling load balancing and fault tolerance.

Confluent Platform includes a file connector for importing data from text files or exporting to text files, JDBC connector for importing data from relational databases and an HDFS connector for exporting data to HDFS / Hive in Avro and Parquet formats.

To learn more about Kafka Connect and the available connectors, see Kafka Connect.

User Defined Quotas ¶

Confluent Platform 2.0 and Kafka 0.9 now support user-defined quotas. Users have the ability to enforce quotas on a per-client basis. Producer-side quotas are defined in terms of bytes written per second per client id while consumer quotas are defined in terms of bytes read per second per client id.

Learn more about user defined quotas in the Enforcing Client Quotas section of the post-deployment documentation.

New Consumer ¶

This release introduces beta support for the newly redesigned consumer client. At a high level, the primary difference in the new consumer is that it removes the distinction between the “high-level” ZooKeeper-based consumer and the “low-level” SimpleConsumer APIs, and instead offers a unified consumer API.

The new consumer allows the use of the group management facility (like the older high-level consumer) while still offering better control over offset commits at the partition level (like the older low-level consumer). It offers pluggable partition assignment amongst the members of a consumer group and ships with several assignment strategies. This completes a series of projects done in the last few years to fully decouple Kafka clients from Zookeeper, thus entirely removing the consumer client’s dependency on ZooKeeper.

To learn how to use the new consumer, refer to the Kafka Consumers documentation or the API docs.

New Client Library - librdkafka ¶

In this release of the Confluent Platform we are packaging librdkafka. librdkafka is a C/C++ library implementation of the Apache Kafka protocol, containing both Producer and Consumer support. It was designed with message delivery reliability and high performance in mind, current figures exceed 800,000 msgs/second for the producer and 3 million msgs/second for the consumer.

You can learn how to use librdkafka side-by-side with the Java clients in our Kafka Clients documentation.

Proactive Support ¶

Proactive Support is a component of the Confluent Platform that improves Confluent’s support for the platform by collecting and reporting support metrics (“Metrics”) to Confluent. Proactive Support is enabled by default in the Confluent Platform. We do this to provide proactive support to our customers, to help us build better products, to help customers comply with support contracts, and to help guide our marketing efforts. With Metrics enabled, a Kafka broker is configured to collect and report certain broker and cluster metadata (“Metadata”) every 24 hours about your use of the Confluent Platform (including without limitation, your remote internet protocol address) to Confluent, Inc. (“Confluent”) or its parent, subsidiaries, affiliates or service providers.

Proactive Support is enabled by default in the Confluent Platform, but you can disable it by following the instructions in Proactive Support documentation. Please refer to the Confluent Privacy Policy for an in-depth description of how Confluent processes such information.

Compatibility Notes ¶

Kafka 0.9 no longer supports Java 6 or Scala 2.9. If you are still on Java 6, consider upgrading to a supported version.
Configuration parameter replica.lag.max.messages was removed. Partition leaders will no longer consider the number of lagging messages when deciding which replicas are in sync.
Configuration parameter replica.lag.time.max.ms now refers not just to the time passed since last fetch request from replica, but also to time since the replica last caught up. Replicas that are still fetching messages from leaders but did not catch up to the latest messages in replica.lag.time.max.ms will be considered out of sync.
MirroMaker no longer supports multiple target clusters. As a result it will only accept a single --consumer.config parameter. To mirror multiple source clusters, you will need at least one MirrorMaker instance per source cluster, each with its own consumer configuration.
Broker IDs above 1000 are now reserved by default to automatically assigned broker IDs. If your cluster has existing broker IDs above that threshold make sure to increase the reserved.broker.max.id broker configuration property accordingly.

How to Download ¶

Confluent Platform is available for download at https://www.confluent.io/download/. See the Installing and Upgrading section for detailed information.

To upgrade Confluent Platform to a newer version, check the Upgrade documentation.

Questions?¶

If you have questions regarding this release, feel free to reach out via the Confluent Platform mailing list. Confluent customers are encouraged to contact our support directly.