Operations

This section is not an exhaustive guide to running Control Center in production, but it covers the key things to consider before going live.

Hardware

As of this release, Control Center must run on a single machine. The resources needed for this machine depend primarily on how many producers are monitored and how many partitions each producer writes to. The Stream Monitoring functionality of Control Center is implemented as a Kafka Streams application and consequently benefits from ample memory for RocksDB caches and the OS page cache.

Memory

The more memory you give Control Center the better, but we recommend at least 32GB of RAM. The JVM heap size can be fairly small (it defaults to 3GB), but the application needs the additional memory for RocksDB in-memory indexes and caches as well as the OS page cache for faster access to persistent data.

CPUs

The Stream Monitoring functionality of Control Center requires significant CPU power for data verification and aggregation. We recommend at least 8 cores. If you have more cores available, you can increase the number of threads in the Kafka Streams pool (confluent.controlcenter.streams.num.stream.threads) and increase the number of partitions on internal Control Center topics (confluent.controlcenter.internal.topics.partitions) for greater parallelism.
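
As a minimal sketch of how these two settings might be written out, the Python helper below renders them as properties-file lines. The two parameter names come from the paragraph above; the function name render_parallelism_settings and the example values are hypothetical and are not tuning recommendations.

```python
def render_parallelism_settings(stream_threads: int, topic_partitions: int) -> str:
    """Render the two parallelism settings as properties-file lines."""
    if stream_threads < 1 or topic_partitions < 1:
        raise ValueError("both values must be positive integers")
    settings = {
        "confluent.controlcenter.streams.num.stream.threads": stream_threads,
        "confluent.controlcenter.internal.topics.partitions": topic_partitions,
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items())


if __name__ == "__main__":
    # Hypothetical values for a machine with more than 8 cores.
    print(render_parallelism_settings(stream_threads=12, topic_partitions=12))
```

The output can be pasted into the Control Center properties file. Choose the actual values based on the cores you have available.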

Disks

Control Center relies on local state in RocksDB. We recommend at least 300GB of storage space, preferably on SSDs. All local data is kept in the directory specified by the confluent.controlcenter.data.dir config parameter.
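
The memory, CPU, and disk recommendations above can be checked before installation with a short preflight script. The following Python sketch is illustrative only: the function names are hypothetical, it treats 1GB as 10^9 bytes (an assumption, since the text does not say whether GB is decimal or binary), and probe_host relies on os.sysconf, so it is expected to work on Linux and similar Unix systems only.

```python
import os
import shutil
from typing import List, Tuple

GB = 10 ** 9  # assumption: decimal gigabytes

# Minimums taken from the recommendations in this section.
MIN_MEMORY_GB = 32
MIN_CORES = 8
MIN_DISK_GB = 300


def check_hardware(memory_gb: float, cores: int, disk_gb: float) -> List[str]:
    """Return one warning per value that is below the recommended minimum."""
    warnings = []
    if memory_gb < MIN_MEMORY_GB:
        warnings.append(f"memory: {memory_gb:.0f}GB is below {MIN_MEMORY_GB}GB")
    if cores < MIN_CORES:
        warnings.append(f"cores: {cores} is below {MIN_CORES}")
    if disk_gb < MIN_DISK_GB:
        warnings.append(f"disk: {disk_gb:.0f}GB is below {MIN_DISK_GB}GB")
    return warnings


def probe_host(data_dir: str) -> Tuple[float, int, float]:
    """Measure physical memory, CPU count, and total disk capacity for data_dir."""
    memory_gb = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / GB
    cores = os.cpu_count() or 0
    disk_gb = shutil.disk_usage(data_dir).total / GB
    return memory_gb, cores, disk_gb


if __name__ == "__main__":
    # "." is a placeholder; pass the directory set in confluent.controlcenter.data.dir.
    for warning in check_hardware(*probe_host(".")):
        print("WARNING:", warning)
```

An empty result means the host meets all three minimums. It does not confirm that the disk is an SSD.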

Network

Control Center relies heavily on Kafka, so a fast and reliable network is important for performance. Modern data-center networking (1 GbE, 10 GbE) should be sufficient.

OS

Control Center needs many open RocksDB files. Make sure the ulimit for the number of open files (ulimit -n) is at least 16384.
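
To verify the limit from the account that will run Control Center, a check along the following lines may help. This is a sketch using Python's standard resource module (Unix only); the function name open_file_limit_ok is hypothetical.

```python
import resource

REQUIRED_OPEN_FILES = 16384  # minimum from the paragraph above


def open_file_limit_ok(soft_limit: int, required: int = REQUIRED_OPEN_FILES) -> bool:
    """Return True if the soft limit is unlimited or meets the requirement."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= required


if __name__ == "__main__":
    # The soft limit is the value reported by `ulimit -n` in the same session.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    status = "OK" if open_file_limit_ok(soft) else "TOO LOW"
    print(f"open files: soft={soft}, hard={hard} -> {status}")
```

Run the check as the same user and in the same way the service is started, because limits can differ between login shells and service managers.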

JVM

We recommend running the latest version of JDK 1.8 with a 3GB max heap size. JDK 1.7 is also supported.

Kafka

The amount of storage space needed in Kafka depends on how many producers and consumers are being monitored as well as the configured retention and replication settings.

By default, Control Center keeps 3 days' worth of data for the source monitoring topic (named _confluent-monitoring by default) and 24 hours of data for all of its internal topics. This means that you can take Control Center down for maintenance for as long as 24 hours without data loss. You can change these values by setting the confluent.monitoring.interceptor.topic.retention.ms and confluent.controlcenter.internal.topics.retention.ms config parameters.
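
Both retention parameters are expressed in milliseconds, so the durations need to be converted before they are set. The Python sketch below does the arithmetic for the 3-day and 24-hour periods described above, plus a hypothetical 48-hour value for the internal topics. The helper name to_millis is an assumption of this example.

```python
from datetime import timedelta


def to_millis(duration: timedelta) -> int:
    """Convert a duration to whole milliseconds."""
    return duration // timedelta(milliseconds=1)


if __name__ == "__main__":
    # Durations stated in the text above.
    print("confluent.monitoring.interceptor.topic.retention.ms="
          f"{to_millis(timedelta(days=3))}")      # 259200000
    print("confluent.controlcenter.internal.topics.retention.ms="
          f"{to_millis(timedelta(hours=24))}")    # 86400000
    # Hypothetical longer maintenance window of 48 hours.
    print(to_millis(timedelta(hours=48)))         # 172800000
```

Longer retention increases the storage space required in Kafka, so adjust it together with the disk capacity of the brokers.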

By default, Control Center stores three copies of all topic partitions for availability and fault tolerance.

The full set of configuration options is documented in Configuration.