
Kafka and ZooKeeper form a powerful team. Think of them as a company's executive and operations departments. ZooKeeper is the executive team, making all the high-level decisions and ensuring the company is stable, while Kafka is the operations team, handling the day-to-day work and interacting directly with clients.

ZooKeeper's Role: The Cluster's "Executive"

ZooKeeper is the central nervous system of the Kafka cluster. Its main job is to maintain a synchronized, up-to-date view of the entire cluster's state. It manages all the behind-the-scenes coordination that makes the system resilient and reliable.

Here’s what ZooKeeper does:

  1. Broker Management: It keeps a registry of all active Kafka brokers. Each broker maintains a session with ZooKeeper via regular heartbeats, so if a broker fails, ZooKeeper detects it as soon as the heartbeats stop.
  2. Leader Election: ZooKeeper elects the cluster's controller broker, and when a broker goes down, the controller (coordinating through ZooKeeper) picks a new leader for any partitions the failed broker was leading. This keeps the system online and functioning smoothly.
  3. Cluster State: It stores and tracks critical cluster information, including:
  • The list of topics and their configurations (like the number of partitions and replication factor).
  • The In-Sync Replicas (ISRs), which are the replicas that are fully caught up with the leader partition.
  • Access Control Lists (ACLs) and quotas for security.

In short, ZooKeeper handles all the metadata and coordination, providing the stable foundation that Kafka needs to operate.

Kafka's Role: The "Operations Team" 

Kafka is the workhorse that handles all the data and direct client interactions. Once ZooKeeper has set up the cluster and decided on the leadership and topology, Kafka gets to work.

Kafka is responsible for:

  • Client Connections: It manages all connections with producers (clients that write data) and consumers (clients that read data).
  • Data Handling: It handles the actual topic logs, which are the partitioned, ordered streams of data. This includes writing new messages, replicating them, and serving them to consumers.
  • Consumer Groups: It tracks which messages each consumer group has processed using offsets, ensuring that messages are delivered correctly and efficiently.
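For example, once a cluster is running, you can inspect a consumer group's committed offsets with the kafka-consumer-groups.sh tool that ships with Kafka (my-group below is a hypothetical group name):

    kafka/bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group my-group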

Together, ZooKeeper and Kafka create a robust, fault-tolerant messaging system. ZooKeeper provides the brain and stability, while Kafka provides the brawn and direct client interaction.

Using Zookeeper: Understanding the files created

Besides myid, the rest of the files should remain untouched. They are managed by ZooKeeper.

  • myid: a file containing this server's ID. That's how ZooKeeper knows its identity.
  • version-2/: folder that holds the ZooKeeper data.
  • acceptedEpoch and currentEpoch: files internal to ZooKeeper's leader-election protocol.
  • log.X: ZooKeeper transaction log files.
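For illustration, a data directory laid out as in this guide might contain something like the following (the exact epoch, log, and snapshot file names will vary):

    ls ~/data/zookeeper
    # myid  version-2
    ls ~/data/zookeeper/version-2
    # acceptedEpoch  currentEpoch  log.100000001  snapshot.0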

Zookeeper Architecture: Quorum sizing

  • ZooKeeper needs a strict majority of servers up in order to win votes, so it can only operate while a quorum is available
  • Therefore, ZooKeeper ensembles have 1, 3, 5, 7, 9, … (2N+1) servers
  • This allows for 0, 1, 2, 3, 4, … or N servers to go down, respectively
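For example, a 5-server ensemble needs 3 votes to form a majority, so it keeps working with up to 2 servers down; a 6-server ensemble needs 4 votes and still tolerates only 2 failures, which is why even sizes add cost without adding fault tolerance.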

Diagrams: quorum sizing for ensembles of 1 and 3 servers, and of 5 and 7 servers.

Cluster Architecture Overview

Step 1: Installation: (Follow the same steps on all the machines in the cluster)

  • Update the package list on your Linux system.
    sudo apt update
  • Install the default OpenJDK (Java Development Kit), which is required to run Apache Kafka.
    sudo apt install default-jdk
  • Check the Java version to verify that the installation was successful.
    java -version
  • Download the Apache Kafka source code (version 3.5.1) from the Apache Kafka website.
    wget https://downloads.apache.org/kafka/3.5.1/kafka-3.5.1-src.tgz
  • Extract the downloaded Kafka source code archive.
    tar -xvf kafka-3.5.1-src.tgz
  • Rename the directory containing the extracted Kafka source code from "kafka-3.5.1-src" to "kafka" for convenience.
    mv kafka-3.5.1-src kafka
  • Change the current working directory to the "kafka" directory, which contains the Kafka source code and configuration files.
    cd kafka/
  • Enter the following command.

    ./gradlew jar -PscalaVersion=2.13.10

    This command uses the Gradle build tool (specified by "./gradlew") to build the Kafka JAR files, and it sets the Scala version to 2.13.10 during the build process. Building the JAR files is necessary before running Kafka.

Step 2: Zookeeper Configuration: (Follow the same steps on all the machines in the cluster except for the last step where myid is set)

  • Remove the default ‘zookeeper.properties’ file from the config/ directory.
    rm config/zookeeper.properties
  • Now, create a new ‘zookeeper.properties’ file in the config/ directory.
    sudo vim config/zookeeper.properties
  • Enter the following configurations in the zookeeper.properties file; each setting is explained below, and a sketch of the full file follows the list:

  1. Data Directory (dataDir): This setting specifies the directory where ZooKeeper stores its snapshot and transaction log data. In this case, it's set to /home/azureuser/data/zookeeper. This is where ZooKeeper persists its data.
  2. Client Port (clientPort): This is the port on which clients will connect to the ZooKeeper ensemble. In this configuration, it's set to 2181, the default port for ZooKeeper clients.
  3. Max Client Connections (maxClientCnxns): This defines the maximum number of connections per IP address. Setting it to 0 disables the limit, which is acceptable for a non-production configuration.
  4. Tick Time (tickTime): The tick is the basic time unit, in milliseconds, that ZooKeeper uses for timekeeping, heartbeats, and timeouts. Here, tickTime is set to 2000 milliseconds (2 seconds).
  5. Init Limit (initLimit): The initLimit specifies the number of ticks within which followers must connect and synchronize with the leader. In this configuration, it's set to 10 ticks, which translates to 20 seconds (10 * 2 seconds).
  6. Sync Limit (syncLimit): The syncLimit defines the number of ticks that can pass between sending a request and getting an acknowledgment. It's set to 5 ticks, meaning 10 seconds (5 * 2 seconds).
  7. ZooKeeper Servers (server.1, server.2, server.3): These lines define the ZooKeeper servers in the ensemble. Each server.x entry consists of three components: the hostname or IP address, the quorum port, and the leader election port. In this example, you have a three-node ensemble with each node specified by its IP address and ports.
  8. Admin Server Configuration (admin.enableServer, admin.serverPort): The embedded admin server is disabled here (admin.enableServer=false) to avoid port conflicts. You can enable it by setting admin.enableServer to true and specifying the port with admin.serverPort.
  9. 4-Letter Words (4lw.commands.whitelist): This setting configures the list of 4-letter-word (4lw) commands that the ZooKeeper server accepts. The * in this configuration allows all of them. These are short, textual commands used to interact with ZooKeeper for diagnostics and management.
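A sketch of the resulting file, assuming the three-node ensemble on 10.1.0.4, 10.1.0.5, and 10.1.0.6 used later in this guide and the standard 2888 (quorum) and 3888 (leader election) ports; the exact server-to-IP mapping is an assumption, so adjust it to your machines:

    # Where ZooKeeper persists snapshots and transaction logs
    dataDir=/home/azureuser/data/zookeeper
    # Port that clients connect to
    clientPort=2181
    # 0 disables the per-IP connection limit (non-production setting)
    maxClientCnxns=0
    # Basic time unit in milliseconds
    tickTime=2000
    # 10 ticks (20 s) for followers to connect and sync
    initLimit=10
    # 5 ticks (10 s) between a request and its acknowledgment
    syncLimit=5
    # host:quorumPort:electionPort for each ensemble member
    server.1=10.1.0.4:2888:3888
    server.2=10.1.0.5:2888:3888
    server.3=10.1.0.6:2888:3888
    # Embedded admin server disabled to avoid port conflicts
    admin.enableServer=false
    # Allow all four-letter-word diagnostic commands
    4lw.commands.whitelist=*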
  • Create a directory structure for ZooKeeper data.
    mkdir -p data/zookeeper
  • Change the ownership of the data/ directory and its contents to the user ‘azureuser’. This is important because ZooKeeper will write data to this directory, and the user running ZooKeeper needs the necessary permissions.

    sudo chown -R azureuser:azureuser data/

  • Create a file named myid inside the ~/data/zookeeper/ directory and set its content to "3". The myid file is used to identify the ZooKeeper server's unique ID in a multi-server ensemble. In this case, you are setting the ID of this ZooKeeper server to 3.

    echo "3" > ~/data/zookeeper/myid

Note: Set a different number in myid on each member of the ZooKeeper ensemble, matching that server's server.x entry in zookeeper.properties.

Step 3: Creating zookeeper service: (Follow the same steps on all the machines in the cluster)

  • Create the /etc/init.d/zookeeper file using the vim text editor. In this file, we’ll define how the ZooKeeper service should start, stop, and restart, specifying the actions to be taken when the service is managed by the system's init process.

    sudo vim /etc/init.d/zookeeper

  • Copy and paste the ZooKeeper init script into the /etc/init.d/zookeeper file; a minimal sketch of such a script follows.
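This sketch assumes the paths used in this guide (Kafka under /home/azureuser/kafka, with output sent to kafka/logs/zookeeper.out, the file read in Step 4) and the azureuser account; adapt it to your environment:

    #!/bin/bash
    ### BEGIN INIT INFO
    # Provides:          zookeeper
    # Required-Start:    $remote_fs $syslog
    # Required-Stop:     $remote_fs $syslog
    # Default-Start:     2 3 4 5
    # Default-Stop:      0 1 6
    # Short-Description: Apache ZooKeeper server
    ### END INIT INFO

    KAFKA_HOME=/home/azureuser/kafka   # assumed install path
    DAEMON_USER=azureuser              # assumed service account

    case "$1" in
      start)
        echo "Starting ZooKeeper..."
        # Run the stock ZooKeeper launcher in the background as the service account
        sudo -u "$DAEMON_USER" nohup "$KAFKA_HOME/bin/zookeeper-server-start.sh" \
          "$KAFKA_HOME/config/zookeeper.properties" > "$KAFKA_HOME/logs/zookeeper.out" 2>&1 &
        ;;
      stop)
        echo "Stopping ZooKeeper..."
        sudo -u "$DAEMON_USER" "$KAFKA_HOME/bin/zookeeper-server-stop.sh"
        ;;
      restart)
        "$0" stop
        sleep 2
        "$0" start
        ;;
      status)
        # Crude check: look for a process started with zookeeper.properties
        if pgrep -f zookeeper.properties > /dev/null; then
          echo "ZooKeeper is running."
        else
          echo "ZooKeeper is not running."
        fi
        ;;
      *)
        echo "Usage: $0 {start|stop|restart|status}"
        exit 1
        ;;
    esac
    exit 0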

  • Make the /etc/init.d/zookeeper script executable so the system can run it as a service.

    sudo chmod +x /etc/init.d/zookeeper

  • Change the ownership of the /etc/init.d/zookeeper script to the root user and root group.

    sudo chown root:root /etc/init.d/zookeeper

  • Add the ZooKeeper script to the default runlevels, which means that ZooKeeper will start automatically when your system boots. The update-rc.d tool manages the symlinks in the /etc/rc*.d/ directories to control service execution during system startup and shutdown.

    sudo update-rc.d zookeeper defaults

  • Start the ZooKeeper service using the system's service management utility. Once configured and added to the default runlevels, you can start the service this way.

    sudo service zookeeper start

  • Check the status of the ZooKeeper service, indicating whether it's running, stopped, or encountering any issues.

    sudo service zookeeper status


Step 4: Checking Zookeeper connectivity:

  • Run the following command, which will send the "stat" command to the ZooKeeper server running on localhost and display the server's status information as a response.

    echo "stat" | nc localhost 2181 ; echo


You can also run `echo "stat" | nc <other_host’s_name/IP> 2181 ; echo` to check the status of the other ZooKeeper servers in the cluster.
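Because the configuration whitelists all four-letter-word commands (4lw.commands.whitelist=*), you can use other 4lw checks as well; for example, a healthy server answers ruok with imok:

    echo "ruok" | nc localhost 2181 ; echo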

  • Run the following command to display the contents of the zookeeper.out log file from the specified location. It contains the output generated by the ZooKeeper server, which can help diagnose issues or monitor ZooKeeper's behavior.

    cat kafka/logs/zookeeper.out


The logs indicate:

  • Server 1 starts in the "LOOKING" state, meaning it is participating in the leader election. It then moves to the "FOLLOWING" state, accepting server 2 as the leader.
  • Server 2 wins the election and is now in the "LEADING" state.
  • Server 3 is in the "FOLLOWING" state and is following server 2.

These logs show that the ZooKeeper servers have successfully elected a leader and are running as a functioning ensemble, so your servers are connected and working as expected.

  • Start a ZooKeeper shell and connect to a ZooKeeper server running on localhost at port 2181. The ZooKeeper shell allows you to interact with the ZooKeeper server to perform various operations, such as creating, deleting, and reading ZooKeeper znodes, which are like nodes or paths in a hierarchical data structure.

    kafka/bin/zookeeper-shell.sh localhost:2181


  • Use the following commands inside the ZooKeeper shell to list the root znodes, create a znode, list again to confirm it exists, and exit the shell:

    ls /

    create /my-node "some data"

    ls /

    quit

  • Run the same shell commands on each ZooKeeper machine to confirm that the ensemble is properly synchronized; a znode created on one server should be visible from all of them.

Step 5: Attaching a new disk to Kafka brokers:


Attach a new disk to your Kafka brokers for storing Kafka’s data.
The steps may vary depending on the cloud platform.

Step 6: Mounting the newly created disk to a specific path:

  • Switch to the root user.
    sudo su
  • List the block devices to identify the newly attached disk (in this example, /dev/sda).
    lsblk
  • Install the XFS utilities.
    apt-get install -y xfsprogs
  • Check whether the disk already contains data or a filesystem.
    file -s /dev/sda
  • Inspect the disk with fdisk.
    fdisk /dev/sda
  • Create an XFS filesystem on the disk.
    mkfs.xfs -f /dev/sda
  • Create the directory where Kafka will store its data.
    mkdir -p /home/azureuser/data/kafka
  • Mount the disk at that path.
    mount -t xfs /dev/sda /home/azureuser/data/kafka
  • Give azureuser ownership of the mounted directory.
    chown -R azureuser:azureuser /home/azureuser/data/kafka
  • Verify that the disk is mounted at the expected path.
    df -h /home/azureuser/data/kafka
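Note that a mount created this way does not persist across reboots, and Step 7 below reboots the machine. One way to make it permanent, assuming the same device and mount point as above, is an /etc/fstab entry like the following (using the filesystem UUID from blkid instead of /dev/sda is more robust, since device names can change):

    /dev/sda  /home/azureuser/data/kafka  xfs  defaults,nofail  0  2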

Step 7: Kafka configuration: (Follow the same steps on all the machines in the cluster)

  • To ensure that the user azureuser has the necessary permissions to work with the Kafka data and configuration files, change the ownership of the ~/data/kafka directory.

    sudo chown -R azureuser:azureuser ~/data/kafka

  • Allow all users to have a higher limit for open files (file descriptors). This can be useful for applications like Kafka, which might require a large number of open file descriptors.

    echo "* hard nofile 100000
    * soft nofile 100000" | sudo tee --append /etc/security/limits.conf

  • Reboot your system to allow your configurations to take effect.

    sudo reboot

  • SSH back into your server.
  • Start the zookeeper service.

    sudo service zookeeper start

  • Remove the default configuration file of Kafka (run this from the kafka/ directory, as in the earlier steps).

    rm config/server.properties

  • Create a new server.properties file for Kafka.

    vim config/server.properties

  • Enter the following configurations in that file. Make sure to use a different broker.id and advertised.listeners value on each member of the cluster; a minimal sketch of the file follows.
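This sketch makes several assumptions: the broker shown is 10.1.0.5, the data disk is mounted at /home/azureuser/data/kafka (Step 6), and the cluster registers under a /kafka chroot in ZooKeeper (Step 9 lists /kafka/brokers/ids, which implies this). Adjust broker.id and advertised.listeners on every machine:

    # Unique per broker
    broker.id=1
    # Listen on all interfaces; advertise this machine's own address to clients
    listeners=PLAINTEXT://0.0.0.0:9092
    advertised.listeners=PLAINTEXT://10.1.0.5:9092
    # Data directory on the disk mounted in Step 6
    log.dirs=/home/azureuser/data/kafka
    # Reasonable defaults for a three-node cluster
    num.partitions=3
    default.replication.factor=3
    min.insync.replicas=2
    # All three ZooKeeper nodes, under the /kafka chroot
    zookeeper.connect=10.1.0.4:2181,10.1.0.5:2181,10.1.0.6:2181/kafka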

  • Launch Kafka and make sure the startup logs look okay.

    bin/kafka-server-start.sh config/server.properties

Step 8: Creating Kafka service: (Follow the same steps on all the machines in the cluster)

  • Install Kafka boot scripts

    sudo vim /etc/init.d/kafka

  • Copy and paste the Kafka init script into this file; a minimal sketch of such a script follows.
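This sketch mirrors the ZooKeeper script above and makes the same assumptions about paths and the azureuser account; adapt it to your environment:

    #!/bin/bash
    ### BEGIN INIT INFO
    # Provides:          kafka
    # Required-Start:    $remote_fs $syslog zookeeper
    # Required-Stop:     $remote_fs $syslog
    # Default-Start:     2 3 4 5
    # Default-Stop:      0 1 6
    # Short-Description: Apache Kafka broker
    ### END INIT INFO

    KAFKA_HOME=/home/azureuser/kafka   # assumed install path
    DAEMON_USER=azureuser              # assumed service account

    case "$1" in
      start)
        echo "Starting Kafka..."
        # Run the stock Kafka launcher in the background as the service account
        sudo -u "$DAEMON_USER" nohup "$KAFKA_HOME/bin/kafka-server-start.sh" \
          "$KAFKA_HOME/config/server.properties" > "$KAFKA_HOME/logs/kafka.out" 2>&1 &
        ;;
      stop)
        echo "Stopping Kafka..."
        sudo -u "$DAEMON_USER" "$KAFKA_HOME/bin/kafka-server-stop.sh"
        ;;
      restart)
        "$0" stop
        sleep 2
        "$0" start
        ;;
      status)
        # Crude check: look for a process started with server.properties
        if pgrep -f server.properties > /dev/null; then
          echo "Kafka is running."
        else
          echo "Kafka is not running."
        fi
        ;;
      *)
        echo "Usage: $0 {start|stop|restart|status}"
        exit 1
        ;;
    esac
    exit 0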

  • Make the /etc/init.d/kafka script executable so the system can run it as a service.

    sudo chmod +x /etc/init.d/kafka

  • Change the ownership of the /etc/init.d/kafka script to the root user and root group.

sudo chown root:root /etc/init.d/kafka

  • Add the Kafka script to the default runlevels, which means that Kafka will start automatically when your system boots. The update-rc.d tool manages the symlinks in the /etc/rc*.d/ directories to control service execution during system startup and shutdown.

    sudo update-rc.d kafka defaults

  • Start the Kafka service using the system's service management utility. Once configured and added to the default runlevels, you can start the service this way.

    sudo service kafka start

  • Check the status of the Kafka service, indicating whether it's running, stopped, or encountering any issues.

    sudo service kafka status

  • Verify that Kafka is listening on its client port (9092).

    nc -vz localhost 9092

Step 9: Checking Kafka connectivity:

  • View the last few lines of the "server.log" file located in the "/home/azureuser/kafka/logs" directory.

    tail /home/azureuser/kafka/logs/server.log


  • Open the zookeeper shell

    kafka/bin/zookeeper-shell.sh localhost:2181

  • List the children of the /kafka/brokers/ids znode, which typically stores information about Kafka broker registrations.

    ls /kafka/brokers/ids


  • Create a topic named "second_topic" with the specified replication factor and number of partitions on your Kafka cluster.

    kafka/bin/kafka-topics.sh --bootstrap-server 10.1.0.5:9092,10.1.0.4:9092,10.1.0.6:9092 --create --topic second_topic --replication-factor 3 --partitions 3


  • Now, list all the topics that currently exist in your Kafka cluster.

    kafka/bin/kafka-topics.sh --bootstrap-server 10.1.0.5:9092,10.1.0.4:9092,10.1.0.6:9092 --list


  • Now delete the `second_topic`

    kafka/bin/kafka-topics.sh --bootstrap-server 10.1.0.5:9092,10.1.0.4:9092,10.1.0.6:9092 --delete --topic second_topic


  • List all the topics again

    kafka/bin/kafka-topics.sh --bootstrap-server 10.1.0.5:9092,10.1.0.4:9092,10.1.0.6:9092 --list


Note: Follow these steps on all the machines inside the Kafka cluster to check the connectivity and synchronization of data.
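As an extra end-to-end check (not part of the original steps), you can produce and consume a few messages with the console tools that ship with Kafka. test_topic below is a hypothetical topic name; the broker addresses are the same ones used above:

    kafka/bin/kafka-topics.sh --bootstrap-server 10.1.0.5:9092 --create --topic test_topic --replication-factor 3 --partitions 3
    kafka/bin/kafka-console-producer.sh --bootstrap-server 10.1.0.5:9092 --topic test_topic
    kafka/bin/kafka-console-consumer.sh --bootstrap-server 10.1.0.4:9092 --topic test_topic --from-beginning

Type a few messages into the producer and press Ctrl+C, then run the consumer (here pointed at a different broker) and confirm that the same messages come back.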
