How to Install Apache Kafka on CentOS/RHEL 7

Apache Kafka is a distributed publish-subscribe messaging platform that is widely used in big data projects and throughout the data analytics life cycle. It is an open-source platform for building real-time data streaming pipelines, designed for reliability, scalability, and durability.

Kafka can run standalone or as a cluster. It stores streaming data in categories called topics. Each topic is split into a number of partitions so that it can scale to large volumes of data, and each partition can have multiple replicas for fault tolerance, much like block replication in HDFS. In a Kafka cluster, the broker is the component that stores the published data.
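As a rough illustration of how partitions spread data, the sketch below maps a message key to a partition by hashing the key and taking the remainder modulo the partition count. The function name partition_for is hypothetical, and cksum is only a stand-in hash: Kafka's default partitioner uses its own murmur2 hash, so the partition numbers printed here will not match a real cluster.

```shell
# Illustrative only: map a message key to a partition number.
# cksum (a CRC) stands in for Kafka's real hash function.
partition_for() {
    local key="$1" partitions="$2" hash
    # cksum prints "<checksum> <byte count>"; keep the checksum only.
    hash=$(printf '%s' "$key" | cksum | cut -d' ' -f1)
    echo $(( hash % partitions ))
}

# The same key always maps to the same partition.
partition_for "user-42" 3
partition_for "user-42" 3
```

The point of the sketch is only that the mapping is deterministic: messages with the same key end up in the same partition, which is what preserves their relative order.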

Zookeeper is a mandatory service for the Kafka release used in this article, because it coordinates the Kafka brokers. It maintains the state of all brokers in the cluster, which both producers and consumers depend on.

In this article, we will explain how to install Apache Kafka on a single-node CentOS 7 or RHEL 7 system.

Installing Apache Kafka in CentOS 7

1. Kafka runs on the Java Virtual Machine, so first install Java on your system. Install OpenJDK 8 using the following yum command, then verify the Java version as shown.

# yum -y install java-1.8.0-openjdk
# java -version
Verify Java Version
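If you want to check the Java version from a script instead of reading it by eye, a minimal sketch follows. The function names java_major_version and installed_java_version are hypothetical helpers, not part of Java or Kafka. The sketch assumes the version string has the usual form printed by java -version, such as "1.8.0_282" or "11.0.10".

```shell
# Extract the major Java version from a version string such as
# "1.8.0_282" (Java 8) or "11.0.10" (Java 11).
java_major_version() {
    local version="$1"
    case "$version" in
        1.*) version="${version#1.}" ;;   # legacy scheme: 1.8.0_282 -> 8.0_282
    esac
    echo "${version%%[._-]*}"             # keep what precedes the first separator
}

# Read the quoted version string from `java -version` (printed on stderr).
installed_java_version() {
    java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

java_major_version "1.8.0_282"   # prints 8
```

On the system prepared above, java_major_version "$(installed_java_version)" should print 8 for the OpenJDK 1.8.0 package.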

2. Next, download a stable release of Apache Kafka from the official website, or use the following wget command to download version 2.7.0 directly, and then extract it. If the mirror below no longer carries this release, older releases are kept in the Apache archive at archive.apache.org.

# wget https://mirrors.estointernet.in/apache/kafka/2.7.0/kafka_2.13-2.7.0.tgz
# tar -xzf kafka_2.13-2.7.0.tgz
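Before extracting an archive fetched from a mirror, it is worth comparing its checksum with the one published by Apache. The sketch below is one way to do that with sha512sum. The function name verify_sha512 is hypothetical, and the expected digest is something you must copy yourself from the official Apache download page; no real digest is shown here.

```shell
# Compare a file's SHA-512 digest with an expected value.
# Returns 0 on a match, 1 on a mismatch or if the file is missing.
verify_sha512() {
    local file="$1" expected="$2" actual
    [ -f "$file" ] || { echo "missing file: $file" >&2; return 1; }
    actual=$(sha512sum "$file" | cut -d' ' -f1)
    # Strip whitespace and lower-case the expected digest so that
    # grouped or upper-case hex compares equal.
    expected=$(printf '%s' "$expected" | tr -d '[:space:]' | tr 'A-F' 'a-f')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file" >&2
        return 1
    fi
}
```

A call would look like verify_sha512 kafka_2.13-2.7.0.tgz "<digest from the Apache site>", followed by tar only if it reports OK.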

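3. Create a symbolic link to the extracted Kafka directory, add the Kafka bin directory to your PATH in the .bash_profile file, and then reload the profile as shown. The commands in this article assume that you are working as root and extracted the archive in /root.

# ln -s /root/kafka_2.13-2.7.0 /root/kafka
# echo 'export PATH=$PATH:/root/kafka/bin' >> ~/.bash_profile
# source ~/.bash_profile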
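Running the echo command more than once adds the same export line to the profile each time. A small sketch of an idempotent alternative is shown below; append_once is a hypothetical helper name, and the path /root/kafka/bin assumes the symbolic link created in this step.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
    local line="$1" file="$2"
    touch "$file"
    # -x matches whole lines, -F treats the line as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example (single quotes keep $PATH unexpanded until the profile is read):
#   append_once 'export PATH=$PATH:/root/kafka/bin' ~/.bash_profile
```

The single quotes matter: with double quotes the shell would expand $PATH at the moment of writing and freeze the current value into the profile.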

4. Next, start Zookeeper, which is bundled with the Kafka package. Since this is a single-node cluster, you can start Zookeeper with its default properties.

# zookeeper-server-start.sh -daemon /root/kafka/config/zookeeper.properties

5. Verify that Zookeeper is accessible by connecting to its port, 2181, with telnet (install it with yum -y install telnet if the command is missing).

# telnet localhost 2181
Verify Zookeeper Port
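Because the daemon is started in the background, the port may not be open the instant the start command returns. The sketch below waits for a port without needing telnet, using bash's built-in /dev/tcp redirection. The function name wait_for_port and its retry defaults are assumptions made for illustration. It only confirms that something accepts TCP connections on the port, not that the service behind it is healthy.

```shell
# Wait until a TCP port accepts connections, or give up after N attempts.
# Uses bash's /dev/tcp redirection, so telnet is not required.
wait_for_port() {
    local host="$1" port="$2" attempts="${3:-10}" i=0
    while [ "$i" -lt "$attempts" ]; do
        i=$((i + 1))
        # The child bash opens the connection and closes it again on exit.
        if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
            echo "$host:$port is accepting connections"
            return 0
        fi
        sleep 1
    done
    echo "$host:$port did not open after $attempts attempts" >&2
    return 1
}
```

For this setup the call would be wait_for_port localhost 2181, and the same helper can be pointed at the Kafka port, 9092, once the broker is started.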

6. Start the Kafka broker with its default properties.

# kafka-server-start.sh -daemon /root/kafka/config/server.properties
Start Kafka Service

7. Verify that Kafka is accessible by connecting to its port, 9092, with telnet.

# telnet localhost 9092
Verify Kafka Port

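8. Next, create a sample topic named tecmint with one partition and a replication factor of 1. The --zookeeper option works with Kafka 2.7.0 but was removed in Kafka 3.0; on newer releases, use --bootstrap-server localhost:9092 instead.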

# kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic tecmint
Create Topic in Kafka
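The replication factor is 1 here because each replica of a partition must live on a different broker, and this cluster has only one. The sketch below captures that rule in a small wrapper that refuses an impossible replication factor before building the arguments used above. The function name topic_create_args is hypothetical, and the broker count is something you pass in yourself; the sketch does not query the cluster.

```shell
# Print the arguments for `kafka-topics.sh --create`, refusing a replication
# factor that is larger than the number of brokers in the cluster.
topic_create_args() {
    local topic="$1" partitions="$2" replication="$3" brokers="$4"
    if [ "$replication" -gt "$brokers" ]; then
        echo "replication factor $replication exceeds broker count $brokers" >&2
        return 1
    fi
    echo "--create --zookeeper localhost:2181 --replication-factor $replication --partitions $partitions --topic $topic"
}

# Single-node cluster from this article: one broker, so one replica.
topic_create_args tecmint 1 1 1
```

The printed arguments can be passed straight to the tool, for example kafka-topics.sh $(topic_create_args tecmint 1 1 1), which works as long as the topic name contains no spaces.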

9. List the topics to confirm that the new topic was created.

# kafka-topics.sh --zookeeper localhost:2181 --list
List Topic in Kafka
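To check for the topic from a script rather than by reading the list, the output can be piped through an exact-line match. The sketch below uses a hypothetical helper named topic_exists and, so that it runs without a broker, feeds it sample text in place of the real --list output.

```shell
# Succeed if the topic name given as $1 appears as a whole line on stdin.
topic_exists() {
    grep -qxF -- "$1"
}

# Sample input standing in for the output of `kafka-topics.sh --list`.
printf 'tecmint\nother-topic\n' | topic_exists tecmint && echo "topic found"
```

Against the running cluster the equivalent would be kafka-topics.sh --zookeeper localhost:2181 --list | topic_exists tecmint. Matching whole lines avoids a false positive from a longer name such as tecmint-old.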

Conclusion

In this article, we have seen how to install a single-node Kafka cluster on CentOS 7. We will see how to install a multi-node Kafka cluster in the next article.

Mohan Sivam
A Solution Architect and Bigdata infrastructure with over 10 years of experience in Information technology.
