How to Install Apache Kafka on CentOS 8

What is Kafka?

Introduction

  • Apache Kafka is a distributed streaming platform.
  • It is a popular distributed message broker designed to efficiently handle large volumes of real-time data.
  • A Kafka cluster is highly scalable and fault-tolerant, and it also offers much higher throughput than other message brokers such as ActiveMQ and RabbitMQ.
  • It is generally used as a publish/subscribe messaging system. Many organizations also use it for log aggregation because it offers persistent storage for published messages.

Three Capabilities of a Streaming Platform

  • Publish and subscribe to streams of records
  • Store streams of records in a fault-tolerant, durable way
  • Process streams of records

Kafka is used for two types of applications

  • Real-time streaming data pipelines that move data between applications.
  • Real-time streaming applications that react to streams of data.

5 Core APIs of Kafka

Here is the list of APIs in the Kafka system.

PRODUCER API

It allows an application to publish a stream of records to one or more topics.

CONSUMER API

It allows an application to subscribe to topics and process the streams of records.

STREAMS API

It transforms input streams into output streams. It allows an application to act as a stream processor, consuming input streams from topics and producing output streams to output topics.

CONNECT API

It helps to build and run reusable producers or consumers that connect Kafka topics to existing applications.

ADMIN API

It helps to manage and inspect Kafka objects such as topics and brokers.

Prerequisites

  • A CentOS 8 server with sudo or root privileges.
  • Kafka requires a server with a minimum of 4 GB of RAM to run.
  • Java must be installed on the server.

Apache Kafka is a distributed streaming platform. It is useful for building real-time streaming data pipelines that move data between systems or applications.

This tutorial will help you install Apache Kafka on CentOS 8 or RHEL 8 Linux systems.

Prerequisites

  • On a newly installed system, it is recommended to complete the initial server setup first.
  • Shell access to the CentOS 8 system with a sudo-privileged account.

Step 1 – Install Java

You must have Java installed on your system to run Apache Kafka. You can install OpenJDK 11 on your machine by executing the following command:

sudo yum install java-11-openjdk.x86_64 -y

Once the package is installed, check the installed Java version using the command below:

java -version
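
If you want to check the version from a script, the sketch below pulls the major version out of the banner that `java -version` prints. The function name `java_major_version` is a hypothetical helper made up for this example, and the sketch assumes the banner contains the usual `version "..."` string.

```shell
# Hypothetical helper: extract the major version from a `java -version` banner line.
# Handles both the legacy "1.8.0_292" scheme and the modern "11.0.9" scheme.
java_major_version() {
    # $1: a banner line such as: openjdk version "11.0.9" 2020-10-20
    ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$ver" in
        1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;
        *)   printf '%s\n' "$ver" | cut -d. -f1 ;;
    esac
}

# `java -version` writes to stderr, so redirect it before reading the first line.
if command -v java >/dev/null 2>&1; then
    banner=$(java -version 2>&1 | head -n 1)
    echo "Detected Java major version: $(java_major_version "$banner")"
else
    echo "java not found on PATH"
fi
```

With the package installed in this step, the reported major version should match the OpenJDK release you chose.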

Step 2 – Download Apache Kafka

Download the Apache Kafka binary files from its official download website. You can also select any nearby mirror to download.

wget http://www-us.apache.org/dist/kafka/2.7.0/kafka_2.13-2.7.0.tgz

Then extract the archive and move it to /usr/local/kafka:

tar xzf kafka_2.13-2.7.0.tgz
sudo mv kafka_2.13-2.7.0 /usr/local/kafka
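
The archive name encodes two versions: the Scala build (2.13) and the Kafka release (2.7.0). As a rough sketch, the hypothetical helpers below build the file name and a download URL from those two values. The sketch assumes the release is served from the Apache archive server, which keeps older releases and may be useful if the mirror used above no longer carries this version.

```shell
# Hypothetical helper: build the release archive name.
# $1: Scala version (e.g. 2.13), $2: Kafka version (e.g. 2.7.0)
kafka_archive_name() {
    printf 'kafka_%s-%s.tgz\n' "$1" "$2"
}

# Hypothetical helper: build a download URL on the Apache archive server.
# Assumes the layout https://archive.apache.org/dist/kafka/<kafka-version>/<file>
kafka_archive_url() {
    printf 'https://archive.apache.org/dist/kafka/%s/%s\n' "$2" "$(kafka_archive_name "$1" "$2")"
}

# Print the URL for the release used in this tutorial.
kafka_archive_url 2.13 2.7.0

# Example download (commented out so the sketch has no side effects):
# wget "$(kafka_archive_url 2.13 2.7.0)"
```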

Step 3 – Set Up Kafka Systemd Unit Files

CentOS 8 uses systemd to manage the state of its services, so we need to create systemd unit files for the ZooKeeper and Kafka services. These let us start and stop both services with systemctl.

First, create a systemd unit file for ZooKeeper with the command below:

sudo vim /etc/systemd/system/zookeeper.service

Add the content below:

[Unit]
Description=Apache Zookeeper server
Documentation=http://zookeeper.apache.org
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
ExecStart=/usr/bin/bash /usr/local/kafka/bin/zookeeper-server-start.sh /usr/local/kafka/config/zookeeper.properties
ExecStop=/usr/bin/bash /usr/local/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Save the file and close it.

Next, create a Kafka systemd unit file using the following command:

sudo vim /etc/systemd/system/kafka.service

Add the content below. Make sure JAVA_HOME matches the Java installation on your system.

[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
Environment="JAVA_HOME=/usr/lib/jvm/jre-11-openjdk"
ExecStart=/usr/bin/bash /usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties
ExecStop=/usr/bin/bash /usr/local/kafka/bin/kafka-server-stop.sh

[Install]
WantedBy=multi-user.target

Save the file and close it.
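
If you are unsure which path to use for JAVA_HOME, one possible approach is to resolve the real location of the `java` binary and strip the trailing `/bin/java`. The helper name `java_home_from_binary` is hypothetical, and the sketch assumes `readlink -f` from GNU coreutils is available, as it is on CentOS 8.

```shell
# Hypothetical helper: derive a Java home directory from the path of the java binary.
# $1: absolute path ending in /bin/java
java_home_from_binary() {
    printf '%s\n' "${1%/bin/java}"
}

if command -v java >/dev/null 2>&1; then
    # readlink -f follows the /etc/alternatives symlinks to the real binary.
    java_home_from_binary "$(readlink -f "$(command -v java)")"
else
    echo "java not found on PATH"
fi
```

The printed directory is a candidate value for the `Environment="JAVA_HOME=..."` line in the unit file.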

Reload the systemd daemon to apply the changes:

sudo systemctl daemon-reload

Step 4 – Start Kafka Server

Kafka requires ZooKeeper, so start the ZooKeeper service first. The unit file created in the previous step uses the script bundled with Kafka to start a single-node ZooKeeper instance.

sudo systemctl start zookeeper

Now start the Kafka server and view the running status:

sudo systemctl start kafka
sudo systemctl status kafka
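
The status output only shows that the process started. To confirm that the services are actually accepting connections, you could poll their ports. The function below is a minimal sketch, not part of Kafka: `wait_for_port` is a hypothetical helper, it relies on bash's /dev/tcp feature, and the ports in the usage comments assume the defaults from the shipped config files (2181 for ZooKeeper, 9092 for Kafka).

```shell
# Hypothetical helper: poll a TCP port until it accepts connections or a timeout expires.
# Relies on bash's /dev/tcp pseudo-device, so run it under bash.
# $1: host, $2: port, $3: timeout in seconds
wait_for_port() {
    host=$1
    port=$2
    remaining=$3
    while [ "$remaining" -gt 0 ]; do
        # The subshell opens and immediately closes a connection attempt.
        if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
            return 0
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 1
}

# Example usage (assumes the default ports):
# wait_for_port localhost 2181 30 && echo "ZooKeeper is accepting connections"
# wait_for_port localhost 9092 30 && echo "Kafka is accepting connections"
```

To have both services start at boot, `sudo systemctl enable zookeeper kafka` is the usual systemd approach.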

All done. You have successfully installed Kafka on CentOS 8. The next part of this tutorial shows how to create topics in the Kafka cluster and work with the Kafka producer and consumer.

Step 5 – Creating Topics in Apache Kafka

Apache Kafka provides multiple shell scripts for working with it. First, create a topic named “testTopic” with a single partition and a single replica:

cd /usr/local/kafka
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic testTopic

Created topic testTopic.
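
To check that the topic works end to end, you can list the topics, publish one message, and read it back with the console tools that ship with Kafka. The sketch below wraps those calls in a hypothetical function, `kafka_smoke_test`. It assumes Kafka lives in /usr/local/kafka as set up above, that ZooKeeper listens on localhost:2181, and that the broker listens on the default localhost:9092.

```shell
# Install location used in this tutorial; override by exporting KAFKA_HOME.
KAFKA_HOME=${KAFKA_HOME:-/usr/local/kafka}

# Hypothetical helper: send one message to a topic and read it back.
# $1: topic name
kafka_smoke_test() {
    if [ ! -x "$KAFKA_HOME/bin/kafka-topics.sh" ]; then
        echo "Kafka scripts not found under $KAFKA_HOME" >&2
        return 1
    fi
    # List topics to confirm the topic exists.
    "$KAFKA_HOME/bin/kafka-topics.sh" --list --zookeeper localhost:2181
    # Publish a single message read from stdin.
    echo "hello kafka" | "$KAFKA_HOME/bin/kafka-console-producer.sh" \
        --bootstrap-server localhost:9092 --topic "$1"
    # Read one message from the start of the topic, then exit.
    "$KAFKA_HOME/bin/kafka-console-consumer.sh" \
        --bootstrap-server localhost:9092 --topic "$1" \
        --from-beginning --max-messages 1
}

# Example usage:
# kafka_smoke_test testTopic
```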

Mel (https://unixcop.com)
Unix/Linux Guru and FOSS supporter
