Introduction
ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications. Each time they are implemented there is a lot of work that goes into fixing the bugs and race conditions that are inevitable.
Because these services are difficult to implement, applications usually skimp on them at first, which makes them brittle in the presence of change and difficult to manage. Even when done correctly, different implementations of these services lead to management complexity when the applications are deployed.
ZooKeeper’s architecture supports high availability through redundant services. If one ZooKeeper instance fails, clients can connect to another server in the cluster. ZooKeeper nodes store their data in a hierarchical name space, much like a file system or a tree data structure. ZooKeeper is used by many well-known companies, including Yahoo, Odnoklassniki, Reddit, NetApp, and eBay.
This article focuses on how to install and configure an Apache ZooKeeper cluster on Linux.
Other related articles:
Apache Kafka: https://unixcop.com/how-to-install-apache-kafka-on-centos8/
Hardware requirements
- For a reliable ZooKeeper service, you should deploy ZooKeeper in a cluster known as an ensemble. As long as a majority of the ensemble is up, the service will be available. Because ZooKeeper requires a majority, it is best to use an odd number of machines. For example, with four machines ZooKeeper can only handle the failure of a single machine; if two machines fail, the remaining two machines do not constitute a majority. However, with five machines ZooKeeper can handle the failure of two machines.
- Apache recommends deploying ZooKeeper on dedicated RHEL servers with dual-core processors, 2 GB of RAM, and 80 GB IDE hard drives.
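The majority rule above can be checked with a little arithmetic. The following is a minimal sketch, not part of ZooKeeper itself; the function names quorum_size and tolerated_failures are made up for this illustration. It computes the majority for an ensemble of n servers as floor(n/2) + 1 and the number of failures that still leaves a majority.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not ZooKeeper commands).

# Majority (quorum) size for an ensemble of $1 servers: floor(n/2) + 1.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Number of servers that can fail while a majority is still running.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

# Print the values for a few ensemble sizes.
for n in 3 4 5 6 7; do
  printf 'servers=%d quorum=%d tolerated_failures=%d\n' \
    "$n" "$(quorum_size "$n")" "$(tolerated_failures "$n")"
done
```

The output shows that going from three to four servers, or from five to six, does not increase the number of tolerated failures, which is why odd ensemble sizes are preferred.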
Software requirements
- CentOS 8 or RHEL, 64-bit operating system.
- Java SE Development Kit 8 or greater (the ZooKeeper 3.7 release used below does not run on older Java versions).
Part | Description |
---|---|
Client | A client is one of the nodes in the distributed application cluster and accesses information from the server. At regular intervals, each client sends a message to the server to let the server know that the client is alive. Similarly, the server sends an acknowledgement when a client connects. If the connected server does not respond, the client automatically redirects the message to another server. |
Server | A server is one of the nodes in the ZooKeeper ensemble and provides all the services to clients. It sends acknowledgements to the client to indicate that the server is alive. |
Ensemble | A group of ZooKeeper servers. The minimum number of nodes required to form an ensemble is 3. |
Leader | The server node that performs automatic recovery if any of the connected nodes fails. Leaders are elected on service startup. |
Follower | A server node that follows the leader's instructions. |
ZooKeeper – Installation
Create a User for ZooKeeper
- As root, create a user called zookeeper
$ useradd zookeeper
- Set password
$ passwd zookeeper
- Your ZooKeeper user is now ready. Switch to it.
$ su - zookeeper
Download the latest stable ZooKeeper
- Download the 3.7.0 release into the user's home folder.
$ wget http://apache.spd.co.il/zookeeper/stable/zookeeper-3.7.0.tar.gz
- Unpack the downloaded archive.
$ tar -xzf zookeeper-3.7.0.tar.gz
Configure the ZooKeeper Server
- The configuration below applies to each node in the cluster.
- In the data directory /var/zookeeper/, create the myid file containing a unique server identifier. For example, the myid file of server 1 would contain the text "1" and nothing more.
- Create the file conf/zoo.cfg with the following configuration:
tickTime=2000
dataDir=/var/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.<id>=<hostname>:2888:3888
server.<id>=<hostname>:2888:3888
. . .
Note: ZooKeeper uses the zoo.cfg file to discover the other nodes in the cluster. In most cases the zoo.cfg file is identical on all nodes.
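Because the file is normally the same on every node, it can be generated once from a list of hostnames. The sketch below is one possible way to do that; render_zoo_cfg is a hypothetical helper written for this article, and the zk*.example.com hostnames are placeholders, not real servers. Server IDs are assigned 1, 2, 3, and so on in argument order, so each node's myid file must contain the ID that matches its own line.

```shell
#!/bin/sh
# Hypothetical helper: print a zoo.cfg for the hostnames given as arguments.
render_zoo_cfg() {
  # Fixed settings, copied from the configuration shown in this article.
  # tickTime is the length of one tick in milliseconds; initLimit and
  # syncLimit are expressed as a number of ticks.
  cat <<'EOF'
tickTime=2000
dataDir=/var/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
EOF
  # One server.<id>=<hostname>:2888:3888 line per host, IDs starting at 1.
  id=1
  for host in "$@"; do
    printf 'server.%d=%s:2888:3888\n' "$id" "$host"
    id=$(( id + 1 ))
  done
}

# Example with placeholder hostnames (assumptions, replace with your own).
render_zoo_cfg zk1.example.com zk2.example.com zk3.example.com
```

To use it, redirect the output to conf/zoo.cfg and copy that file to every node. On the host listed second, for example, /var/zookeeper/myid would then contain only the text 2.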
- As the root user, grant write permissions on the /var/zookeeper folder:
$ chmod -R 777 /var/zookeeper
Starting the server
$ bin/zkServer.sh start
ZooKeeper cluster status check
- To check the status of the ZooKeeper cluster, run the following command on each node:
$ bin/zkServer.sh status
Test
The following command opens a client interface to the service:
/opt/zookeeper/bin/zkCli.sh
To check which nodes are followers and which is the leader, issue the following command on each node:
echo srvr | nc localhost 2181
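The srvr response includes a line beginning with "Mode:" that reports the server's role. The sketch below filters that line out. The function names zk_mode and zk_query_mode are hypothetical helpers, and the sample input in the last line is an abbreviated illustration of the response format, not captured output.

```shell
#!/bin/sh
# Hypothetical helper: read srvr output on stdin and print the value of
# its "Mode:" line (for example leader, follower, or standalone).
zk_mode() {
  awk -F': ' '$1 == "Mode" { print $2 }'
}

# Hypothetical helper: query one server with the srvr command shown above.
# Host and port default to the values used in this article.
zk_query_mode() {
  echo srvr | nc "${1:-localhost}" "${2:-2181}" | zk_mode
}

# Demonstration on illustrative sample lines; prints "follower".
printf 'Zxid: 0x0\nMode: follower\nNode count: 4\n' | zk_mode
```

Running zk_query_mode against each host in the ensemble should report exactly one leader, with the remaining servers reporting follower.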
[root@devops bin]# netstat -ntlup |grep 2181
tcp6 0 0 :::2181 :::* LISTEN 9442/java
[root@devops bin]#
[root@devops bin]# ./zkCli.sh
Connecting to localhost:2181
2018-05-31 02:43:32,917 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.7.0-e5259e437540f349646870ea94dc2658c4e44b3b, built on 03/27/2018 03:55 GMT
…………………………..
Welcome to ZooKeeper!
JLine support is enabled
………………………………..
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0]
[root@devops bin]# ./zkServer.sh stop
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.12/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
[root@sankar-devops bin]#
[root@devops ~]# jps
11064 QuorumPeerMain
11726 Kafka
12062 Jps