In layman's terms, Zookeeper is a highly available configuration management system meant for distributed systems. Distributed systems solve a lot of problems, but they present their own challenges. What happens when a node in a cluster is not able to serve for any reason? And what if that node was the leader? Zookeeper helps in such situations. Zookeeper keeps track of all the available nodes and knows what lies where. If a leader goes down, it quickly helps elect another leader.
The purpose of this post is to explain how to run multiple Zookeeper instances on a single node/machine. A group of Zookeeper instances working together is known as a Zookeeper ensemble. Running all the instances on a single node means a single point of failure, so this is not advisable at all. A middle ground is also possible: you can have 3 nodes running 7 instances, with 1 small node running one instance and two somewhat bigger nodes running 3 instances each. One would do this to cut costs.
Let’s get into action.
Get the latest release from https://zookeeper.apache.org/releases.html#download, untar it, and keep the folder in one location, say /home/luser/zookeeper/.
I will be setting this up on a Debian machine named deb, inside the /opt directory.
Now the toughest part(?)… configuring Zookeeper. Look for the zoo.cfg file inside the configuration folder named conf (a fresh release ships only zoo_sample.cfg, which you can copy to zoo.cfg). Open the file in your favorite text editor; at a bare minimum you need to configure 3 items:
dataDir
clientPort
server.n (where n is the server/instance number)
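There are other items which you may want to tweak based on your usage. Because all three instances share one host, each needs its own dataDir, its own clientPort, and its own pair of ports in its server.n line (the first port carries quorum traffic between servers, the second is used for leader election). In my case the following was the pattern for 3 Zookeepers on one node: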
For Zookeeper 1
dataDir=/var/zookeeper/data-1
clientPort=2181
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
For Zookeeper 2
dataDir=/var/zookeeper/data-2
clientPort=2182
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
For Zookeeper 3
dataDir=/var/zookeeper/data-3
clientPort=2183
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
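The three listings differ only in dataDir and clientPort, so they can be generated in a loop. Below is a minimal sketch, not the only way to do it. CONF_DIR is an illustrative scratch location (point it at your real conf directory), and the tickTime, initLimit and syncLimit values are the ones from the stock zoo_sample.cfg, assumed here to be suitable.

```shell
#!/bin/sh
# Sketch: write zoo-1.cfg .. zoo-3.cfg for three instances on one machine.
# CONF_DIR and DATA_ROOT are illustrative names, not Zookeeper settings.
set -e

CONF_DIR="${CONF_DIR:-./zk-demo/conf}"
DATA_ROOT="${DATA_ROOT:-/var/zookeeper}"

mkdir -p "$CONF_DIR"

for n in 1 2 3; do
  # Only dataDir and clientPort change per instance; the server.n
  # lines are identical in every file.
  cat > "$CONF_DIR/zoo-$n.cfg" <<EOF
# Timing values taken from zoo_sample.cfg (assumed to be acceptable)
tickTime=2000
initLimit=10
syncLimit=5
dataDir=$DATA_ROOT/data-$n
clientPort=$((2180 + n))
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
EOF
done
```

Running it with CONF_DIR set to the conf folder of your Zookeeper install leaves three ready-to-use files there.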
Had I been setting these three up on different nodes, the configs would have been identical copies of each other.
For each individual Zookeeper node
dataDir=/var/zookeeper/data
clientPort=2181
server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888
The last thing to pay attention to is the myid file. This file goes inside dataDir, so for my nth Zookeeper on a single node it lives at /var/zookeeper/data-n/myid. The file holds the server/instance number, so for the nth instance its content is just the value of n. The myid file for the 3rd instance will contain the number 3. In the case of multiple nodes, the myid file on node-n will have the value n.
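Creating the data directories and myid files by hand is easy to get wrong, so here is a small sketch that does it in a loop. DATA_ROOT is an illustrative variable that defaults to a scratch directory so the snippet can be tried without root; set DATA_ROOT=/var/zookeeper to match the dataDir values used in this post.

```shell
#!/bin/sh
# Sketch: create one data directory per instance and write its myid file.
# DATA_ROOT is an illustrative name; the default is a scratch location.
set -e

DATA_ROOT="${DATA_ROOT:-./zk-demo/data}"

for n in 1 2 3; do
  mkdir -p "$DATA_ROOT/data-$n"
  # myid contains only the instance number, matching the n in server.n
  echo "$n" > "$DATA_ROOT/data-$n/myid"
done
```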
Now that the config part is a bit clearer, there are two ways to start the Zookeeper instances. I changed the zoo.cfg file for the first instance, copied the Zookeeper folder from home to /opt/zookeeper-1/, and repeated this for the other two instances. Then inside each bin dir I just ran zkServer.sh start. You can avoid the copies entirely: create three config files in conf using the pattern zoo-n.cfg, then cd into the bin dir and run zkServer.sh start zoo-n.cfg once for each n to start the servers.
Once the servers are started, you can verify the setup by connecting to each one. For my first server I cd into the bin dir and run zkCli.sh -server 127.0.0.1:2181. For the 2nd instance the command is zkCli.sh -server 127.0.0.1:2182, and so on for the 3rd. Had I set up one node per instance, the command would have been zkCli.sh -server node-ip-or-node-host-name:2181.
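Besides connecting with the client, you can ask each instance for its role. The sketch below assumes the single-folder layout with zoo-n.cfg files in conf, and ZK_BIN is an assumed path to the bin directory that you will need to adjust. In a healthy 3-instance ensemble, one instance should report itself as leader and the other two as follower.

```shell
#!/bin/sh
# Sketch: print the status of each instance started from zoo-n.cfg.
# ZK_BIN is an assumed install path, not something Zookeeper defines.
ZK_BIN="${ZK_BIN:-/opt/zookeeper/bin}"

for n in 1 2 3; do
  cfg="zoo-$n.cfg"
  if [ -x "$ZK_BIN/zkServer.sh" ]; then
    # zkServer.sh takes the config file name as its second argument
    "$ZK_BIN/zkServer.sh" status "$cfg"
  else
    echo "zkServer.sh not found in $ZK_BIN, skipping $cfg"
  fi
done
```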
And that’s it. Your Zookeeper ensemble is up and running!