Gracefully take down a Zookeeper Node

zookeeper

A hard disk has begun to fail on one of the nodes in my 3-node Zookeeper cluster, and it is only a matter of time until the disk dies completely. Rather than wait for that, I'd like to remove the node from the cluster gracefully while it is still online.

Turns out Zookeeper is not incredibly well documented; I cannot find out the safe/proper way to remove a node from a cluster via Google or the small amount of documentation I can find on Apache's site.

What steps or CLI commands should I use to gently take down this node such that my 2-node majority will be fine in the interim while I get the disk replaced on the dying node?

Best Answer

You have probably worked around this already, but the question came up in one of my searches, so I wanted to share my input:

  • From the bin directory under ZK_HOME on the failing node, run ./zkServer.sh stop, then replace the drive or carry out any other maintenance you need.
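The step above can be sketched as a small shell function. The function name stop_zk_node and the fallback path /opt/zookeeper are assumptions made for illustration, and the status call is an optional extra check (not part of the step itself) that shows whether the node is currently a leader or a follower before it goes down.

```shell
#!/bin/sh
# Minimal sketch: stop the local ZooKeeper server from its bin directory.
# ZK_HOME must point at your install; /opt/zookeeper is only an assumed fallback.
stop_zk_node() {
    zk_home="${ZK_HOME:-/opt/zookeeper}"
    (
        # Work in a subshell so the caller's working directory is unchanged.
        cd "$zk_home/bin" || exit 1
        # Optional: show this node's current mode (leader or follower).
        # A non-zero exit here (e.g. server not running) is ignored.
        ./zkServer.sh status || true
        # Shut down the local server process.
        ./zkServer.sh stop
    )
}
```

Run stop_zk_node on the failing host only. Once the disk is replaced, ./zkServer.sh start from the same directory brings the node back.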

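Since you have a 3-node cluster, having one node down is OK because the remaining 2 nodes still form a majority. Reads and writes should continue: if the node you stop is a follower, nothing else changes, and if it is the current leader, the remaining nodes elect a new leader automatically. Keep in mind that while the node is down, the cluster cannot tolerate a second failure.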
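The majority rule behind this can be checked with a little arithmetic: an ensemble of n voting nodes needs more than half of them, that is n/2 + 1 with integer division, to stay available. The helper names below are made up for illustration and are not part of ZooKeeper.

```shell
#!/bin/sh
# Minimal sketch of the majority (quorum) arithmetic for an ensemble.
# quorum_size and has_quorum are hypothetical helper names.

# Smallest number of nodes that forms a majority of an ensemble of size $1.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds (exit 0) if $2 live nodes out of an ensemble of $1 form a majority.
has_quorum() {
    [ "$2" -ge "$(quorum_size "$1")" ]
}
```

For a 3-node ensemble, quorum_size 3 gives 2, so has_quorum 3 2 succeeds and has_quorum 3 1 fails, which is why one node can be taken down but not two.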
