How to get Cassandra to automatically rebalance

cassandra

I am looking at a Cassandra cluster, but the administration effort seems quite high. Is there any way to configure Cassandra to rebalance nodes automatically as new machines are added, turned off, or become temporarily unavailable?

Best Answer

Cassandra does rebalance automatically as you add new nodes; the approach is just not very sophisticated. A new node picks the existing node with the highest "load" (see the "nodetool ring" output) and takes a position on the ring that gives it roughly half of that node's work. This does not rebalance the cluster as a whole, but it does minimize the streaming needed to expand the cluster. Because each new node only halves the load of one existing node, this strategy works best if you nearly double the cluster's size with each expansion.
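To make the "split the heaviest node" behavior concrete, here is a minimal Python sketch of the idea. It is a toy model, not Cassandra's code: it assumes a 2**127 token space, one token per node, and that a node's load is proportional to the size of the token range it owns. The function names (spans, bootstrap_token, fractions) are made up for this illustration.

```python
RING_SIZE = 2 ** 127  # assumed token space (RandomPartitioner-style ring)


def spans(tokens):
    """Map each node's token to the size of the range it owns.

    A node owns the range from the previous token (exclusive) up to its own
    token (inclusive), wrapping around the end of the ring.
    """
    ordered = sorted(tokens)
    owned = {}
    for i, token in enumerate(ordered):
        previous = ordered[i - 1]  # index -1 wraps to the last node
        # A lone node has span 0 by subtraction; it really owns the whole ring.
        owned[token] = (token - previous) % RING_SIZE or RING_SIZE
    return owned


def bootstrap_token(tokens):
    """Pick a token that takes over half of the largest range.

    Assumption: load is proportional to range size, so the "heaviest" node
    is simply the one with the largest range.
    """
    owned = spans(tokens)
    heaviest = max(owned, key=owned.get)
    return (heaviest - owned[heaviest] // 2) % RING_SIZE


def fractions(tokens):
    """Return each node's share of the ring, in token order."""
    owned = spans(tokens)
    return [owned[token] / RING_SIZE for token in sorted(owned)]


if __name__ == "__main__":
    ring = [0, RING_SIZE // 2]          # two evenly balanced nodes
    ring.append(bootstrap_token(ring))  # add a third node
    print(fractions(ring))
    ring.append(bootstrap_token(ring))  # add a fourth node
    print(fractions(ring))
```

Starting from two balanced nodes, adding one node leaves the ring split 25% / 50% / 25%, and only after the fourth node joins is it even again at 25% each. That is why doubling the cluster gives the best result with this strategy.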

If you need more nuanced rebalancing, you can move a node's position on the ring with the "nodetool move" command (which is really a wrapper for decommissioning and re-adding the node).
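If you go the manual route, a common way to choose targets is to space the tokens evenly around the ring and then move each node to its slot. The sketch below assumes RandomPartitioner (tokens from 0 to 2**127) and one token per node; the host names are hypothetical placeholders, and balanced_tokens and move_commands are names invented for this example. It only prints the commands, it does not run them.

```python
RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space


def balanced_tokens(node_count):
    """Return evenly spaced tokens, one per node."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def move_commands(hosts):
    """Build one "nodetool move" command line per host."""
    tokens = balanced_tokens(len(hosts))
    return [
        f"nodetool -h {host} move {token}"
        for host, token in zip(hosts, tokens)
    ]


if __name__ == "__main__":
    # Hypothetical host names, used only for illustration.
    hosts = ["node1.example.com", "node2.example.com", "node3.example.com"]
    for command in move_commands(hosts):
        print(command)
```

Check the current tokens in the "nodetool ring" output first, since nodes that already sit at their target token do not need to move.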
