I'm trying to automate the process of horizontal scale up and scale down of elasticsearch nodes in kubernetes cluster.
Initially, I deployed an elasticsearch cluster (3 master, 3 data & 3 ingest nodes) on a Kubernetes cluster. Where, cluster.initial_master_nodes
was:
cluster.initial_master_nodes:
- master-a
- master-b
- master-c
Then, I performed scale down operation, reduced the number of master node 3 to 1 (unexpected, but for testing purpose). While doing this, I deleted master-c
, master-b
nodes and restarted master-a
node with the following setting:
cluster.initial_master_nodes:
- master-a
Since the elasticsearch nodes (i.e. pods) use persistant volume, after restarting the node, the master-a
slowing the following logs:
"message": "master not discovered or elected yet, an election requires at least 2 nodes with ids from [TxdOAdryQ8GAeirXQHQL-g, VmtilfRIT6KDVv1R6MHGlw, KAJclUD2SM6rt9PxCGACSA], have discovered [] which is not a quorum; discovery will continue using [] from hosts providers and [{master-a}{VmtilfRIT6KDVv1R6MHGlw}{g29haPBLRha89dZJmclkrg}{10.244.0.95}{10.244.0.95:9300}{ml.machine_memory=12447109120, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 5, last-accepted version 40 in term 5" }
Seems like it's trying to find master-b
and master-c
.
Questions:
- How to overwrite cluster settings so that
master-a
won't search for these deleted nodes?
Best Answer
The
cluster.initial_master_nodes
setting only has an effect the first time the cluster starts up, but to avoid some very rare corner cases you should never change its value once you've set it and generally you should remove it from the config file as soon as possible. From the reference manual regardingcluster.initial_master_nodes
:Aside from that, Elasticsearch uses a quorum-based election protocol and says the following:
You have stopped two of your three master-eligible nodes at the same time, which is more than half of them, so it's expected that the cluster no longer works.
The reference manual also contains instructions for removing master-eligible nodes which you have not followed:
It goes on to describe how to safely remove the unwanted nodes from the voting configuration using
POST /_cluster/voting_config_exclusions/node_name
when scaling down to a single node.