I'm trying to set up a test Elasticsearch cluster on 3 separate hosts, using the official 7.2.0 Docker image.
Each container is configured with an elasticsearch.yml which looks like this:
cluster.name: mytest
network.host: "0.0.0.0"
node.name: mytest-10.131.105.90
discovery.seed_hosts:
- "10.131.128.252:9300"
- "10.131.129.28:9300"
- "10.131.105.90:9300"
cluster.initial_master_nodes:
- mytest-10.131.128.252
- mytest-10.131.129.28
- mytest-10.131.105.90
Once each node has started up, it's unable to discover the other nodes, reporting this
{
"type": "server",
"timestamp": "2019-07-04T18:42:18,751+0000",
"level": "WARN",
"component": "o.e.c.c.ClusterFormationFailureHelper",
"cluster.name": "mytest",
"node.name": "mytest-10.131.105.90",
"message": "master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [mytest-10.131.128.252, mytest-10.131.129.28, mytest-10.131.105.90] to bootstrap a cluster: have discovered []; discovery will continue using [10.131.128.252:9300, 10.131.129.28:9300, 10.131.105.90:9300] from hosts providers and [{mytest-10.131.105.90}{qZqV5-4RSduwKNYIOWVB9A}{_nCNwrToRoeNAiWBO1DbGg}{134.209.178.145}{134.209.178.145:9300}{ml.machine_memory=2090500096, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0"
}
Just to repeat that long error with word wrapping…
master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [mytest-10.131.128.252, mytest-10.131.129.28, mytest-10.131.105.90] to bootstrap a cluster: have discovered []; discovery will continue using [10.131.128.252:9300, 10.131.129.28:9300, 10.131.105.90:9300] from hosts providers and [{mytest-10.131.105.90}{qZqV5-4RSduwKNYIOWVB9A}{_nCNwrToRoeNAiWBO1DbGg}{134.209.178.145}{134.209.178.145:9300}{ml.machine_memory=2090500096, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
It doesn't seem to be a networking issue. From inside the container, I can use curl to verify access to ports 9200 and 9300 on the other nodes.
I suspect it's something subtle about the node names, and I was hoping that in writing this question I'd hit upon the answer. Alas, not.
addendum – docker run
My docker run looks like this, simplified a little (${IP} is the host machine's IP address).
docker run --rm --name elasticsearch \
-p ${IP}:9200:9200 -p ${IP}:9300:9300 \
--network host \
my-elasticsearch:7.2.0 \
/usr/local/bin/start-clustered-es.sh
Each container is running on a separate machine. start-clustered-es.sh simply writes the elasticsearch.yml file as outlined above, so each node starts with the same config. Once the file is written, it calls the base container's startup script with exec /usr/local/bin/docker-entrypoint.sh eswrapper
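For readers who want to reproduce the setup, here is a rough sketch of what a script like start-clustered-es.sh could look like. It is not the actual script: the function name write_es_config, its argument order, and the config path in the trailing comments are assumptions made for illustration. Only the YAML content and the exec line come from the description above.

```shell
#!/bin/sh
# Hypothetical sketch of start-clustered-es.sh. The function name, its
# arguments and the target path are assumptions, not the original script.

# Write an elasticsearch.yml for the node whose host IP is $1 into file $2.
write_es_config() {
  ip="$1"
  out="$2"
  cat > "$out" <<EOF
cluster.name: mytest
network.host: "0.0.0.0"
node.name: mytest-${ip}
discovery.seed_hosts:
  - "10.131.128.252:9300"
  - "10.131.129.28:9300"
  - "10.131.105.90:9300"
cluster.initial_master_nodes:
  - mytest-10.131.128.252
  - mytest-10.131.129.28
  - mytest-10.131.105.90
EOF
}

# Inside the container the script would then presumably do something like
# this (the config path is an assumption):
#   write_es_config "${IP}" /usr/share/elasticsearch/config/elasticsearch.yml
#   exec /usr/local/bin/docker-entrypoint.sh eswrapper
```

Only node.name varies per host here; the seed hosts and initial master nodes are the same three entries on every node.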
I tried --network host as the config uses the IP of the host machine. From inside the containers, I can reach port 9200/9300 of the other machines, so it doesn't seem to be a network issue.
Any pointers most welcome…
Best Answer
One idea is to limit transport.profiles.default.port (aka transport.port), or to set -p on docker run to the full default range of 9300-9400.
According to the documentation at https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-transport.html, transport.profiles.default.port defaults to 9300-9400.
Further, the documentation for discovery.seed_hosts notes that the port relates to transport.profiles.default.port: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-settings.html
Hope this suggestion helps. It has been some time since I last formed a cluster, which was on version 6.x and needed some discovery.zen values with Docker.
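A minimal sketch of the first option, assuming the config is generated by a shell script before Elasticsearch starts. The helper name pin_transport_port is made up for illustration, and whether pinning the port actually resolves the discovery problem is untested here.

```shell
# Hypothetical helper: append a fixed transport port to an elasticsearch.yml
# unless the file already sets one. Defaults to 9300 when no port is given.
pin_transport_port() {
  file="$1"
  port="${2:-9300}"
  if ! grep -q '^transport\.port:' "$file"; then
    printf 'transport.port: %s\n' "$port" >> "$file"
  fi
}
```

Called as pin_transport_port /path/to/elasticsearch.yml after the file is written, this would keep the node on 9300 instead of letting it pick any free port in the 9300-9400 range.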