etcd2 fails on CoreOS with a very simple setup

coreos discovery etcd

I want to set up a 3-node cluster running my own discovery service, and I am following this simple guide for static discovery. I know the IP addresses of my 3 machines. Here is the command I use to start etcd2 on the first machine:

etcd2 -name infra0 -initial-advertise-peer-urls http://10.0.0.1:2380 \
  -listen-peer-urls http://10.0.0.1:2380 \
  -listen-client-urls http://10.0.0.1:2379,http://127.0.0.1:2379 \
  -advertise-client-urls http://10.0.0.1:2379 \
  -initial-cluster-token etcd-cluster-1 \
  -initial-cluster infra0=http://10.0.0.1:2380,infra1=http://10.0.0.2:2380,infra2=http://10.0.0.3:2380 \
  -initial-cluster-state new

But it fails and just prints some strange output:

2015/10/20 15:16:50 etcdmain: etcd Version: 2.1.2
2015/10/20 15:16:50 etcdmain: Git SHA: ff8d1ec
2015/10/20 15:16:50 etcdmain: Go Version: go1.4.2
2015/10/20 15:16:50 etcdmain: Go OS/Arch: linux/amd64
2015/10/20 15:16:50 etcdmain: setting maximum number of CPUs to 1, total number of available CPUs is 1
2015/10/20 15:16:50 etcdmain: no data-dir provided, using default data-dir ./infra0.etcd
2015/10/20 15:16:50 etcdmain: the server is already initialized as member before, starting as etcd member...
2015/10/20 15:16:50 etcdmain: listening for peers on http://10.0.0.1:2380
2015/10/20 15:16:50 etcdmain: listening for client requests on http://10.0.0.1:2379
2015/10/20 15:16:50 etcdmain: listening for client requests on http://127.0.0.1:2379
2015/10/20 15:16:50 etcdserver: name = infra0
2015/10/20 15:16:50 etcdserver: data dir = infra0.etcd
2015/10/20 15:16:50 etcdserver: member dir = infra0.etcd/member
2015/10/20 15:16:50 etcdserver: heartbeat = 100ms
2015/10/20 15:16:50 etcdserver: election = 1000ms
2015/10/20 15:16:50 etcdserver: snapshot count = 10000
2015/10/20 15:16:50 etcdserver: advertise client URLs = http://10.0.0.1:2379
2015/10/20 15:16:50 etcdserver: restarting member 7ebe4414520dd95e in cluster 7ef0605c00fad3ab at commit index 3
2015/10/20 15:16:50 raft: 7ebe4414520dd95e became follower at term 128
2015/10/20 15:16:50 raft: newRaft 7ebe4414520dd95e [peers: [], term: 128, commit: 3, applied: 0, lastindex: 3, lastterm: 1]
2015/10/20 15:16:50 etcdserver: starting server... [version: 2.1.2, cluster version: to_be_decided]
2015/10/20 15:16:50 etcdserver: added local member 7ebe4414520dd95e [http://10.0.0.1:2380] to cluster 7ef0605c00fad3ab
2015/10/20 15:16:50 etcdserver: added member 8c9ced5da49597eb [http://10.0.0.2:2380] to cluster 7ef0605c00fad3ab
2015/10/20 15:16:50 etcdserver: added member 992dd2c84a457838 [http://10.0.0.3:2380] to cluster 7ef0605c00fad3ab
2015/10/20 15:16:51 rafthttp: failed to dial 8c9ced5da49597eb on stream MsgApp v2 (dial tcp 10.0.0.2:2380: i/o timeout)
2015/10/20 15:16:51 rafthttp: failed to dial 8c9ced5da49597eb on stream Message (dial tcp 10.0.0.3:2380: i/o timeout)
2015/10/20 15:16:51 rafthttp: failed to dial 992dd2c84a457838 on stream MsgApp v2 (dial tcp 10.0.0.1:2380: i/o timeout)
2015/10/20 15:16:51 rafthttp: failed to dial 992dd2c84a457838 on stream Message (dial tcp 10.0.0.1:2380: i/o timeout)
2015/10/20 15:16:52 raft: 7ebe4414520dd95e is starting a new election at term 128
2015/10/20 15:16:52 raft: 7ebe4414520dd95e became candidate at term 129
2015/10/20 15:16:52 raft: 7ebe4414520dd95e received vote from 7ebe4414520dd95e at term 129
2015/10/20 15:16:52 raft: 7ebe4414520dd95e [logterm: 1, index: 3] sent vote request to 992dd2c84a457838 at term 129
2015/10/20 15:16:52 raft: 7ebe4414520dd95e [logterm: 1, index: 3] sent vote request to 8c9ced5da49597eb at term 129
2015/10/20 15:16:53 rafthttp: failed to write 992dd2c84a457838 on pipeline (dial tcp 10.0.0.1:2380: i/o timeout)
2015/10/20 15:16:53 rafthttp: failed to write 8c9ced5da49597eb on pipeline (dial tcp 10.0.0.3:2380: i/o timeout)
2015/10/20 15:16:53 raft: 7ebe4414520dd95e is starting a new election at term 129
2015/10/20 15:16:53 raft: 7ebe4414520dd95e became candidate at term 130
2015/10/20 15:16:53 raft: 7ebe4414520dd95e received vote from 7ebe4414520dd95e at term 130
2015/10/20 15:16:53 raft: 7ebe4414520dd95e [logterm: 1, index: 3] sent vote request to 8c9ced5da49597eb at term 130
2015/10/20 15:16:53 raft: 7ebe4414520dd95e [logterm: 1, index: 3] sent vote request to 992dd2c84a457838 at term 130
2015/10/20 15:16:54 raft: 7ebe4414520dd95e is starting a new election at term 130
2015/10/20 15:16:54 raft: 7ebe4414520dd95e became candidate at term 131
2015/10/20 15:16:54 raft: 7ebe4414520dd95e received vote from 7ebe4414520dd95e at term 131
2015/10/20 15:16:54 raft: 7ebe4414520dd95e [logterm: 1, index: 3] sent vote request to 8c9ced5da49597eb at term 131
2015/10/20 15:16:54 raft: 7ebe4414520dd95e [logterm: 1, index: 3] sent vote request to 992dd2c84a457838 at term 131
^C2015/10/20 15:16:54 osutil: received interrupt signal, shutting down...
2015/10/20 15:16:54 rafthttp: failed to dial 992dd2c84a457838 on stream MsgApp v2 (net/http: request canceled while waiting for connection)
2015/10/20 15:16:54 rafthttp: failed to dial 992dd2c84a457838 on stream Message (net/http: request canceled while waiting for connection)
2015/10/20 15:16:54 rafthttp: failed to dial 8c9ced5da49597eb on stream MsgApp v2 (net/http: request canceled while waiting for connection)
2015/10/20 15:16:54 rafthttp: failed to dial 8c9ced5da49597eb on stream Message (net/http: request canceled while waiting for connection)

What does this really mean?

Best Answer

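The line "the server is already initialized as member before, starting as etcd member..." indicates that this machine was previously a member of an etcd cluster. etcd2 found that saved state in its data directory and is using it to contact the previous cluster's members and rejoin, instead of bootstrapping a new cluster from the -initial-cluster flags. That is why the log continues with "restarting member ..." followed by dial timeouts and repeated elections.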
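To confirm that leftover state is the cause, you can check whether the data directory already contains a member directory. This is a minimal sketch: the helper name has_member_state is made up for illustration, and the default path ./infra0.etcd is taken from the "no data-dir provided" line in the log above, so adjust it if your setup uses a different data directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of etcd): succeeds if the given etcd data
# directory already holds member state from an earlier run.
has_member_state() {
    data_dir="$1"
    [ -d "$data_dir/member" ]
}

# Assumption: default data dir as printed in the question's log output.
DATA_DIR="${DATA_DIR:-./infra0.etcd}"

if has_member_state "$DATA_DIR"; then
    echo "previous member state found in $DATA_DIR"
else
    echo "no previous member state in $DATA_DIR"
fi
```

If the first message is printed, etcd2 will restart as the old member regardless of the -initial-cluster-state new flag.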

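You can remove /var/lib/etcd2/* to get rid of all traces of this old cluster, then start etcd2 again. That path is where the etcd2 service on CoreOS keeps its data by default. If you start etcd2 by hand as in the question, note the log line "no data-dir provided, using default data-dir ./infra0.etcd": in that case the old state lives in ./infra0.etcd, relative to the directory you ran the command from, and that is the directory to remove.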
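The cleanup could look roughly like the sketch below. The function name reset_etcd_data_dir is hypothetical, and the paths in the usage comments are assumptions based on the log output in the question and the default CoreOS location mentioned above. Note that this deletes all data stored in that etcd member, so only use it on a cluster you intend to rebuild from scratch.

```shell
#!/bin/sh
# Hypothetical helper (not part of etcd): delete an etcd data directory so
# the next start bootstraps a fresh cluster instead of restarting the old member.
reset_etcd_data_dir() {
    data_dir="$1"
    # Refuse an empty argument so a bad variable cannot expand to a dangerous path.
    if [ -z "$data_dir" ]; then
        echo "usage: reset_etcd_data_dir <data-dir>" >&2
        return 1
    fi
    if [ -d "$data_dir" ]; then
        rm -rf -- "$data_dir"
        echo "removed $data_dir"
    else
        echo "nothing to remove at $data_dir"
    fi
}

# Usage when etcd2 was started by hand, as in the question (stop etcd2 first):
#   reset_etcd_data_dir ./infra0.etcd
#
# Usage when etcd2 runs as the CoreOS systemd service:
#   sudo systemctl stop etcd2
#   sudo rm -rf /var/lib/etcd2/*
#   sudo systemctl start etcd2
```

After the directory is gone, rerun the original etcd2 command. The log should no longer contain the "already initialized as member before" line.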
