Can’t set up a 3-node MongoDB replica set

configuration mongodb replication

I followed the instructions in the MongoDB documentation (Replica Sets – Basics) to set up a 3-node replica set. Everything went fine when I initiated the set and added the first node from the primary.

[foo@host-a mongodb]$ bin/mongo localhost
MongoDB shell version: 1.8.2
connecting to: localhost
> rs.initiate()
{
        "info2" : "no configuration explicitly specified -- making one",
        "info" : "Config now saved locally.  Should come online in about a minute.",
        "ok" : 1
}
> rs.add("host-b")
{ "ok" : 1 }

So far so good, but when I try to add the third node as an arbiter, the connection is reset:

myset:PRIMARY> rs.addArb("host-c")
Sun Aug  7 22:57:09 MessagingPort recv() errno:104 Connection reset by peer 127.0.0.1:27017
Sun Aug  7 22:57:09 SocketException: remote:  error: 9001 socket exception [1]
Sun Aug  7 22:57:09 DBClientCursor::init call() failed
Sun Aug  7 22:57:09 query failed : local.$cmd { count: "system.replset", query: {}, fields: {} } to: 127.0.0.1
Sun Aug  7 22:57:09 Error: error doing query: failed shell/collection.js:150
Sun Aug  7 22:57:09 trying reconnect to 127.0.0.1
Sun Aug  7 22:57:09 reconnect 127.0.0.1 ok

As a result, the current primary became a secondary, and host-b was marked as dead even though it is actually still alive.

myset:SECONDARY> rs.status()
{
        "set" : "myset",
        "date" : ISODate("2011-08-08T04:03:23Z"),
        "myState" : 2,
        "members" : [
                {
                        "_id" : 0,
                        "name" : "host-a:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "optime" : {
                                "t" : 1312775799000,
                                "i" : 1
                        },
                        "optimeDate" : ISODate("2011-08-08T03:56:39Z"),
                        "self" : true
                },
                {
                        "_id" : 1,
                        "name" : "host-b",
                        "health" : 0,
                        "state" : 6,
                        "stateStr" : "(not reachable/healthy)",
                        "uptime" : 0,
                        "optime" : {
                                "t" : 0,
                                "i" : 0
                        },
                        "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
                        "lastHeartbeat" : ISODate("2011-08-08T04:03:22Z"),
                        "errmsg" : "still initializing"
                }
        ],
        "ok" : 1
}

How could this happen? I just followed the guide in the documentation. Did I do something wrong? Moreover, I can't do anything on this server now that it is a secondary: it won't let me reconfigure the set from a secondary node, but there is no primary node left to send the command to.

myset:SECONDARY> rs.reconfig({})
{
        "errmsg" : "replSetReconfig command must be sent to the current replica set primary.",
        "ok" : 0
}

Any ideas?

Best Answer

What I would do:

  1. On the member that should stay secondary, set its priority to 0.
  2. On the problem member, check its log and make sure it is listening on the expected port.
  3. Confirm that every member can connect to the problem member on the port it is listening on.
  4. Remove the problem member from the config, then re-add it.
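
Steps 1 and 4 both come down to editing the replica set config document. Here is a minimal sketch of those edits in plain JavaScript. The helper names (copyConfig, withZeroPriority, withoutMember) are hypothetical and not part of MongoDB, and the example config, including the :27017 ports, is an assumption modeled on the hosts in the question rather than the real output of rs.conf().

```javascript
// Hypothetical helpers (not part of MongoDB). They only edit a plain object
// shaped like a replica set config: { _id, version, members: [...] }.

// Copy the config so the original object is left untouched.
function copyConfig(cfg) {
  return {
    _id: cfg._id,
    version: cfg.version,
    members: cfg.members.map(function (m) {
      var c = {};
      for (var k in m) {
        if (Object.prototype.hasOwnProperty.call(m, k)) { c[k] = m[k]; }
      }
      return c;
    })
  };
}

// Step 1: give one member priority 0 so it cannot be elected primary.
function withZeroPriority(cfg, host) {
  var next = copyConfig(cfg);
  next.members.forEach(function (m) {
    if (m.host === host) { m.priority = 0; }
  });
  return next;
}

// Step 4, first half: drop the problem member from the config.
function withoutMember(cfg, host) {
  var next = copyConfig(cfg);
  next.members = next.members.filter(function (m) { return m.host !== host; });
  return next;
}

// Assumed example config; the ports are illustrative.
var cfg = {
  _id: "myset",
  version: 2,
  members: [
    { _id: 0, host: "host-a:27017" },
    { _id: 1, host: "host-b:27017" }
  ]
};

var demoted = withZeroPriority(cfg, "host-b:27017");
var trimmed = withoutMember(cfg, "host-b:27017");
console.log(JSON.stringify(demoted.members));
console.log(JSON.stringify(trimmed.members));
```

In the shell, the edited document would be the argument to rs.reconfig(), and re-adding the member afterwards is the rs.add() call from the question. As the error message in the question says, the reconfig command has to be sent to the current primary, so this only helps once a primary exists again.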

Hth!
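
As for why the set ended up with no primary, one possible reading of the rs.status() output rests on the usual election rule: a member can only be (or stay) primary while it can see a strict majority of the voting members. The sketch below applies that rule to a document shaped like the output in the question. The function name hasMajority is hypothetical, and the sketch assumes every member has exactly one vote and that "health" : 1 means reachable.

```javascript
// Hypothetical helper: given an rs.status()-shaped document, report whether
// the reachable members form a strict majority of the set.
// Assumption: each member carries exactly one vote.
function hasMajority(status) {
  var total = status.members.length;
  var healthy = status.members.filter(function (m) {
    return m.health === 1;
  }).length;
  return healthy > total / 2;
}

// Trimmed copy of the members array from the question's rs.status() output.
var status = {
  members: [
    { name: "host-a:27017", health: 1 },
    { name: "host-b", health: 0 }
  ]
};

console.log(hasMajority(status)); // prints false: 1 of 2 is not a strict majority
```

If that reading is right, host-a stepped down because it could see only itself, which is also why getting host-b reachable again (steps 2 and 3) comes before any reconfiguration.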