I am running a sharded MongoDB cluster in a Kubernetes environment, with 3 shards and 3 instances per shard. For some reason, one of my MongoDB instances has been rescheduled to another machine.

The problem is that when a MongoDB instance is rescheduled to another machine, its replica set config is invalidated, resulting in the error below.

            > rs.status()
            {
                "state" : 10,
                "stateStr" : "REMOVED",
                "uptime" : 2110,
                "optime" : Timestamp(1448462710, 6),
                "optimeDate" : ISODate("2015-11-25T14:45:10Z"),
                "ok" : 0,
                "errmsg" : "Our replica set config is invalid or we are not a member of it",
                "code" : 93
            }

This is the config:

            > rs.config().members
            [
                {
                    "_id" : 0,
                    "host" : "mongodb-shard2-service:27038",
                    "arbiterOnly" : false,
                    "buildIndexes" : true,
                    "hidden" : false,
                    "priority" : 1,
                    "tags" : {

                    },
                    "slaveDelay" : 0,
                    "votes" : 1
                },
                {
                    "_id" : 1,
                    "host" : "shard2-slave2-service:27039",
                    "arbiterOnly" : false,
                    "buildIndexes" : true,
                    "hidden" : false,
                    "priority" : 1,
                    "tags" : {

                    },
                    "slaveDelay" : 0,
                    "votes" : 1
                },
                {
                    "_id" : 2,
                    "host" : "shard2-slave1-service:27033",
                    "arbiterOnly" : false,
                    "buildIndexes" : true,
                    "hidden" : false,
                    "priority" : 1,
                    "tags" : {

                    },
                    "slaveDelay" : 0,
                    "votes" : 1
                }
            ]

And here is a sample of db.serverStatus() from a rescheduled MongoDB instance:

            > db.serverStatus()
                "host" : "mongodb-shard2-master-ofgrb",
                "version" : "3.0.7",
                "process" : "mongod",
                "pid" : NumberLong(8),

For those who want to use the old way of setting up Mongo (using ReplicationControllers or Deployments instead of a PetSet), the problem seems to be a delay in hostname assignment for Kubernetes Services. The solution is to add a 10-second delay in the container entrypoint, before starting the actual mongod:

    - name: mongo-node1
      image: mongo
      # Run through a shell so the delay can precede mongod
      command: ["/bin/sh", "-c"]
      args: ["sleep 10 && mongod --replSet rs1"]
      ports:
        - containerPort: 27017
      volumeMounts:
        - name: mongo-persistent-storage1
          mountPath: /data/db
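
A fixed 10-second sleep may be too short on a slow cluster and needlessly long on a fast one. As a rough sketch of an alternative, the entrypoint could poll until a check succeeds and only then start mongod. The `wait_until` helper below is hypothetical (it is not part of Mongo or Kubernetes). The usage line assumes that `getent` is available in the image and that the Service name resolving is a good enough signal that the hostname has been assigned. Neither assumption has been verified here.

```shell
#!/bin/sh
# wait_until ATTEMPTS DELAY CMD...
# Hypothetical helper: retry CMD until it exits 0.
# Returns 0 as soon as CMD succeeds, or 1 after ATTEMPTS failed tries.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Run the check quietly; only its exit status matters.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Possible use in the container entrypoint (assumption: the Service name
# from the replica set config resolves once the hostname is assigned):
#   wait_until 30 1 getent hosts mongodb-shard2-service && exec mongod --replSet rs1
```

With this approach the container waits at most 30 seconds in the example above, but starts mongod as soon as the check passes. If the check never passes, the entrypoint exits non-zero and Kubernetes restarts the container.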

related discussion:

