MongoDB Replica-Set with Replication Lag on one node only

database-replication, lag, mongodb, replication

We are seeing strange behaviour in our MongoDB replica set, a setup of 3 nodes (all Xeon quad-core-class CPUs; 16GB of RAM for one, 24GB for the other two nodes).
The node with less RAM is a normal secondary with priority 0; the other two have priority 1. Recently we have been seeing a replication lag of about 60 seconds every 3 to 4 hours, which disappears by itself after 2-3 minutes (according to our Nagios checks).

We have almost no traffic on those machines, just some databases with a size of 0.3GB and one of 5GB. We also have one collection with about 65000 entries, but it has an id index.

The strange thing is that the 16GB secondary has no lag; only the secondary among the two larger machines does. I just changed that one to be primary to see if the old primary (now secondary) also shows this behaviour.

Does anyone know what we can do or check? Because we have no clue.

I checked the load and processes of those machines, the network connectivity and routing, and the disk states – everything's fine.

Best Answer

A few quick checks:

Are you running on 2.0 or below? Replication got a major overhaul in 2.2
Do you have any capped collections? A missing index on _id in a capped collection can cause this kind of lag
You mention that the hosts are not too busy - if you have gaps in your new ops, the math used to calculate lag can falsely report lag when no ops are happening
How are you calculating the lag? I would definitely try to confirm any lag from the shell - last optime from the entries in rs.status() would be a good start
Double check the network side of things: latency spikes and/or intermittent packet loss could cause this and be transient enough to be hard to detect (take a look at netstat --statistics before and after a lag spike, for example, and see if retransmits or errors are increasing)
If you are running 2.2, see if switching the host the lagging secondary is syncing from helps. The current sync source is revealed, somewhat confusingly, by the syncingTo field in rs.status(), and it can be changed using the rs.syncFrom() command.
If it's not there already, get the set into MMS and see if anything is spiking on/around the same time as the lag spike to point you in the right direction.
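
To confirm the lag from rs.status() rather than from a monitoring plugin, something like the following could be used. This is only a minimal sketch: replicationLagSeconds and sampleStatus are made-up names, the sample values are hypothetical, and it assumes every member carries stateStr and an optimeDate that is a JavaScript Date. It is written in modern JavaScript, so it runs under Node.js; in a current shell you would pass it the result of rs.status() instead of the sample object.

```javascript
// Sketch: seconds each SECONDARY is behind the PRIMARY, computed from an
// rs.status()-style object. Assumes members have stateStr and optimeDate (Date).
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) {
    throw new Error("no PRIMARY in replica set status");
  }
  const lag = {};
  for (const m of status.members) {
    if (m.stateStr === "SECONDARY") {
      // Difference between the last op applied on the primary and on this member.
      lag[m.name] = (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000;
    }
  }
  return lag;
}

// Hypothetical sample data, not taken from the poster's replica set.
const sampleStatus = {
  members: [
    { name: "node-a:27017", stateStr: "PRIMARY", optimeDate: new Date("2014-03-10T10:42:26Z") },
    { name: "node-b:27017", stateStr: "SECONDARY", optimeDate: new Date("2014-03-10T10:41:26Z") },
    { name: "node-c:27017", stateStr: "SECONDARY", optimeDate: new Date("2014-03-10T10:42:26Z") },
  ],
};

console.log(replicationLagSeconds(sampleStatus));
```

Comparing each secondary's optime with the primary's optime, rather than with the wall clock, should keep a quiet set from looking lagged just because nothing has been written for a while.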

If, after all that, you still don't know what's causing this, then it may be beyond answering on serverfault in a reasonable way (would need to look at logs, stats etc.) - I'd recommend the mongodb-user Google group as the next step.

MS Clustering

The decision of where to put the public, heartbeat, inter-node communication, and quorum drive is significant. Cluster architecture also makes a difference; you pick different quorum options if the two nodes are in adjacent racks than if they are in completely different datacenters.

Put the heartbeat on the same interface/subnet as the public interface

This theory holds that if you lose your public interface, you want the heartbeat to fail because this node is effectively down to users.

Put the heartbeat on its own private interface/subnet

This theory holds that something outside of the cluster is arbitrating who is doing what role, and unnecessary node-death is to be avoided.

Put the WFS on the heartbeat network

If the two nodes are in the same overall network (the same set of switches is supporting the non-public networks for both nodes) then putting the WFS on the heartbeat network doesn't introduce any new vulnerabilities.

If the two nodes are in different network fault domains (such as different datacenters), this is a bad idea. The heartbeat network provides the 'node majority' quorum option, and the WFS provides the 'File Share Majority' quorum option. You really want both options to be in separate fault domains.

Your revised diagram makes sense if both nodes are in the same datacenter, though I myself would put the heartbeat on the public side.

MongoDB

MongoDB is a bit simpler. With even numbers of nodes, you absolutely want a third node to act as tie-breaker. They're pretty clear about that. However, your diagram states:

Up to 12 replica members (7 can vote).

7 is an odd number. You don't require an Arbiter.
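
The arithmetic behind the tie-breaker advice can be sketched as follows, assuming the usual rule that an election needs a strict majority of the voting members. The function names are made up for illustration.

```javascript
// Sketch: majority arithmetic for a replica set with `voters` voting members.
function majorityNeeded(voters) {
  return Math.floor(voters / 2) + 1;
}

// How many voting members can be lost while a majority can still be formed.
function failuresTolerated(voters) {
  return voters - majorityNeeded(voters);
}

for (const voters of [2, 3, 4, 7]) {
  console.log(
    voters + " voters: majority " + majorityNeeded(voters) +
    ", tolerates " + failuresTolerated(voters) + " failure(s)"
  );
}
```

Two voters tolerate no failures and four tolerate only one, the same as three, which is why an even-sized set gains from a tie-breaker while a set with 7 voters does not need one.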

Unlike Microsoft clusters, Mongo's cluster voting doesn't care about multiple avenues of network to break voting deadlocks. Because of this, separate arbitration and cluster-internal networks do not provide any meaningful increase in robustness. The only reason you'd want a separate arbitration network is if replication traffic was expected to be so heavy that election-packets (the heartbeat, actually) would get pushed so far down the stack that it would miss the 10 second timeout.

When does the replication from primary to secondary happen in mongodb

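Mongo replicates continuously: secondaries follow the primary's oplog and normally apply each write almost immediately, so if changes are not showing up there is something wrong with your replica set.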

MongoDB replica sets are something that you need to get right from when they are first set up. If they aren't set up correctly, they can be difficult to fix.

Configuration of the Replica Set should (normally) be done from the master only. If your Set isn't live yet, the best option may be to re-create it.

Also, not sure what robomongo is, but you're probably better off using the native mongo client to find out what is going on.

The rs.status() command should give you output like this:

rs0:SECONDARY> rs.status()
{
"set" : "rs0",
"date" : ISODate("2014-03-10T10:42:27Z"),
"myState" : 2,
"syncingTo" : "mongo-master:27017",
"members" : [
    {
        "_id" : 0,
        "name" : "mongo-master:27017",
        "health" : 1,
        "state" : 1,
        "stateStr" : "PRIMARY",
        "uptime" : 3008469,
        "optime" : Timestamp(1394448146, 1),
        "optimeDate" : ISODate("2014-03-10T10:42:26Z"),
        "lastHeartbeat" : ISODate("2014-03-10T10:42:26Z"),
        "lastHeartbeatRecv" : ISODate("2014-03-10T10:42:26Z"),
        "pingMs" : 1
    },
    {
        "_id" : 3,
        "name" : "mongo-slave3:27017",
        "health" : 1,
        "state" : 2,
        "stateStr" : "SECONDARY",
        "uptime" : 3012206,
        "optime" : Timestamp(1394448146, 1),
        "optimeDate" : ISODate("2014-03-10T10:42:26Z"),
        "self" : true
    },
    {
        "_id" : 4,
        "name" : "mongo-slave4:27017",
        "health" : 1,
        "state" : 2,
        "stateStr" : "SECONDARY",
        "uptime" : 890533,
        "optime" : Timestamp(1394448146, 1),
        "optimeDate" : ISODate("2014-03-10T10:42:26Z"),
        "lastHeartbeat" : ISODate("2014-03-10T10:42:26Z"),
        "lastHeartbeatRecv" : ISODate("2014-03-10T10:42:26Z"),
        "pingMs" : 0,
        "syncingTo" : "mongo-master:27017"
    }
],
"ok" : 1
}
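
As a rough way to read output like the above, the checks one would do by eye can be sketched in code. This is an illustration only: findReplicaSetProblems, healthyStatus, and brokenStatus are hypothetical names, the sample objects are trimmed to the fields the function uses, and the "mongo-slave5" member is invented. It is modern JavaScript and runs under Node.js as written.

```javascript
// Sketch: list problems visible in an rs.status()-style object.
// Assumes each member has name, health (1 = reachable) and stateStr.
function findReplicaSetProblems(status) {
  const problems = [];
  const primaries = status.members.filter((m) => m.stateStr === "PRIMARY");
  if (primaries.length !== 1) {
    problems.push("expected exactly 1 PRIMARY, found " + primaries.length);
  }
  const normalStates = ["PRIMARY", "SECONDARY", "ARBITER"];
  for (const m of status.members) {
    if (m.health !== 1) {
      problems.push(m.name + " is unreachable (health " + m.health + ")");
    } else if (!normalStates.includes(m.stateStr)) {
      problems.push(m.name + " is in state " + m.stateStr);
    }
  }
  return problems;
}

// Trimmed copy of the healthy output shown above.
const healthyStatus = {
  members: [
    { name: "mongo-master:27017", health: 1, stateStr: "PRIMARY" },
    { name: "mongo-slave3:27017", health: 1, stateStr: "SECONDARY" },
    { name: "mongo-slave4:27017", health: 1, stateStr: "SECONDARY" },
  ],
};

// Hypothetical broken set: no primary, one member still recovering.
const brokenStatus = {
  members: [
    { name: "mongo-slave3:27017", health: 1, stateStr: "SECONDARY" },
    { name: "mongo-slave5:27017", health: 1, stateStr: "RECOVERING" },
  ],
};

console.log(findReplicaSetProblems(healthyStatus));
console.log(findReplicaSetProblems(brokenStatus));
```

A set that reports one PRIMARY, healthy SECONDARY members, and matching optimes, as in the output above, is replicating normally.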

Related Solutions

Sql-server – How to network this Windows Failover Cluster and MongoDB Replica Set? (diagram inside)
