Java – Mongo Replica Set behind Firewall

java mongodb

Suppose you run a current version (3.2) of MongoDB as a three-node replica set in your network:

mongo1.local
mongo2.local
mongoarbiter.local

These nodes should now be reachable from the public internet (restricted by a firewall). mongo1 and mongo2 each get a VIP on the firewall and a valid A record:

mongo1.example.com
mongo2.example.com

The arbiter is not exposed.

Some client implementations (Python) work fine if you pass the external DNS names in the connection string. Others (Java) fail to connect, because the replica set only knows its internal names. The client parses the list of nodes reported by the replica set, notices that the external name it connected to is not in that list, and fails:

Monitor thread successfully connected to server with description ServerDescription{address=mongo1.example.com:27017, type=REPLICA_SET_PRIMARY, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 0, 14]}, minWireVersion=0, maxWireVersion=3, maxDocumentSize=16777216, roundTripTimeNanos=5305689, setName='mongo-rs', canonicalAddress=mongo1.local:27017, hosts=[mongo2.local:27017, mongo1.local:27017], passives=[], arbiters=[mongo3.local:27017], primary='mongo1.local:27017', tagSet=TagSet{[]}, electionId=5821da77ccc118202cd2b75d, setVersion=3}

Is there any solution to this other than messing with /etc/hosts on the client's system?

BTW: this does the trick with the js client lib but looks a bit dirty as well:

replSet.connectWithNoPrimary

Best Answer

Official MongoDB drivers implement the Server Discovery and Monitoring (SDAM) specification, which is available on GitHub in the mongodb/specifications repository. The SDAM spec goes into more detail on the expected behaviour of drivers and the rationale behind it.

The current expectation is that clients will always use the hostnames listed in the replica set config, not the seed list provided in a connection string. The primary motivation for doing so is to enable automatic failover and reconfiguration based on an agreed replica set configuration (which includes hostnames and ports).
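To illustrate why the external names drop out, here is a rough, simplified model of that behaviour. It is not the driver's actual code: the class and method names are made up for this sketch, and the member list is taken from the hosts and arbiters in the log line in the question.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of replica set discovery (not driver code).
class SeedListDiscoverySketch {

    /**
     * Returns the addresses a client keeps monitoring once a primary has
     * reported the replica set members: seeds the set does not list are
     * dropped, and every listed member is added.
     */
    static Set<String> monitoredAfterPrimaryResponse(List<String> seeds,
                                                     List<String> reportedMembers) {
        Set<String> monitored = new LinkedHashSet<>(seeds);
        monitored.retainAll(reportedMembers); // drop seeds unknown to the set
        monitored.addAll(reportedMembers);    // add members learned from the set
        return monitored;
    }

    public static void main(String[] args) {
        // Seed list as given in the connection string (external names).
        List<String> seeds = Arrays.asList(
                "mongo1.example.com:27017", "mongo2.example.com:27017");
        // Members as reported by the primary (internal names).
        List<String> reported = Arrays.asList(
                "mongo2.local:27017", "mongo1.local:27017", "mongo3.local:27017");

        // Only the internal names remain, which the client cannot resolve.
        System.out.println(monitoredAfterPrimaryResponse(seeds, reported));
    }
}
```

In this model the client ends up with only the `.local` names, which an external client cannot resolve, so the connection fails even though the initial contact succeeded.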

Is there any solution to this other than messing with /etc/hosts on the client's system?

If you do not require failover you could connect to a single server rather than using a replica set connection. A standalone/direct connection should not implement any server discovery.
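A minimal sketch of what such a direct connection string could look like, using only the standard library. The helper name is hypothetical, and whether `directConnection=true` is honoured depends on the driver version; I am assuming that older drivers instead treat a single host without a `replicaSet` option as a direct connection.

```java
// Sketch only: builds a single-host connection string; names are illustrative.
class DirectConnectionUriSketch {

    /** Builds a URI naming exactly one host and no replicaSet option. */
    static String directUri(String host, int port) {
        // Assumption: newer drivers understand directConnection=true; older
        // ones rely on the single host and the absence of replicaSet.
        return "mongodb://" + host + ":" + port + "/?directConnection=true";
    }

    public static void main(String[] args) {
        String uri = directUri("mongo1.example.com", 27017);
        System.out.println(uri);
        // With the MongoDB Java driver on the classpath, the URI would then be
        // handed to the driver, for example:
        // com.mongodb.client.MongoClient client =
        //         com.mongodb.client.MongoClients.create(uri);
    }
}
```

Keep in mind that a client connected this way only talks to that one host: writes succeed only while it is the primary, and nothing fails over automatically.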

However, if you are connecting to anything other than a standalone server, there are currently no workarounds other than adjusting hostname resolution on the client to match the replica set config, or extending your network perimeter (e.g. using a VPN).

A relevant feature suggestion to upvote/watch is: SERVER-1889: Support different networks / nics for client & replication traffic. This could allow separation of the internal network communication for the replica set from the client connections.
