An EBS volume can only be attached to a single instance at a time, so you're going to have to detach it when moving. I'm assuming that you want to avoid data-intensive processes like creating an EBS snapshot or rsyncing the data to a new instance. I'm also assuming that you're using RAID1.
The safest option will require a couple of minutes of downtime. You would start up a new instance, and install and configure the necessary software (e.g. an NFS server). Then, on the old instance, unmount the filesystem, stop the array and detach the two EBS volumes. Next, attach the EBS volumes to the new instance, start the array and mount the filesystem. Finally, get the webservers to mount NFS from the new instance. Starting the array should just be a case of running the mdadm command you describe, but I'd definitely test this first.
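A rough sketch of that first option, assuming the array is /dev/md0 mounted at /mnt/data, the devices appear as /dev/sdf and /dev/sdg, and the volume/instance IDs are placeholders; adjust all of these to your setup:

```shell
# --- On the old instance: stop NFS exports, unmount, stop the array ---
umount /mnt/data
mdadm --stop /dev/md0

# --- From a machine with AWS credentials: move both volumes over ---
# (volume IDs, instance ID and device names are placeholders)
aws ec2 detach-volume --volume-id vol-aaaa1111
aws ec2 detach-volume --volume-id vol-bbbb2222
aws ec2 attach-volume --volume-id vol-aaaa1111 --instance-id i-newinstance --device /dev/sdf
aws ec2 attach-volume --volume-id vol-bbbb2222 --instance-id i-newinstance --device /dev/sdg

# --- On the new instance: assemble the array and mount it ---
mdadm --assemble /dev/md0 /dev/sdf /dev/sdg
mount /dev/md0 /mnt/data
```

The downtime window is everything between the umount and the final mount, plus the time for the webservers to remount NFS.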
The second option potentially involves less downtime (assuming that you can operate in read-only mode for a while), but is more dangerous. You'd start the new instance as above. On the old instance, remount the filesystem read-only. Then fail one of the RAID devices, detach this EBS volume and attach it to the new instance. Start the array in degraded mode on the new instance, mount the filesystem, and get the webservers to mount NFS from the new instance (the site should be fully available at this stage). Then stop the array on the old instance, detach the remaining EBS volume, attach it to the new instance, and add it to the array. This may, however, trigger a full resync, so again, test this first.
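The same placeholders as before, sketching the degraded-mode option; device names and volume IDs are assumptions, not your actual values:

```shell
# --- On the old instance: go read-only, then fail and remove one mirror half ---
mount -o remount,ro /mnt/data
mdadm /dev/md0 --fail /dev/sdg --remove /dev/sdg

# Move that volume to the new instance (IDs are placeholders)
aws ec2 detach-volume --volume-id vol-bbbb2222
aws ec2 attach-volume --volume-id vol-bbbb2222 --instance-id i-newinstance --device /dev/sdg

# --- On the new instance: start the array degraded and serve from it ---
mdadm --assemble --run /dev/md0 /dev/sdg
mount /dev/md0 /mnt/data

# --- Later, on the old instance: release the second volume ---
umount /mnt/data
mdadm --stop /dev/md0
# ...detach/attach vol-aaaa1111 as above, then on the new instance:
mdadm /dev/md0 --add /dev/sdf      # may trigger a full resync
```

The `--run` flag is what lets mdadm start the array with only one of its two members present.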
Whatever you do, make sure to test the process first so you know exactly how to perform it, and make sure you have backups in case it goes horribly wrong. (Also, consider storing your media on S3.)
First, if you take a snapshot, it will include the oplog - the oplog is just a capped collection living in the local database. Snapshots will get you back to a point in time, and assuming you have journaling enabled (it is on by default), you do not need to do anything special for the snapshot to function as a backup.
The only absolute requirement is that the EBS snapshot has to be recent enough to fall within your oplog window - that is, the last (most recent) operation recorded in the snapshot's oplog must still be present in the oplog of the current primary, so that the two can find a common point. If that is the case, it will work something like this:
- You restore a secondary from an EBS snapshot backup
- The mongod process starts, looks for (and applies) any relevant journal files
- Next, the secondary connects to the primary and finds a common point in the two oplogs
- Any subsequent operations from the primary are applied on the RECOVERING secondary
- Once the secondary catches up sufficiently, it moves to the SECONDARY state and the backup is complete
If the snapshot is not recent enough, then it can be discarded - without a common point in the oplog, the secondary will have to resync from scratch anyway.
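To check whether a snapshot still falls inside the oplog window, you can compare the snapshot's age against the primary's oplog timestamps; connection details here are assumptions (a mongod on the default localhost port):

```shell
# Print the oplog size, window length, and first/last event times
# on the current primary
mongo --eval 'db.printReplicationInfo()'
```

If the snapshot was taken after the "oplog first event time" reported here, the restored secondary should be able to find a common point and catch up; if not, it will need a full resync.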
To answer your specific questions:
Do I need to record oplogs and use those in conjunction to restore after a failure?
As explained above, if you snapshot, you are already backing up the oplog.
Should I spin up another instance within the replica set specifically for backups and snapshot that vs. taking snapshots of primary and secondary? If so, we're back to the oplog issue aren't we?
There's no oplog issue beyond the common point/window one I mentioned above. Some people do choose to have a secondary (usually hidden) for this purpose, to avoid adding load to a normal node. Note: even a hidden member gets a vote, so if you added one for backup purposes you could remove the arbiter from your config; you would still have 3 voting members.
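Adding such a hidden backup member looks something like this; the hostname is a placeholder, and hidden members must have priority 0:

```shell
# Add a hidden, priority-0 member to the replica set for backup duty
# (run against the primary; depending on your MongoDB version you may
# also need to supply an explicit _id in the member document)
mongo --eval 'rs.add({host: "backup.example.com:27017", priority: 0, hidden: true})'
```

Being hidden, it never receives client reads and can never become primary, but it still replicates everything and still votes.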
Should I snapshot each replica volume and rely on the replica set completely to cover the time between failure and the last snapshot?
Every member of a replica set is intended to be identical - the data is the same, any secondary can become primary etc. - these are not slaves, every replica set member contains the full oplog and all the data.
So, taking multiple snapshots (assuming you trust the process) is going to be redundant (of course you may want that redundancy). And yes, the whole intention of the replica set functionality is to ensure that you don't need to take extraordinary measures to use a secondary in this way (with the caveats above in mind, of course).
I guess the short answer is: you cannot do this.
If you want to do this, you need to run your own MySQL instances instead of using RDS. It's a lame answer, and I am surprised that Amazon has decided not to support multi-region replication, given their alleged dedication to scalable, redundant and fault-tolerant infrastructure.
Oh well :\