AWS EC2 Mailserver Failover Strategies done right

amazon ec2, amazon s3, amazon-web-services, email-server, failover

I have been researching this topic hard for the last few days and want to discuss it with a few specific questions. I did not find a suitable thread here that covers my needs and, especially, that is reasonably current. Most posts about this topic date from around 2010, which, I guess, was the last time AWS had a big failure (a whole US region was down, if I remember right).

The current state:

We're running a mail server based on Ubuntu with Postfix/Dovecot/Horde, which reads all mail-related configuration from a MySQL database. It runs as an EC2 instance with an EBS volume where the OS and, currently, also the mails are stored. So far so good, but we're a startup, not a private person who needs this server: it is a mail service for our customers, so it is critical and very important for us. After a few failures and downtimes in the first year, I want to dramatically improve the setup, so I thought about "redundancy", basically.

The requirement:

The server must be "redundant" in some way: the failure of a single EC2 instance should no longer break the whole service.

My research so far and the options I see:

  • Copy the instance into another region, for example, and build "real" redundancy. It's a little old-fashioned, but that's what I learned back in school: use the new server as a backup MX, configured through a second MX entry in DNS with lower priority (that is, a higher preference number). The problem here is data redundancy: I would need rsync and database replication, for example, to keep both servers in sync. This is not the option I want to implement because it can get super tricky…

  • A service-driven solution that uses the AWS offerings properly. I would use RDS for the database and S3 for storage. If all the mails are in the storage cloud (S3) and all the configuration data is in the database cloud (RDS), the instance itself becomes super flexible. That makes it possible to run several instances of that type at the same time, so I can use EC2's ELB to handle the load, start new instances and detect failures when an instance dies! My critical data spots, the database and the mail storage, would then be service-driven, so I would no longer have to think about failover, downtime and, most importantly, scalability. This is by far the best solution I can imagine, but I see some serious problems.
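For the backup-MX option in the first bullet, it may help to see how a sending server typically picks between two MX entries. This is only a minimal sketch: the hostnames and preference values are hypothetical, and `delivery_order` is an illustrative helper, not part of any mail software. The point is that "lower priority" in DNS means a higher preference number, so the backup is only tried after the primary.

```python
def delivery_order(mx_records):
    """Return MX hostnames in the order a sender would try them.

    mx_records is a list of (preference, hostname) tuples.
    The lowest preference number is tried first. Real senders
    usually randomize among hosts with equal preference; this
    sketch simply keeps their input order.
    """
    return [host for _, host in sorted(mx_records, key=lambda r: r[0])]


# Hypothetical records: primary in one region, backup in another.
records = [
    (20, "mx-backup.example.com"),   # backup: higher number, tried second
    (10, "mx-primary.example.com"),  # primary: lower number, tried first
]

for host in delivery_order(records):
    print(host)
```

Note that this only decides where incoming mail is delivered. It does nothing to keep mailboxes and the config database in sync between the two servers, which is exactly the tricky part described above.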

Final Questions:

  • I have never seen a good integration of S3 directly into the Ubuntu filesystem. In my experience, after a few days of continuous running the mount can suddenly disappear for no apparent reason, and when the same S3 "drive" is mounted on multiple machines, the data replicates between them very, very slowly. I can understand that because it's a global cloud service, but how should this work? Imagine multiple running mail server instances, each using the same S3 drive: the mail data would have to be replicated instantly! So how can we implement a service-driven mail storage that really works with AWS? Has anyone ever built something like this? Everywhere I just read "yeah, you have to use AWS services to solve that", but I can't find real implementations of it for mail.

  • Would an EBS-based solution be better? Each running instance would have its own dedicated drive for storage, highly available and fast, and again I would set up rsync to keep them in sync with each other. The big drawback here is the huge cost: each instance must have a huge EBS volume because every one of them has to store ALL mails, which is nonsense ^^
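The cost concern in the EBS bullet is simple arithmetic: with a full copy per instance, provisioned storage grows with the number of instances, while a shared store holds one logical copy. A rough sketch with made-up numbers (the 500 GB and the 3 instances are assumptions for illustration, not figures from our setup):

```python
def replicated_storage_gb(mail_gb, instances):
    """Total storage when every instance keeps ALL mails on its own volume."""
    return mail_gb * instances


def shared_storage_gb(mail_gb):
    """Total storage when all instances use one shared mail store."""
    return mail_gb


MAIL_GB = 500    # hypothetical total mail volume
INSTANCES = 3    # hypothetical number of mail server instances

print("per-instance copies:", replicated_storage_gb(MAIL_GB, INSTANCES), "GB")
print("shared store:", shared_storage_gb(MAIL_GB), "GB")
```

Every additional instance adds another full copy in the first case, so scaling out for load or failover directly multiplies the storage bill.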

Is there any other failover scenario with AWS that I don't know about yet? Sorry for the long text, but I wanted to share all my thoughts so far… Thanks for reading, if anyone does! 🙂

Best Answer

OK, I don't have much time; the business does not wait for geeks and solution-finding :) So I searched further on my own and found a really, really interesting project that, for me personally, fills some of the biggest gaps in AWS: https://objectivefs.com/

With it, you have shared directories, synced within seconds, usable from all your running instances simultaneously and, best of all, stored on S3 and directly integrated into your Unix filesystem!

I'm not sure, but to me it seems to be the long-wanted missing piece for all scenarios like this, and I find it funny, once again, that it takes a third-party development to "properly" solve a scenario that is promoted so heavily by the AWS folks... :)

I will give it a try, and I'm almost sure that it will solve my most-wanted scenario perfectly...
