I have set up Elastic Load Balancing with 5 EC2 instances registered with the load balancer. Users upload their data (images) to our website, and we store these images on network-attached storage (NAS). We have the NAS mounted on all the instances.
We are planning to introduce Amazon Auto Scaling and also to move away from network-attached storage.
- Is GlusterFS a good solution to share data across all the instances in the Auto Scaling group?
- Does Gluster ensure there is no loss of data?
- What will happen if all the instances in the Auto Scaling group are terminated? Will I lose user data?
- What happens if a user uploads an image and the server processing the request goes down?
- Is there an impact on IO if clients go down? (What exactly does Gluster do?)
Best Answer
Possibly. The only way you'll get a definitive answer is with your own tests, however. In the past, I've set up a 4-node web server cluster on Linode instances, using GlusterFS to distribute/share the assets directory of images and so on.
We found 2 main problems with this approach:
Purely anecdotal evidence, but I'd not run GlusterFS on a virtual machine with SAN/shared storage ever again.
It can. In Gluster 3.0, there's better recognition of "replication pools", where you can define how many copies of the data exist throughout the cluster. Setting a replication level of 2 means that there are 2 copies of the data across the entire cluster. This effectively halves your storage capacity, but means that you've got greater resilience to node failure.
Importantly, it also means that you have to add more nodes as multiples of the replication level, in this case, pairs of nodes.
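The capacity and node-count arithmetic above can be sketched in a few lines of Python. This is only an illustration, not anything Gluster ships: the function name `usable_capacity_gb` and the assumption that bricks are grouped into replica sets in the order given are mine, and the brick sizes are made-up numbers.

```python
def usable_capacity_gb(brick_sizes_gb, replica):
    """Estimate usable capacity of a replicated volume (illustrative only).

    Assumes bricks are grouped into replica sets in the order listed,
    and that each set can hold no more than its smallest brick.
    """
    if replica < 1:
        raise ValueError("replica count must be at least 1")
    if len(brick_sizes_gb) % replica != 0:
        # Nodes have to be added in multiples of the replication level.
        raise ValueError("brick count must be a multiple of the replica count")

    total = 0
    for i in range(0, len(brick_sizes_gb), replica):
        replica_set = brick_sizes_gb[i:i + replica]
        # Every brick in the set stores the same data, so only one
        # brick's worth of space is usable per set.
        total += min(replica_set)
    return total


# Hypothetical cluster: four 100 GB bricks, replication level 2.
print(usable_capacity_gb([100, 100, 100, 100], replica=2))
```

With four equal bricks and a replication level of 2, the raw 400 GB yields 200 GB usable, and a fifth node on its own would be rejected until a sixth is added to pair with it.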
If the instances are only using ephemeral instance storage, yes. If they're EBS-backed, or using mounted EBS volumes, then no.
That greatly depends on how your application is designed. I strongly suspect that the user would lose their data (almost certainly, in a naively architected solution).
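As one example of what "how your application is designed" might mean, here is a minimal Python sketch that only reports an upload as saved once the bytes have been flushed to the shared mount and renamed into place. The function name `save_upload` and its parameters are hypothetical, and it assumes the destination directory is on the shared filesystem. It does not stop a request from being lost if the server dies mid-upload; it only avoids leaving a half-written file under the final name, so the client knows to retry.

```python
import os
import tempfile


def save_upload(data: bytes, dest_dir: str, filename: str) -> str:
    """Write an upload to dest_dir and return its final path (illustrative sketch)."""
    os.makedirs(dest_dir, exist_ok=True)
    # Write to a temporary file in the same directory so the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".upload-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the bytes to storage
        final_path = os.path.join(dest_dir, filename)
        os.replace(tmp_path, final_path)  # move into place under the real name
    except BaseException:
        # Clean up the partial file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path
```

The web application would respond with success only after `save_upload` returns, so a crash before that point looks like a failed request to the user rather than a silently truncated image.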
See above. If a client goes down because of backend storage problems, it can easily destroy the performance of the entire cluster.