Multiple AWS EC2 Instances Running Auto Scaling and Distributed Memcache

amazon-ec2, autoscaling, load-balancing, memcache, memcached

I'm planning to use distributed Memcache across a series of Linux web servers on Amazon EC2. These instances are in an Auto Scaling group, so their number will increase and decrease with load.

I have used this post to guide the initial setup.

best practice with memcache/php – multi memcache nodes

And another link referenced in a reply:

http://techgurulive.com/2009/07/22/a-brief-to-memcached-hash-types/

I'm a little unsure how the web servers' local AWS IPs will be updated in the list of nodes within the Memcache client code as servers are added and removed.

I am currently using the PECL Memcache extension as the PHP client.

Does anyone know the best implementation to manage this with the AWS environment?

Thanks in advance for any suggestions.

Dave

Best Answer

It sounds like you are designing a fairly large system to handle quite a bit of load. If so, you likely want to have separate instances for Memcache and web nodes, as their performance characteristics are quite different. By separating the two you can use an instance type more suited to each.

Memcache uses almost no CPU under normal circumstances, so using a high-memory instance makes sense. Conversely, web servers are generally processor intensive, so use high-CPU instances for them.

Keep in mind that memcached itself is very simple. It sits and waits for a command saying "get me this key" or "set this key to this value" (there are a few others, but that's the basics). All of the hashing and server selection is done purely in the client memcached library.
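To make that concrete, here's a sketch (in Python, purely for illustration) of the kind of modulo hashing the PECL Memcache client's default "standard" strategy performs. The node addresses are made up; the point is that the server daemons never see this logic, so the node list only exists on the client side:

```python
import zlib

def pick_server(key, servers):
    """Map a key to one server from the list using simple modulo
    hashing -- roughly what a 'standard' hash strategy does.
    The memcached daemons themselves know nothing about each other."""
    h = zlib.crc32(key.encode())
    return servers[h % len(servers)]

# Hypothetical internal IPs of three memcached nodes.
nodes = ["10.0.0.11:11211", "10.0.0.12:11211", "10.0.0.13:11211"]
print(pick_server("session:user42", nodes))
```

Because the mapping depends on `len(servers)`, every client must see the same node list in the same order, or the same key will be read from and written to different nodes.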

With this in mind, auto-scaling of memcached is quite ambitious, although it certainly can be done. If you are constantly bringing up and decommissioning servers, you will either have to accept that some cached data will periodically be lost, presumably putting more load on your databases, or ensure that all data stored in a memcached instance is first copied to another instance.
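The hash strategy you pick determines how much cache you lose on each scaling event. With plain modulo hashing, changing the node count remaps most keys; with a consistent-hash ring (the "consistent" strategy the linked hash-types article discusses, available in PECL Memcache via the `memcache.hash_strategy` ini setting), adding one node to n only remaps roughly 1/(n+1) of keys. A minimal ketama-style sketch in Python, with made-up node addresses, shows the effect:

```python
import bisect
import hashlib

def _hash(s):
    # 32-bit hash derived from MD5, as ketama-style clients commonly use.
    return int(hashlib.md5(s.encode()).hexdigest()[:8], 16)

class HashRing:
    """Minimal consistent-hash ring: each node owns many points on a
    circle, and a key maps to the next point clockwise from its hash."""
    def __init__(self, nodes, replicas=100):
        self.ring = sorted((_hash(f"{n}#{i}"), n)
                           for n in nodes for i in range(replicas))
        self.keys = [h for h, _ in self.ring]

    def node_for(self, key):
        idx = bisect.bisect(self.keys, _hash(key)) % len(self.ring)
        return self.ring[idx][1]

nodes = [f"10.0.0.{i}:11211" for i in range(1, 5)]    # 4 nodes
keys = [f"key-{i}" for i in range(1000)]

before = {k: HashRing(nodes).node_for(k) for k in keys}
grown = HashRing(nodes + ["10.0.0.5:11211"])           # scale out to 5
moved = sum(1 for k in keys if grown.node_for(k) != before[k])
print(f"{moved} of {len(keys)} keys remapped")  # roughly one key in five
```

Only the keys that land on the new node move; everything else stays cached. That limits, but does not eliminate, the database load spike on each scale-out or scale-in event.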