Disclaimer: You'd be mad to listen to me without doing a tonne of testing AND getting a 2nd opinion from someone qualified - I'm new to this game.
The efficiency improvement idea proposed in this question won't work. The main mistake I made was to think that the order in which the memcached stores are defined in the pool dictates some kind of priority. This is not the case. When you define a pool of memcached daemons (e.g. using session.save_path="tcp://192.168.0.1:11211, tcp://192.168.0.2:11211") you can't know which store will be used. Data is distributed evenly across the pool, meaning an item might be stored in the first store or in the last (or in both, if the memcache client is configured to replicate - note that it is the client that handles replication; the memcached server does not do it itself). Either way, putting localhost first in the pool won't improve performance - there is roughly a 50% chance of hitting either store.
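To make that concrete, here is a rough sketch of the idea behind server selection. It is not the exact algorithm pecl/memcache uses (the "standard" and "consistent" hash strategies differ in detail); it just illustrates why the order of servers in the pool doesn't matter:

<?php
// Illustrative only: hash the key, then pick a server by modulo.
// This mimics the idea of key-based server selection, not the exact
// pecl/memcache implementation.
$servers = array('192.168.0.1:11211', '192.168.0.2:11211');

function pick_server(array $servers, $key) {
    $index = abs(crc32($key)) % count($servers);
    return $servers[$index];
}

// The session ID decides the target server, so sessions end up spread
// roughly 50/50 across the pool regardless of which server is listed first.
echo pick_server($servers, 'abc123sessionid'), "\n";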
Having done a little bit of testing and research I have concluded that you CAN share sessions across servers using memcache, BUT you probably don't want to - it doesn't seem to be popular because it doesn't scale as well as using a shared database and it is not as robust. I'd appreciate feedback on this so I can learn more...
Ignore the following unless you have a PHP app:
Tip 1: If you want to share sessions across 2 servers using memcache:
Ensure you answered Yes to "Enable memcache session handler support?" when you installed the PHP memcache client, and add the following to your /etc/php.d/memcache.ini file:
session.save_handler = memcache
On webserver 1 (IP: 192.168.0.1):
session.save_path="tcp://192.168.0.1:11211"
On webserver 2 (IP: 192.168.0.2):
session.save_path="tcp://192.168.0.1:11211"
Tip 2: If you want to share sessions across 2 servers using memcache AND have failover support:
Add the following to your /etc/php.d/memcache.ini file:
memcache.hash_strategy = consistent
memcache.allow_failover = 1
On webserver 1 (IP: 192.168.0.1):
session.save_path="tcp://192.168.0.1:11211, tcp://192.168.0.2:11211"
On webserver 2 (IP: 192.168.0.2):
session.save_path="tcp://192.168.0.1:11211, tcp://192.168.0.2:11211"
Notes:
- This highlights another mistake I made in the original question - I wasn't using an identical session.save_path on all servers.
- In this case "failover" means that should one memcache daemon fail, the PHP memcache client will start using the other one. i.e. anyone who had their session in the store that failed will be logged out. It is not transparent failover.
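The identical-save_path mistake above is easy to catch with a small check script: run the same script on every webserver and the output must match. This is just a sketch using ini_get():

<?php
// Print the settings that must be identical on every webserver for
// Tip 2 to behave as expected.
$settings = array(
    'session.save_handler',
    'session.save_path',
    'memcache.hash_strategy',
    'memcache.allow_failover',
);

foreach ($settings as $setting) {
    printf("%-25s => %s\n", $setting, ini_get($setting));
}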
Tip 3: If you want to share sessions using memcache AND have transparent failover support:
Same as tip 2, except you also need to add the following to your /etc/php.d/memcache.ini file:
memcache.session_redundancy=2
Notes:
- This makes the PHP memcache client write the sessions to 2 servers. You get redundancy (like RAID-1): writes are sent to n mirrors and failed gets are retried on the mirrors. This means users do not lose their session if one memcache daemon fails.
- Mirrored writes are done in parallel (using non-blocking I/O), so performance shouldn't drop much as the number of mirrors increases. However, network traffic will increase if your memcache mirrors are distributed on different machines. For example, there is no longer a 50% chance of using localhost and avoiding network access.
- Apparently, the delay in write replication can cause old data to be retrieved instead of a cache miss. The question is whether this matters to your application? How often do you write session data?
- memcache.session_redundancy is for session redundancy, but there is also a memcache.redundancy ini option that can be used by your PHP application code if you want it to have a different level of redundancy.
- You need a recent version (still in beta at this time) of the PHP memcache client - Version 3.0.3 from pecl worked for me.
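For completeness, here is a sketch of how memcache.redundancy might be used for your application's own cache data (as opposed to sessions). The key name, value and TTL are made up, and you should verify whether the option can be changed at runtime with ini_set() or has to go in memcache.ini:

<?php
// Mirror application cache writes to 2 servers, independently of
// memcache.session_redundancy (which only covers session data).
ini_set('memcache.redundancy', 2);

$cache = new Memcache();
$cache->addServer('192.168.0.1', 11211);
$cache->addServer('192.168.0.2', 11211);

// With redundancy = 2 the set() is mirrored, and a failed get() on one
// daemon can be retried on the other.
$cache->set('expensive_result', 'some value', 0, 300);
var_dump($cache->get('expensive_result'));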
If your keys have unequal access patterns you will see unequal traffic to each memcached node. For example, if you have 2 keys, one (a) which is get/set 500 times per second and another (b) which is get/set 250 times per second, then the node which contains a will have twice as much traffic as the node which contains b.
In my case, we had 8 memcached nodes with a few thousand keys. One of those keys was doing about 800 gets/sec at peak traffic and almost every other key was doing less than 1 get/sec. The memcached node which had the busy key exhibited significantly higher traffic than the others.
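If you suspect a hot key, comparing per-server counters makes it obvious which node is taking the traffic. A minimal sketch using pecl/memcache's getExtendedStats() (the server addresses are placeholders):

<?php
// Compare cmd_get / cmd_set across the pool to spot a "hot" node.
$memcache = new Memcache();
$memcache->addServer('192.168.0.1', 11211);
$memcache->addServer('192.168.0.2', 11211);

foreach ($memcache->getExtendedStats() as $server => $stats) {
    if ($stats === false) {
        echo "$server: unreachable\n";
        continue;
    }
    echo "$server: cmd_get={$stats['cmd_get']} cmd_set={$stats['cmd_set']}\n";
}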
If you want to balance the traffic equally to each of your memcached nodes then you either need to:
- Play games with your keying to make sure that your busy keys are spread out properly (see the sketch after this list).
- Switch to using repcached or Membase to replicate the keys across multiple nodes
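Here is a sketch of the first option, i.e. "playing games with your keying": shard the hot key into several copies so reads hash to different nodes. The key names and shard count are made up for illustration, and note the write cost goes up by the shard count:

<?php
// Spread one hot key across N shard keys so gets hit different nodes.
$memcache = new Memcache();
$memcache->addServer('192.168.0.1', 11211);
$memcache->addServer('192.168.0.2', 11211);

$shards = 8;  // number of copies of the hot key

// Writer: update every shard copy.
function set_hot_key(Memcache $memcache, $shards, $key, $value) {
    for ($i = 0; $i < $shards; $i++) {
        $memcache->set($key . ':' . $i, $value, 0, 60);
    }
}

// Reader: pick a random shard, so reads spread across the pool.
function get_hot_key(Memcache $memcache, $shards, $key) {
    return $memcache->get($key . ':' . mt_rand(0, $shards - 1));
}

set_hot_key($memcache, $shards, 'busy_key', 'payload');
var_dump(get_hot_key($memcache, $shards, 'busy_key'));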
Best Answer
Nearly all of the "distributed" part of memcached is handled on the client side.
If you have multiple memcached servers defined in your config (I see the php tag on your post, so I'm guessing you're using pecl/memcache, but I think the syntax is similar for pecl/memcached), the client will determine which server to put the data on using a hash of the key. There's an option on the addServer method (retry_interval = -1) so that if a memcached server goes down, your PHP will not keep trying it.
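A sketch of the addServer() call being described, using pecl/memcache's signature addServer(host, port, persistent, weight, timeout, retry_interval, status, ...). Passing -1 for retry_interval disables automatic retry of a failed server (the hosts here are placeholders):

<?php
$memcache = new Memcache();

// Default behaviour: a failed server is retried every 15 seconds.
$memcache->addServer('192.168.0.1', 11211);

// retry_interval = -1: once this server is marked failed, the client
// will not keep trying it (persistent=true, weight=1, timeout=1).
$memcache->addServer('192.168.0.2', 11211, true, 1, 1, -1);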
There is some information about how you can do "replication" in memcache, but from my experience it's not really worth the effort or the "wasted" memory (you'd have to store all caches on all servers, whereas if you just use the built-in distribution mechanism, each item only has to be stored on one server. Obviously, if one of your servers dies, you're going to get cache misses until that data is stored on another server, but you shouldn't be using memcache as a persistent store anyway). The memcache client protocol is pretty smart. ;)
Original link to https://blogs.oracle.com/trond/entry/replicate_your_keys_to_multiple removed as it no longer exists.