Azure – Memcached session manager in Azure: Connection gets forcibly closed

azure memcached sticky-sessions tomcat

I am using Memcached Session Manager (MSM) to handle Tomcat sessions in non-sticky mode. My deployment in Azure consists of a Worker Role with two instances, which connect to an Azure VM running my Memcached server.

Everything works well: the session is persisted and can be retrieved transparently by either of the two instances. The problem arises when the session is idle for about 4 minutes. Everything indicates that the Azure load balancer closes the spymemcached connection to the VM after a period of inactivity.

My MSM configuration is this:

<Manager className="de.javakaffee.web.msm.MemcachedBackupSessionManager"
    memcachedNodes="n1:my-azure-vm.cloudapp.net:11211"
    sticky="false"
    sessionBackupAsync="false"
    sessionBackupTimeout="10000"
    lockingMode="uriPattern:/path1|/path2"
    requestUriIgnorePattern=".*\.(ico|png|gif|jpg|css|js|ttf|eot|svg|woff)$"           
    transcoderFactoryClass="de.javakaffee.web.msm.serializer.kryo.KryoTranscoderFactory"
    customConverter="de.javakaffee.web.msm.serializer.kryo.HibernateCollectionsSerializerFactory"/>

The stacktrace printed by the spymemcached client is this:

INFO net.spy.memcached.MemcachedConnection:  Reconnecting due to exception on {QA sa=/10.194.132.206:13000, #Rops=1, #Wops=0, #iq=0, topRop=net.spy.memcached.protocol.binary.StoreOperationImpl@1d95da8, topWop=null, toWrite=0, interested=1}
java.io.IOException: An existing connection was forcibly closed by the remote host
    at sun.nio.ch.SocketDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(Unknown Source)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
    at sun.nio.ch.IOUtil.read(Unknown Source)
    at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
    at net.spy.memcached.MemcachedConnection.handleReads(MemcachedConnection.java:303)
    at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:264)
    at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:184)
    at net.spy.memcached.MemcachedClient.run(MemcachedClient.java:1298)

Given this idle-time limitation in Azure, is there any way to make MSM work in the Azure cloud?

Best Answer

There's nothing available to solve this out of the box. But you could subclass MemcachedBackupSessionManager and use the backgroundProcess method to ping your configured memcached nodes. Tomcat invokes this method periodically (I'm not sure of the exact interval; I believe it is every 10 seconds by default, governed by backgroundProcessorDelay), which is well within the idle window you describe. A very simple implementation looks like this:

package de.javakaffee.web.msm;

public class MyMsm extends MemcachedBackupSessionManager {

    @Override
    public void backgroundProcess() {
        super.backgroundProcess();
        final MemcachedNodesManager nodesManager = _msm.getMemcachedNodesManager();
        // go through all configured node ids and ping each memcached
        // with a dummy key.
        // _msm.newSessionId("ping") generates e.g. ping-n1 for a nodeId n1
        // so this will be routed to the related memcached node
        for (String nodeId : nodesManager.getPrimaryNodeIds()) {
            // use async here so that no error handling is needed
            _msm.getMemcached().asyncGet(_msm.newSessionId("ping"));
        }
    }
}

Then package this class in a jar, place the jar in $CATALINA_HOME/lib alongside the msm jars, and change the Manager class name to className="de.javakaffee.web.msm.MyMsm".

If you like, you can also fork msm and submit a pull request with an addition that makes this configurable :-)
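If you would rather not depend on Tomcat's background thread, the same keep-alive idea can be sketched with a plain scheduler. This is only a rough illustration: KeepAlivePinger, pingKeys, pingAll and start are hypothetical names, not part of MSM or spymemcached, and the ping action is passed in as a callback so the sketch does not assume any particular client API. The "ping-<nodeId>" key format simply mirrors the comment in the code above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical helper (not part of MSM or spymemcached). */
final class KeepAlivePinger {

    /** Builds one dummy key per node id, e.g. "ping-n1" for node id "n1". */
    static List<String> pingKeys(List<String> nodeIds) {
        List<String> keys = new ArrayList<>();
        for (String nodeId : nodeIds) {
            keys.add("ping-" + nodeId);
        }
        return keys;
    }

    /** Invokes the ping callback once per node; a failing ping does not stop the others. */
    static void pingAll(List<String> nodeIds, Consumer<String> ping) {
        for (String key : pingKeys(nodeIds)) {
            try {
                ping.accept(key);
            } catch (RuntimeException e) {
                // Deliberately ignored: the ping only exists to generate traffic.
            }
        }
    }

    /**
     * Runs pingAll immediately and then every intervalSeconds on a daemon thread.
     * The caller is responsible for calling shutdown() on the returned scheduler.
     */
    static ScheduledExecutorService start(List<String> nodeIds,
                                          Consumer<String> ping,
                                          long intervalSeconds) {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "memcached-keepalive");
                    t.setDaemon(true); // do not keep the JVM alive just for pings
                    return t;
                });
        scheduler.scheduleAtFixedRate(
                () -> pingAll(nodeIds, ping), 0, intervalSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

With spymemcached, the callback could be something like key -> client.asyncGet(key), assuming client is your MemcachedClient instance. Any interval comfortably below the roughly 4-minute idle window from the question (say 60 seconds) should keep the connection from going quiet.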
