Magento – Varien session start very slow on category pages with memcache session storage

magento-1.9 memcache memcached session

I am using memcache for session storage, and on category pages I have noticed New Relic transactions where the Varien session start can take over 30 seconds.

It could be something to do with session locking, but I thought this wasn't really an issue when using memcache.

Has anybody faced this, or does anyone have ideas about what could be causing it?

Best Answer

I've seen this quite a lot on New Relic as well.

From what I've seen there are a few different causes. I don't have a complete understanding of this issue, but it is something I've been looking into recently. Here are my findings.

Sessions in Magento, Locking, and New Relic

Every controller action in Magento uses the session, whether it needs to or not. The session is eagerly instantiated in Mage_Core_Controller_Varien_Action::preDispatch.

If you have session locking enabled, the session stays locked from the moment it is started until the request completes. I haven't found the bit of code that releases the session lock yet, but I'm pretty sure it's in there somewhere.

Ultimately this means that if you fire off multiple concurrent requests to Magento controller actions from one location using the same session, some of those requests have to wait for the others to complete and unlock the session before they can proceed. I usually see this as a slow transaction in New Relic stuck at Mage_Core_Model_Session_Abstract_Varien::start for ~30 seconds (which I think is my session lock wait timeout).
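To make the queueing effect concrete, here is a minimal simulation. It is not Magento code: it is a language-neutral sketch in Python that assumes one exclusive lock per session id, held for the whole request. The names (get_session_lock, handle_request, run_concurrent) and the 0.05 second request time are made up for illustration.

```python
import threading
import time

REQUEST_SECONDS = 0.05  # hypothetical time each request holds the session

session_locks = {}  # session id -> lock; stands in for the session backend's lock
registry_guard = threading.Lock()


def get_session_lock(session_id):
    """Return the one lock shared by every request that uses this session id."""
    with registry_guard:
        return session_locks.setdefault(session_id, threading.Lock())


def handle_request(session_id, waits, index):
    """Simulate one controller action: wait for the session, work, release."""
    started = time.monotonic()
    with get_session_lock(session_id):
        # Time spent blocked before this point is what a profiler would
        # attribute to the session start.
        waits[index] = time.monotonic() - started
        time.sleep(REQUEST_SECONDS)


def run_concurrent(session_ids):
    """Fire one request per session id at the same time; return each wait."""
    waits = [0.0] * len(session_ids)
    threads = [
        threading.Thread(target=handle_request, args=(sid, waits, i))
        for i, sid in enumerate(session_ids)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return waits


if __name__ == "__main__":
    shared = run_concurrent(["abc"] * 4)
    separate = run_concurrent(["a", "b", "c", "d"])
    print("same session, longest wait:      %.3fs" % max(shared))
    print("separate sessions, longest wait: %.3fs" % max(separate))
```

With a shared session id the requests are served one after another, so the last one waits roughly three request durations before its session "starts". With separate session ids nobody waits. Scale the request time up to a few seconds and add more parallel calls, and a lock wait timeout comes within reach.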

As I see it, these reports in New Relic have multiple downsides:

  • They drag up the average response time, because these requests are slower than they otherwise would have been.
  • New Relic records a sample of the slowest transactions. If I have a genuine performance bottleneck that takes, for example, 20 seconds, New Relic will not report it for me automatically when the same URL is plagued by session locking timeouts. The timeouts hide the useful data.

Causes

I've seen a few common causes for this. This is not a definitive list by any means.

Bots

Crawlers like Baidu and Yandex being a bit rude and battering the website. They run from one location, fire off numerous requests using the same session, and trip up the session locking mechanism, which shows up as slow transactions in New Relic.

Ajax calls to Magento controller actions

On websites that sit behind Varnish, customer-specific data must be loaded with care. Some websites manage this by using Ajax calls to the Magento backend to get the required data. I have also seen some websites using Ajax calls to the backend to get product-specific information, such as the amount left in stock when an item is on sale.

If a single page triggers multiple Ajax calls to the backend on page load, those calls share one session and can queue up behind the session lock. The more Ajax calls a page makes to the Magento backend, the more likely you are to experience locking.

Varnish ESI

Much the same as above, except that instead of Ajax calls the page uses Edge Side Includes, which as far as I can tell result in separate requests to the backend.

My plan

I have not actioned this yet, so it's still purely theoretical, but it's something I'm looking into doing over the next few months.

I brought this problem up during the Mage Titans UK 2016 conference and Fabrizio Branca pointed me towards the following module: https://github.com/AOEpeople/Aoe_BlackHoleSession.

Based on a regular expression, the module prevents bots from creating real sessions. This should mean that bots never hit a session lock and that your session resources aren't battered by rude bots. Bots should no longer pollute your New Relic readings.
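The idea can be sketched in a few lines. This is an illustration only, written in Python rather than as Magento code, and it assumes the match is made against the User-Agent header. The pattern and the handler names are hypothetical, not the module's shipped configuration.

```python
import re

# Hypothetical pattern for illustration; not the module's shipped default.
BOT_PATTERN = re.compile(r"bot|crawl|spider|baidu|yandex", re.IGNORECASE)


def is_bot(user_agent):
    """Return True when the User-Agent header looks like a crawler."""
    return bool(user_agent) and BOT_PATTERN.search(user_agent) is not None


def choose_session_handler(user_agent):
    """Pick a throwaway handler for bots and the real one for everyone else."""
    # "blackhole" stands for a handler that stores nothing and never locks.
    return "blackhole" if is_bot(user_agent) else "default"


if __name__ == "__main__":
    for agent in (
        "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/120.0",
    ):
        print(choose_session_handler(agent), "<-", agent)
```

Any real pattern needs checking against your own access logs, since an overly broad one would silently drop sessions for genuine customers.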

For Ajax/ESI calls that fetch customer data on cached pages, there's nothing you can do that I can see. You need access to the session in order to retrieve customer-specific data.

However, for Ajax/ESI calls that fetch catalog-specific data (such as limited stock), I don't see any need for a session to exist on that request at all. My plan for the future is to trial an extension to the Aoe_BlackHoleSession module so that I can silo off requests to specific URLs as being sessionless.
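As a rough sketch of what that routing decision might look like (again in Python for illustration, not Magento code): the URL prefix below is a made-up example of a catalog-only endpoint, and needs_session is a hypothetical helper, not something the module provides.

```python
from urllib.parse import urlsplit

# Hypothetical route prefixes that only ever return catalog data.
SESSIONLESS_PREFIXES = ("/stockinfo/ajax/",)


def needs_session(url):
    """Return False for requests that can safely skip the real session."""
    path = urlsplit(url).path
    # str.startswith accepts a tuple and matches any of its entries.
    return not path.startswith(SESSIONLESS_PREFIXES)


if __name__ == "__main__":
    for url in (
        "https://example.com/stockinfo/ajax/qty?id=42",
        "https://example.com/customer/account/",
    ):
        print(needs_session(url), "<-", url)
```

An explicit allow-list of sessionless routes seems safer than the reverse, because a route that is left off the list simply keeps its normal session.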

I'm less familiar with the internals of ESI, so sadly I don't have too much to comment there.

An alternative

During the conference Fabrizio Branca said he was able to disable session locking completely without any ill effects. Test at your own risk.