Magento – AWS Elastic Load Balancer Creating Cache Misses

amazon-web-services cache cloud ee-1.13.1.0

Forgive me if my title is way off the mark, here. I am by no means an expert in DevOps or cloud architecture.

I've inherited a client store hosted on Amazon EC2, which runs a total of 10 instances (5 web, 1 separate admin, 1 solr, 1 monitoring, etc). The 5 web servers are managed through a single load balancer.

I don't know much about how AWS balances traffic, nor how to tune Magento's session and cache management to work with it, but I can observe noticeable latency on some page loads, in a seemingly random pattern.

For example, if I refresh a page 10 times, 5 of those loads may be instantaneous and confirmed to be served from FPC. The other 5, however, seem to have no cache from which to serve the page, so I'm left waiting for Magento to generate it from scratch.

My guess is that it has to do with the load balancer bouncing me around to different instances where cache has not yet been primed. How do I confirm this?

Further, is there a guide which describes the correct configuration of a load balancer in the context of Magento's cache and session management?

Useful notes: Magento EE 1.13.1.0, FPC enabled, Redis backend cache, Redis session management.

Update: Here also is a snippet of local.xml for this instance (URLs changed):

    <!-- backend cache -->
    <cache>
        <backend>Cm_Cache_Backend_Redis</backend>
        <backend_options>
            <server>some.server.cache.amazonaws.com</server>
            <port>6379</port>
            <database>0</database>
            <password></password>
            <force_standalone>0</force_standalone>
            <connect_retries>1</connect_retries>
            <automatic_cleaning_factor>0</automatic_cleaning_factor>
            <compress_data>1</compress_data>
            <compress_tags>1</compress_tags>
            <compress_threshold>20480</compress_threshold>
            <compression_lib>gzip</compression_lib>
            <persistent>1</persistent>
        </backend_options>
    </cache>

    <!-- full-page caching -->
    <full_page_cache>
        <backend>Cm_Cache_Backend_Redis</backend>
        <backend_options>
            <server>some.server.cache.amazonaws.com</server>
            <port>6379</port>
            <database>1</database>
            <password></password>
            <force_standalone>0</force_standalone>
            <connect_retries>1</connect_retries>
            <automatic_cleaning_factor>0</automatic_cleaning_factor>
            <!-- FPC data is already gzipped -->
            <compress_data>0</compress_data>
            <compress_tags>1</compress_tags>
            <compress_threshold>20480</compress_threshold>
            <compression_lib>gzip</compression_lib>
            <lifetimelimit>43200</lifetimelimit>
            <persistent>2</persistent>
        </backend_options>
    </full_page_cache>

    <!-- session caching -->
    <session_save>db</session_save>
    <redis_session>
        <host>some.server.cache.amazonaws.com</host>
        <port>6379</port>
        <db>3</db>
        <password></password>
        <timeout>2.5</timeout>
        <compression_threshold>2048</compression_threshold>
        <compression_lib>gzip</compression_lib>
        <log_level>1</log_level>
        <max_concurrency>6</max_concurrency>
        <break_after_frontend>5</break_after_frontend>
        <break_after_adminhtml>30</break_after_adminhtml>
        <bot_lifetime>7200</bot_lifetime>
        <persistent>3</persistent>
    </redis_session>

Best Answer

I believe you are seeing what I saw when I first set up a single Redis instance for both cache and sessions: a seemingly random 2.5 second timeout when accessing the session data. Note that your redis_session block also sets a timeout of 2.5. I found a couple of great articles by Colin Mollenhour and Fabrizio Branca on these sorts of issues.

TL;DR - use different Redis instances for cache, session and FPC.

This is for a couple of reasons:

  1. Redis is single-threaded, so if multiple databases on one instance are being accessed by multiple processes, you are more likely to see contention and waits for access to the data. See Colin's "Re: Using Redis as a Cache Backend in Magento" for more information on this point.
  2. You can create different configurations for each instance, allowing you to persist the sessions to disk, without persisting the cache data to disk. See Fabrizio's Redis Optimization post for more information on this.
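
As a rough sketch of the TL;DR, assuming you have three separate Redis endpoints available, the relevant parts of local.xml might look something like the following. The hostnames are placeholders I made up for illustration, not real servers, and every option not shown is assumed to stay exactly as in your snippet:

    <!-- backend cache: its own Redis instance (hostname is a placeholder) -->
    <cache>
        <backend>Cm_Cache_Backend_Redis</backend>
        <backend_options>
            <server>cache.redis.example.internal</server>
            <port>6379</port>
            <!-- the instance is dedicated, so database 0 is assumed here -->
            <database>0</database>
            <!-- remaining backend_options unchanged from your snippet -->
        </backend_options>
    </cache>

    <!-- full-page caching: a second, separate instance (placeholder) -->
    <full_page_cache>
        <backend>Cm_Cache_Backend_Redis</backend>
        <backend_options>
            <server>fpc.redis.example.internal</server>
            <port>6379</port>
            <database>0</database>
            <!-- remaining backend_options unchanged from your snippet -->
        </backend_options>
    </full_page_cache>

    <!-- sessions: a third instance (placeholder) -->
    <session_save>db</session_save>
    <redis_session>
        <host>session.redis.example.internal</host>
        <port>6379</port>
        <db>0</db>
        <!-- remaining redis_session options unchanged from your snippet -->
    </redis_session>

Whether sessions are persisted to disk is decided by each Redis server's own configuration rather than by local.xml, which is why having separate instances makes the second point possible.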

