Magento – mass_action index process

Tags: ce, indexing, magento-1.9, mass-action

We are having the problem that the mass_action index event never seems to get processed. This has the side effect that the stored event data for this entry grows substantially over time.

In our case, the event data grows to several MB over the course of several days.

mysql> select type, entity, count(*), avg(length(new_data)), max(length(new_data)) from index_event group by type, entity;
+-----------------------+--------------------------------+----------+-----------------------+-----------------------+
| type                  | entity                         | count(*) | avg(length(new_data)) | max(length(new_data)) |
+-----------------------+--------------------------------+----------+-----------------------+-----------------------+
| catalog_reindex_price | catalog_product                |     1368 |              442.7982 |                   443 |
| mass_action           | catalog_product                |        1 |          6176981.0000 |               6176981 |
| save                  | awaffiliate_withdrawal_request |        6 |              444.3333 |                   445 |
| save                  | cataloginventory_stock_item    |     4439 |              607.9685 |                   608 |
| save                  | catalog_category               |       71 |             1040.3662 |                  3395 |
| save                  | catalog_eav_attribute          |        3 |              424.6667 |                   476 |
| save                  | catalog_product                |     1368 |             1269.0899 |                  1922 |
+-----------------------+--------------------------------+----------+-----------------------+-----------------------+

Because this data is unserialized and re-serialized on every update, and transferred to and from the DB server each time, an update to the mass_action entry currently takes around 3 s to complete. This affects any code that triggers an update of this index.
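
A rough way to confirm where that time goes is to measure the (de)serialization of the oversized row directly. The following is only a diagnostic sketch (not Magento core code), assuming a stock 1.9 install and a script run from the Magento root:

    <?php
    // Hypothetical diagnostic: time the unserialize/serialize round trip of the
    // oversized mass_action row. This mirrors what happens when Magento merges
    // new data into an existing index event, but it is not the core code itself.
    require_once 'app/Mage.php';
    Mage::app();

    $resource = Mage::getSingleton('core/resource');
    $read     = $resource->getConnection('core_read');
    $table    = $resource->getTableName('index/event');

    $row = $read->fetchRow(
        "SELECT event_id, LENGTH(new_data) AS size, new_data
           FROM {$table}
          WHERE type = 'mass_action'"
    );

    $t0 = microtime(true);
    $data = unserialize($row['new_data']);
    $t1 = microtime(true);
    serialize($data);
    $t2 = microtime(true);

    printf(
        "event %d: %d bytes, unserialize %.2fs, serialize %.2fs\n",
        $row['event_id'],
        $row['size'],
        $t1 - $t0,
        $t2 - $t1
    );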

As I understand it, triggering an update of any of the following indexes will also update the mass_action event (a rough PHP equivalent of this check follows the query output below):

mysql> select ip.indexer_code, ipe.process_id, ipe.event_id, ipe.status, ie.type, ie.created_at from index_process_event as ipe left join index_event as ie on ipe.event_id = ie.event_id  left join index_process as ip on ip.process_id = ipe.process_id where ie.type  = 'mass_action';
+---------------------------+------------+----------+--------+-------------+---------------------+
| indexer_code              | process_id | event_id | status | type        | created_at          |
+---------------------------+------------+----------+--------+-------------+---------------------+
| catalog_product_attribute |          1 |     9074 | new    | mass_action | 2016-11-03 23:18:06 |
| catalog_product_price     |          2 |     9074 | new    | mass_action | 2016-11-03 23:18:06 |
| catalogsearch_fulltext    |          7 |     9074 | new    | mass_action | 2016-11-03 23:18:06 |
| cataloginventory_stock    |          8 |     9074 | new    | mass_action | 2016-11-03 23:18:06 |
| tag_summary               |          9 |     9074 | new    | mass_action | 2016-11-03 23:18:06 |
| catalog_product_flat      |         19 |     9074 | new    | mass_action | 2016-11-03 23:18:06 |
| catalog_category_product  |         21 |     9074 | new    | mass_action | 2016-11-03 23:18:06 |
+---------------------------+------------+----------+--------+-------------+---------------------+
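
Roughly the same information can be pulled through the index API instead of raw SQL. This is only a sketch, assuming the standard Mage_Index models; it lists, per process, how many events are still queued as unprocessed:

    <?php
    // Hypothetical sketch: count unprocessed index events per process, which
    // should mirror the index_process_event rows with status 'new' shown above.
    require_once 'app/Mage.php';
    Mage::app();

    $processes = Mage::getSingleton('index/indexer')->getProcessesCollection();
    foreach ($processes as $process) {
        // events this process has not yet marked as done
        $unprocessed = $process->getUnprocessedEventsCollection();
        printf(
            "%-30s status=%-16s unprocessed=%d\n",
            $process->getIndexerCode(),
            $process->getStatus(),
            $unprocessed->getSize()
        );
    }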

We have all indexes set to manual and run cron jobs at various times of the day to update them; a sketch for setting the mode from code follows the table below.

mysql> select * from index_process;
+------------+-------------------------------+-----------------+---------------------+---------------------+--------+
| process_id | indexer_code                  | status          | started_at          | ended_at            | mode   |
+------------+-------------------------------+-----------------+---------------------+---------------------+--------+
|          1 | catalog_product_attribute     | require_reindex | 2016-11-15 00:40:10 | 2016-11-15 00:10:24 | manual |
|          2 | catalog_product_price         | require_reindex | 2016-11-15 00:45:09 | 2016-11-15 00:15:44 | manual |
|          7 | catalogsearch_fulltext        | require_reindex | 2016-11-14 23:51:23 | 2016-11-14 12:12:30 | manual |
|          8 | cataloginventory_stock        | require_reindex | 2016-11-15 00:47:01 | 2016-11-15 00:45:09 | manual |
|          9 | tag_summary                   | require_reindex | 2016-11-14 23:54:01 | 2016-11-14 23:54:01 | manual |
|         12 | awaffiliate_affiliate_balance | pending         | 2016-11-14 23:54:01 | 2016-11-14 23:54:03 | manual |
|         18 | catalog_url                   | require_reindex | 2016-11-14 23:54:03 | 2016-11-14 21:02:53 | manual |
|         19 | catalog_product_flat          | require_reindex | 2016-11-15 00:49:02 | 2016-11-15 00:10:10 | manual |
|         20 | catalog_category_flat         | pending         | 2016-11-15 00:51:01 | 2016-11-15 00:51:04 | manual |
|         21 | catalog_category_product      | require_reindex | 2016-11-15 00:53:01 | 2016-11-15 00:06:04 | manual |
|         22 | ampgrid_qty_sold              | require_reindex | 2016-11-15 00:06:04 | 2016-11-14 12:21:18 | manual |
+------------+-------------------------------+-----------------+---------------------+---------------------+--------+
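
For completeness, the manual mode shown in the table can also be set from code rather than through the admin UI. A minimal sketch, assuming the standard Mage_Index_Model_Process constants:

    <?php
    // Hypothetical sketch: force every index process into manual mode,
    // which is what the `mode` column above reflects.
    require_once 'app/Mage.php';
    Mage::app();

    foreach (Mage::getSingleton('index/indexer')->getProcessesCollection() as $process) {
        $process->setMode(Mage_Index_Model_Process::MODE_MANUAL)->save();
        printf("%-30s -> %s\n", $process->getIndexerCode(), $process->getMode());
    }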

Re-index cron schedule:

0-59/15 *  *   *   *  cd html && /usr/bin/php  /www/sites/files/html/shell/indexer.php --reindex catalog_product_price > /dev/null 2>&1
2-59/15 *  *   *   *  cd html && /usr/bin/php  /www/sites/files/html/shell/indexer.php --reindex cataloginventory_stock > /dev/null 2>&1
4-59/15 *  *   *   *  cd html && /usr/bin/php  /www/sites/files/html/shell/indexer.php --reindex catalog_product_flat > /dev/null 2>&1
6-59/15 *  *   *   *  cd html && /usr/bin/php  /www/sites/files/html/shell/indexer.php --reindex catalog_category_flat > /dev/null 2>&1
8-59/15 *  *   *   *  cd html && /usr/bin/php  /www/sites/files/html/shell/indexer.php --reindex catalog_category_product > /dev/null 2>&1
10-59/15 *  *   *   *  cd html && /usr/bin/php  /www/sites/files/html/shell/indexer.php --reindex catalog_product_attribute > /dev/null 2>&1

10 4  *   *   *    cd html && /usr/bin/php  /www/sites/files/html/shell/indexer.php --reindex catalogsearch_fulltext > /dev/null 2>&1
20 4  *   *   *    cd html && /usr/bin/php  /www/sites/files/html/shell/indexer.php --reindex ampgrid_qty_sold > /dev/null 2>&1
30 4  *   *   *    cd html && /usr/bin/php  /www/sites/files/html/shell/indexer.php --reindex awaffiliate_affiliate_balance > /dev/null 2>&1
40 4  *   *   *    cd html && /usr/bin/php  /www/sites/files/html/shell/indexer.php --reindex tag_summary > /dev/null 2>&1

50  */6   *   *  *  cd html && /usr/bin/php  /www/sites/files/html/shell/indexer.php --reindex catalog_url > /dev/null 2>&1
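
As far as I can tell, each of these cron entries ends up calling reindexEverything() on the corresponding index process, so a single entry is roughly equivalent to this sketch (the process code is just an example):

    <?php
    // Rough equivalent of one cron entry, e.g. the catalog_product_price line;
    // assumes shell/indexer.php ultimately calls reindexEverything() per process.
    require_once 'app/Mage.php';
    Mage::app();

    $process = Mage::getSingleton('index/indexer')->getProcessByCode('catalog_product_price');
    if ($process) {
        $process->reindexEverything();
    }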

Index timings:

$ time php shell/indexer.php --reindexall
Product Attributes index was rebuilt successfully in 00:00:21
Product Prices index was rebuilt successfully in 00:00:32
Search Index index was rebuilt successfully in 00:02:31
Stock Status index was rebuilt successfully in 00:00:00
Tag Aggregation Data index was rebuilt successfully in 00:00:00
Affiliates Balance index was rebuilt successfully in 00:00:02
Catalog URL Rewrites index was rebuilt successfully in 00:10:08
Product Flat Data index was rebuilt successfully in 00:01:54
Category Flat Data index was rebuilt successfully in 00:00:04
Category Products index was rebuilt successfully in 00:00:18
Qty Sold index was rebuilt successfully in 00:00:15

real    16m9.562s
user    8m23.551s
sys     0m19.705s

My assumption is that running a full re-index would process the mass_action event and clear it out of the index_event table. Unfortunately, this is not the case. Does anyone know under which conditions that event gets cleared out?

Best Answer

Did you notice the indexing time of some of your indexes? It varies from a few seconds to more than ten minutes. Depending on how your cron jobs are configured, your store might be busy indexing all day, continuously. This could be your problem: it's not able to finish the indexing runs, or at least not keep up with them. Having an index locked at all times can cause trouble and is known to lead to this type of issue.

I have to make some assumptions here, so correct me if needed, and add some more information about your setup if you think this might be related to your problem. I have been working on a project that helps developers clean out their core_url_rewrite table, because it grows substantially over time in certain scenarios. I think yours has this problem too, since your URL rewrite index is by far the slowest (over ten minutes), and the issue you describe could be related to it. I see similar things in test cases.

Is your problem specific to the mass_action event, or does it happen with the other event types as well (save, delete, reindex, etc.; see Mage_Index_Model_Event)? If it is not specific to mass_action, I would say it's related to the indexes not being processed properly. Given that there could be locks on the processes or tables required for indexing, I'm not sure about this. You can easily check for active locks with something like:

    $indexes = Mage::getSingleton('index/indexer')->getProcessesCollection()->load();
    foreach ($indexes as $index) {
        if ($index->isLocked()) {
            echo "Index " . $index->getIndexerCode() . " with state " . $index->getStatus()
                . " is locked since " . $index->getStartedAt() . "!" . PHP_EOL;
        }
    }

Or use my gist; don't forget to remove it when you're done. It's not meant for production use.

One-page index & lock status overview

So I think that when you fix your index times, your problem might go away and the store could run much more smoothly. In the case of the core_url_rewrite table, the overhead is generated by Magento itself: in an effort to keep URLs unique, it ends up duplicating them. This has consequences for both SEO and performance. The solution is to make the URLs unique and clear out all the overhead without damaging your SEO scores. Once the table is clean, you'll notice a big difference in index times. Some of my cases started generating sitemaps again after months.

Cleaning it can be tricky, but the magerun bundle I put together from scripts I used can help you with at least the rewrite table. Currently it's a proof of concept, so make sure you have backups. Once it has proven useful, I will rebuild it properly.

Magerun bundle with commands for cleaning core_url_rewrite

As for the other tables, I have to assume that something similar is causing overhead, since I have no other information to relate an issue to. Maybe you could add some more details, such as the size of your catalog, server specs, and store/scope configuration; they all affect your index performance. You might also want to check your tables to make sure they aren't missing constraints, etc.

Magento DB Repair

There's one stack post containing a great collection of information on Magento's indexes, just in case you haven't seen it already.

Stack post on indexes

I hope this is of some value to you. Good luck!