Magento 2 Mview – Understanding Indexing with MySQL Triggers

indexindexingmagento2mviewmysql-triggers

First of all what I know :

Index management is useful for increasing store performance.

EAV has one disadvantage.it will store data in different tables. So that retrieving data is time-consuming.

So that we will store data into a single table.
when data is changed we will update this single table (nothing but indexing update)

mysql trigger:
perform some query actions based on some table insert/update/delete.

So Magento using a trigger for example when the price is updating it will store entity_id into the changelog table.

there is a statement in devdocs for implementing triggers magento2 using Magento/Framework/Mview.

can you anyone explain the flow of this functionality.

I mean what is view, action, processor etc?

Best Answer

In official documentation: https://devdocs.magento.com/guides/v2.3/extension-dev-guide/indexing.html there is the stement:

`Allows tracking database changes for a certain entity (product, category and so on) and running change handler.
Emulates the materialized view technology for MySQL using triggers and separate materialization process (provides executing PHP code instead of SQL queries, which allows materializing multiple queries).`

MView stands for Materialized View which is a snapshot of the database at a point in time. https://en.wikipedia.org/wiki/Materialized_view Why would we need to duplicate tables. Indexers are costly to run, especially when there is traffic on category pages, customers place orders and admins save products. On product save the cache gets invalidated (off topic). In case of stock indexer, before it ends the execution, it sends the entity ids affected as cache tags to be cleaned (full page cache type). In Magento 2.0 categories ids of purchased products are sent. In Magento 2.1 the product ids are sent.

There are 2 MySQL tables that keep indexer codes and statuses:

  • indexer_state
  • mview_state

mview_state works with Update by Schedule in Admin > System > Indexer Management

Update by Schedule makes the indexers to be run in cron.

There are 3 entries in Magento_Indexer/etc/contab.xml:

<group id="index">
    <job name="indexer_reindex_all_invalid" instance="Magento\Indexer\Cron\ReindexAllInvalid" method="execute">
        <schedule>* * * * *</schedule>
    </job>
    <job name="indexer_update_all_views" instance="Magento\Indexer\Cron\UpdateMview" method="execute">
        <schedule>* * * * *</schedule>
    </job>
    <job name="indexer_clean_all_changelogs" instance="Magento\Indexer\Cron\ClearChangelog" method="execute">
        <schedule>0 * * * *</schedule>
    </job>
</group>
  • indexer_reindex_all_invalid is run on indexer_state. There are is still the need to run 'normal' indexers in cron
  • indexer_update_all_views is run on mview_state
  • indexer_clean_all_changelogs - clears changelogs used by mview_state

Note that cron indexer group tasks run in a separate php process, as declared in etc/contab_groups.xml: <use_separate_process>1</use_separate_process>.

Changelog tables are: [indexer name]_cl (suffixed with _cl). e.g. cataloginventory_stock_cl. If you have indexers set to Update by Schedule and save a product in admin you'll see the entity_id of that product in this table. It's a big circle, I'm thinking place order or create shipment will add here an entry too.

Someone provided an example in official devdoc on how to create new materialized views and what are the interface methods required (disregard the above statement about orders in the snippet bellow):

<?php
<VendorName>\Merchandizing\Model\Indexer;
class Popular implements \Magento\Framework\Indexer\ActionInterface,   \Magento\Framework\Mview\ActionInterface
{
public function executeFull(); //Should take into account all placed orders in the system
public function executeList($ids); //Works with a set of placed orders (mass actions and so on)
public function executeRow($id); //Works in runtime for a single order using plugins
public function execute($ids); //Used by mview, allows you to process multiple placed orders in the "Update on schedule" mode
}

This will make sense: //public function execute($ids); Used by mview, allows you to process multiple **entities** in the "Update on schedule" mode } Where $ids parameter has the entities ids from *_cl tables.

What is the link between cache invalidation and indexers. Categories pages are now full page cached (built-in full page cache or through Varnish).

There is \Magento\Indexer\Model\Processor\InvalidateCache::afterUpdateMview:

/**
 * Update indexer views
 *
 * @param \Magento\Indexer\Model\Processor $subject
 * @return void
 * @SuppressWarnings(PHPMD.UnusedFormalParameter)
 */
public function afterUpdateMview(\Magento\Indexer\Model\Processor $subject)
{
    if ($this->moduleManager->isEnabled('Magento_PageCache')) {
        $this->eventManager->dispatch('clean_cache_after_reindex', ['object' => $this->context]);
    }
}

Back to Magento\Indexer\Cron\UpdateMview::execute():

/**
 * Regenerate indexes for all invalid indexers
 *
 * @return void
 */
public function execute()
{
    $this->processor->updateMview();
}

Magento\Indexer\Model\Processor::updateMview():

/**
 * Update indexer views
 *
 * @return void
 */
public function updateMview()
{
    $this->mviewProcessor->update('indexer');
}

In app/etc/di.xml there is:

<preference for="Magento\Framework\Mview\ProcessorInterface" type="Magento\Framework\Mview\Processor" />


/**
 * Materialize all views by group (all views if empty)
 *
 * @param string $group
 * @return void
 */
public function update($group = '')
{
    foreach ($this->getViewsByGroup($group) as $view) {
        $view->update();
    }
}

Magento\Framework\Mview\ViewInterface

/**
 * Materialize view by IDs in changelog
 *
 * @return void
 * @throws \Exception
 */
public function update();

app/etc/di.xml

 <preference for="Magento\Framework\Mview\ViewInterface" type="Magento\Framework\Mview\View" />

In Magento\Framework\Mview\View::update() there is:

$action = $this->actionFactory->get($this->getActionClass());
$this->getState()->setStatus(View\StateInterface::STATUS_WORKING)->save();
..
$action->execute($ids);
..

If you search in vendor/ directory for Magento\Framework\Mview\ActionInterface you'll find for example this:

In \Magento\CatalogInventory\Model\Indexer:

class Stock implements \Magento\Framework\Indexer\ActionInterface, \Magento\Framework\Mview\ActionInterface

In this class there is:

/**
 * Execute materialization on ids entities
 *
 * @param int[] $ids
 *
 * @return void
 */
public function execute($ids)
{
    $this->_productStockIndexerRows->execute($ids);
}

And it looks like it goes back to 'normal' class of indexers' execute` method which is used by MView.

About cache cleaning after Stock Indexer. When an order is placed on checkout, the quantities are subtracted using this observer: \Magento\CatalogInventory\Observer\SubtractQuoteInventoryObserver

$itemsForReindex = $this->stockManagement->registerProductsSale(
    $items,
    $quote->getStore()->getWebsiteId()
);

Further, another observer triggers indexer (but not directly on Mview/Indexer by Schedule): \Magento\CatalogInventory\Observer\ReindexQuoteInventoryObserver

if ($productIds) {
    $this->stockIndexerProcessor->reindexList($productIds);
}

In Mview case, when the new quantities are subtracted in SubtractQuoteInventoryObserver, the MySQL trigger (created for Mview) will insert a row in cataloginventory_stock_cl, marking that a reindex (stock & fulltext) needs to be done to those purchased product ids. There are many MySQL triggers created for Mview. See them all with SHOW TRIGGERS;.

When a product gets out of stock after checkout you'll see 2 rows inserted in that table (Magento saves 2 times stock item in these 2 observers).

When cron runs stock indexer in Mview mode the affected product ids (in M2.1) or categories ids (in M2.0) are sent to cache clean as cache tags. By cache I mean full page cache type. Example: catalog_product_99 or other cache tag format depending on the Magento version. The same when Mview is not enabled.

\Magento\CatalogInventory\Model\Indexer\Stock\AbstractAction::_reindexRows

...
$this->eventManager->dispatch('clean_cache_by_tags', ['object' => $this->cacheContext]);

And Magento_PageCache has observer \Magento\PageCache\Observer\FlushCacheByTags that will clean full page cache type by tags. It does it for buil-in full page cache. Varnish related code is in \Magento\CacheInvalidate\Observer\InvalidateVarnishObserver.

There is a free extension that will deny cache clean on still in stock products after customer checkout:

https://github.com/daniel-ifrim/innovo-cache-improve

Cache cleaning only on out of stock products after checkout was introduced in default Magento 2.2.x. See \Magento\CatalogInventory\Model\Indexer\Stock\CacheCleaner.

I'm thinking the cron execution for indexer in Admin > Stores > Configuration > Advanced > System > Cron configuration options for group: index should be set to much more than 1 minute.

Related Topic