Magento 2.2.5 – Role and Functionality of Indexing Tables Explained

I've read the documentation on indexing in Magento 2, but it only scratches the surface. I would like to understand how the internals of the indexing mechanism work. Can somebody explain this to me? Taking the category/product relation as an example, this is what I have found so far.

The following tables are involved in my setup:

  • catalog_category_product
  • catalog_category_product_cl
  • catalog_category_product_index
  • catalog_category_product_index_replica
  • catalog_category_product_index_store1
  • catalog_category_product_index_store1_replica
  • catalog_category_product_index_store2
  • catalog_category_product_index_store2_replica
  • catalog_category_product_index_tmp

So far, I understand the following about the indexing process:

  • Category/Product relations are stored in catalog_category_product.
  • Whenever there is a change to this relation, a new version is pushed to the changelog (catalog_category_product_cl). The Materialized View implementation picks it up by comparing the changelog version with the version stored in the mview_state table.
  • catalog_category_product_index_tmp is populated during indexing and swapped with catalog_category_product_index at the end of the process.
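The version comparison described in the list above can be sketched in plain Python. This is a toy model, not Magento code: the `Changelog` class, the `update_mview` function and the `state` dictionary are hypothetical stand-ins for a `*_cl` table, the indexer update step and one row of `mview_state`. It assumes the changelog holds an auto-incrementing version together with the changed entity ID.

```python
from dataclasses import dataclass, field


@dataclass
class Changelog:
    """Toy stand-in for a *_cl table: rows of (version_id, entity_id)."""
    rows: list = field(default_factory=list)

    def record(self, entity_id: int) -> int:
        # Each change gets the next version number, like an auto-increment key.
        version_id = len(self.rows) + 1
        self.rows.append((version_id, entity_id))
        return version_id

    def current_version(self) -> int:
        return self.rows[-1][0] if self.rows else 0

    def ids_between(self, from_version: int, to_version: int) -> list:
        # Distinct entity IDs with from_version < version_id <= to_version.
        return sorted({e for v, e in self.rows if from_version < v <= to_version})


def update_mview(changelog: Changelog, state: dict, reindex) -> None:
    """Reindex only the entities changed since the version stored in `state`."""
    last = state["version_id"]
    current = changelog.current_version()
    if current > last:
        reindex(changelog.ids_between(last, current))
        state["version_id"] = current


changelog = Changelog()
state = {"version_id": 0}  # hypothetical stand-in for one mview_state row
reindexed = []             # collects the IDs passed to the "indexer"

for product_id in (10, 11, 10):  # product 10 changes twice
    changelog.record(product_id)

update_mview(changelog, state, reindexed.extend)
```

After the call, `reindexed` holds each changed ID once and `state["version_id"]` has caught up with the changelog, so a second call with no new changes does nothing.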

These are the questions I still have:

  • What is the purpose of the _replica tables?
  • Why are there indexes per store even though catalog_category_product_index has a store_id column?
  • Why are there differences between the rows with store_id = 1 in catalog_category_product_index and the rows in catalog_category_product_index_store1? To clarify: in catalog_category_product_index I have a product that is assigned to 6 category IDs, whereas the same product in the dedicated _store1 index is assigned to only 4 category IDs. This could be a bug.

Any help or explanation from core Magento developers or more experienced Magento developers is appreciated.

Best Answer

catalog_category_product_index_tmp is used to store intermediate data for a batch of products during indexing.

*_replica tables are used during a full reindex so that the storefront is not affected while indexing is in progress.

*_store* tables were added in 2.2.5 to split the index per store; the catalog_category_product_index table has become outdated as a result.
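The general build-then-swap technique behind a replica table can be sketched with Python's built-in `sqlite3` module. This is an illustration under assumptions, not Magento's implementation: the table names `idx`, `idx_replica` and `idx_outdated` and the function `full_reindex` are hypothetical, and the swap is done with plain table renames.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE idx (category_id INTEGER, product_id INTEGER);
    CREATE TABLE idx_replica (category_id INTEGER, product_id INTEGER);
    INSERT INTO idx VALUES (3, 10);  -- stale data the storefront currently reads
""")


def full_reindex(conn, fresh_rows):
    # 1. Rebuild the replica while readers keep querying `idx`.
    conn.execute("DELETE FROM idx_replica")
    conn.executemany("INSERT INTO idx_replica VALUES (?, ?)", fresh_rows)
    # 2. Swap the names so the freshly built table becomes the live one;
    #    the old live table becomes the replica for the next run.
    conn.execute("ALTER TABLE idx RENAME TO idx_outdated")
    conn.execute("ALTER TABLE idx_replica RENAME TO idx")
    conn.execute("ALTER TABLE idx_outdated RENAME TO idx_replica")
    conn.commit()


full_reindex(conn, [(3, 10), (4, 10), (4, 11)])
live_rows = conn.execute(
    "SELECT category_id, product_id FROM idx ORDER BY category_id, product_id"
).fetchall()
```

Readers only ever see the old complete index or the new complete index, never a half-built one, which is the point of building into a separate table first.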
