Magento – Magento 1: Performance optimizations to delete entities

best practicecollection;magento-1memory leakperformance

I'm currently trying to improve a couple of modules regarding performance.

Some of you may know the usage of the walk() method on collection which is very useful to avoid looping through products directly.

On top of that and thanks to @Vinai, one can also use the collection delete() method.

But I noticed that Magento 1 native files don't always use any of those methods for deletion.

One of the worst code I've seen is the massDelete() method from app/code/core/Mage/Adminhtml/controllers/Catalog/ProductController.php where products are loaded in a loop before deletion.

foreach ($productIds as $productId) {
    $product = Mage::getSingleton('catalog/product')->load($productId);
    Mage::dispatchEvent('catalog_controller_product_delete', array('product' => $product));
    $product->delete();
}

So I made some performance tests, added some logging calls to check the time taken and the memory usage for 100 products deletion.

Test 1: walk method

I replaced the original code pasted above with this code:

$collection = Mage::getResourceModel('catalog/product_collection')
                        ->addAttributeToSelect('entity_id')
                        ->addIdFilter($productIds)
                        ->walk('delete');

And my results are the following on my crappy dev server (average based on 10 tests):

  • Original code: 19.97 seconds, 15.84MB used
  • Custom code: 17.12 seconds, 15.45MB used

So for 100 products deletion, my custom code is 3 seconds faster and uses 0.4MB less.

Test 2: Using the collection delete() method

I replaced the original code with this one:

$collection = Mage::getResourceModel('catalog/product_collection')
                        ->addAttributeToSelect('entity_id')
                        ->addIdFilter($productIds)
                        ->delete();

And mind blown here are the results:

  • Original code: 19.97 seconds, 15.84MB used
  • Custom code: 1.24 seconds, 6.34MB used

So for 100 products deletion, my custom code is 18 seconds faster and uses 9MB less.

As stated in the comments, it seems like this method does not trigger the Magento events (after load, after delete) nor the index/cache flush.

Question

So my question is: is there a reason why the Magento core team did not use the walk('delete') or event better the collection delete() method instead of loading products in a loop (which we all know is a very very bad practice) ?

Main goal is to be aware of such key points in case of a module development: is there any particular cases where one can't use the walk / collection delete() method ?

EDIT: the reason is definitely not because of the catalog_controller_product_delete event being dispatched as the same code can be found in several places (check the massDelete methods) in Magento core. I've used the example of products to highlight the performance as they are usually the biggest entities

Best Answer

So I made some performance tests, added some logging calls to check the time taken and the memory usage for 100 products deletion

Side note, but you should look at using the Varien Profiler for this!

my custom code is 2seconds faster and uses 0.4MB less

While I don't doubt that your change would improve performance, it would be useful to provide the "before" results to compare the improvements.

is there a reason why the Magento core team did not use the walk('delete') instead of loading products in a loop (which we all know is a very very bad practice) ?

Well, we know from other questions on this forum the following:

  • The Magento codebase has developed and evolved over many years
  • It's had a lot of developers working on it
  • The Magento core development workflow processes have improved dramatically over the time they've worked on the platform, catching up with modern best practices and techniques to the point where Magento 2 now exhibits many leading modern application design practices

So I would suggest that the example you've found is probably one of potentially many gems hidden in the code that were written either a long time ago, and/or by a less experienced developer. Like much of the core code (and community code!) it would have been tested on a small data set and not battle tested, so performance may not have been closely monitored.

Is your improvement beneficial, and more closely aligned to best practice than the original code? Yes. You as a community Magento [1.x] developer however have no ability to contribute suggested improvements though like you do with Magento 2, so my suggestion would be to implement this in a local module if you require it for performance in one of your stores, or ignore it if it's not affecting you but you've noticed it while doing some research.

As an update to your question edit, I'm sure you're aware that the walk method in Varien_Data_Collection accepts an arbitrary callback, so you would be free to use it for anything you wanted to probably. For dispatching the event in the original example you could do that with the walk function, as well as the delete.

The only reason I could imagine that loading the product before deleting it would be of use may be that the observers attached to that event may need a full data set not available without loading the product first. If that is the case, it would explain why they use a singleton rather than a model to at least minimise the object overheads.

Related Topic