Magento – How to fully load customers when returned in a customer collection

address-attribute, collection, customer, magento-1, magento-1.9

First, my sincere apologies if this is a duplicate. I've searched ad nauseam, but all I get are results about loading a single customer (by ID, by email, etc.).

My issue is that I am loading all the customers that belong to a certain group, like so:

$collection = Mage::getModel('customer/customer')
                  ->getCollection()
                  ->joinTable(
                      array('groups' => 'customer/customer_group'),
                      'customer_group_id = group_id',
                      array('group_name' => 'customer_group_code'))
                  ->addAttributeToSelect('*')
                  ->addFieldToFilter('group_name', $desiredGroupName);

Absolutely no problems here! I get the customers (Mage_Customer_Model_Customer) and can iterate over them (which internally calls the collection's load method). However, I have extended the customer model with a small snippet that loads an address attribute from the default shipping address, like so (shortened and anonymised for readability):

class My_Customer_Model_Customer extends Mage_Customer_Model_Customer
{
    protected function _afterLoad()
    {
        parent::_afterLoad();

        // load() always returns a model object, so test for an ID rather than truthiness
        $address = Mage::getModel('customer/address')->load($this->getDefaultShipping());

        if ($address->getId()) {
            $this->_data['my_value'] = $address;
        }

        return $this;
    }
}

Now, as we may or may not know, when customer models are loaded by the collection, their _beforeLoad, load and _afterLoad methods are not fired. So while we get customer models in a collection, they are not complete models: they have not been loaded individually, and thus no addresses are linked yet for me to read my_value from.

The desired outcome is to load my collection, then iterate over it and store the my_value values in an array, like so:

$myValues = array();
foreach ($collection as $item) {
    var_dump($item); // debug
    // Assign first: empty() on a method call is a fatal error before PHP 5.5
    $myValue = $item->getMyValue();
    if (!empty($myValue)) {
        $myValues[$item->getId()] = $myValue;
    }
}

As we learnt earlier, _afterLoad is never fired for these objects and thus $item->_data['my_value'] is never set. I did try calling $item->load(); within the loop (more in hope than anything), but load requires the ID as its first parameter.

Likewise I tried $item->load($item->getId()) and this works. However, for a collection of 1,000 customers it's very expensive: it works out as 1 heavily joined EAV query to get the customer collection plus 1,000 heavily joined queries to load each individual customer within the loop.

Does anyone have a suggestion as to how to make the customers returned in the collection load properly? Some ideas are:

  • Overload the Mage_Customer_Model_Resource_Customer_Collection->load() method. This would make the load a lot more expensive, though, and this base collection is used all over Magento.
  • Extend my filtered customer collection with the relevant joins. Because of the default address indirection, I'd have to join the customer to get the default shipping address ID, then join that to the address_entity table, then join that to the EAV address attribute, resulting in a similarly spaghetti-like query.
  • Preload the relevant attributes (default_shipping from the customer entity and my_value from the address entity). This shortcuts the query somewhat, as we can then cheat and do our joins with a.value = b.value AND b.attribute_id = [whatever], which makes them better. I am still not a fan of taking a supposedly EAV-ready collection and extending it with the 3 or 4 joins necessary to load more EAV data.
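
For the second and third ideas, Magento 1's EAV collections already expose a joinAttribute() method; the admin customer grid uses it to pull billing address fields through default_billing. A minimal sketch, assuming the address attribute code is my_value (a stand-in for the real, anonymised attribute); the helper name joinDefaultShippingValue is hypothetical:

```php
<?php
/**
 * Hypothetical helper: left-joins one address attribute onto a customer
 * collection through each customer's default_shipping address ID.
 *
 * $collection is assumed to be a customer EAV collection exposing
 * joinAttribute(); $attributeCode is an assumed address attribute code.
 */
function joinDefaultShippingValue($collection, $attributeCode, $alias = 'my_value')
{
    // Arguments: alias, "entity_type/attribute", bind field, filter, join type
    return $collection->joinAttribute(
        $alias,
        'customer_address/' . $attributeCode,
        'default_shipping',
        null,
        'left'
    );
}
```

Called before the collection is loaded, this should let each item return the value via $item->getData('my_value') without a per-customer load(). It is still a join under the hood, so the generated SQL is worth inspecting before relying on it.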

Does anyone have any good ideas or suggestions apart from creating a "maniac's query"?

PS> To avoid stale and unanswered questions, I'll probably end up going down the maniac's query route and will post what I end up using by way of example and self answer.

Edit 1

In response to Marius' kind reply, I just wanted to explain what I'm trying to achieve, as his answer, while correct, has missed the intention of my question!

The point is I don't actually have any address IDs to work with!

The code itself is instrumental in organising and applying updates and inserts attached to an import function.

The way the system works is that every delivery address for all customers has a code with several uses: to assist delivery routing in the factory, to identify the matching account in the business systems, and to identify a related customer entity. This is a requirement of the client, so I can't just change the code.

Now the customer relationship is strictly many-to-one, in that many customers are linked to one "manager", for want of a better word. The managers are customers belonging to the previously mentioned customer group.

When the customer import runs I am given their main code. I then use this code to look up the manager, thus validating that the manager exists. The manager's code is also stored in the default shipping address in order to match the model above.

This is the aim, simplified:

  • Many customers may be imported at once; these customers are imported into the [1] => 'Customers' group
  • Within the import data I am given an ID/Code on import (example M22)
  • I need to look up the relevant Customer belonging to the [2] => 'Managers' group. This customer has the M22 Code attached to their default shipping address
  • Because of many customers being imported and relatively few managers I am attempting to preload the managers collection (as described above) and am trying to build a simple array of Code => CustomerId for the managers
  • In order to retrieve the manager's code, I need to:

    1. Load the default address id within the collection
    2. Load the relevant attribute value attached to the default address object indicated by the defaultShippingID
    3. Iterate and populate code to customer ID array
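
The three steps above can be sketched in plain PHP, independent of how the data is fetched. The function name and the array shapes are assumptions for illustration only: $defaultShippingByCustomer maps customer ID to default shipping address ID, and $codeByAddress maps address ID to the manager code.

```php
<?php
/**
 * Hypothetical helper: builds a "code => customer ID" lookup from two
 * preloaded arrays, without touching the database.
 */
function buildCodeToCustomerIdMap(array $defaultShippingByCustomer, array $codeByAddress)
{
    $map = array();
    foreach ($defaultShippingByCustomer as $customerId => $addressId) {
        // Skip managers with no default shipping address or no code on it
        if (!$addressId
            || !isset($codeByAddress[$addressId])
            || $codeByAddress[$addressId] === ''
        ) {
            continue;
        }
        $map[$codeByAddress[$addressId]] = $customerId;
    }
    return $map;
}

// Example data (made up): customer 12 has no default shipping address
$managers = array(10 => 501, 11 => 502, 12 => null);
$codes    = array(501 => 'M22', 502 => 'M23');
print_r(buildCodeToCustomerIdMap($managers, $codes));
```

With this sample data the map sends M22 to customer 10 and M23 to customer 11, and customer 12 is skipped. During the import, each incoming code then becomes a single array lookup.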

Thanks @Marius for the suggestion though, I will probably wind up loading a more generic collection and linking the attributes via joins, pretty much like how customers are loaded anyway (aka the "Maniac's Query").

The more I think about it, the more I am convinced that Magento's business logic is centred around expensively loading all attributes for an object: it's always either load everything for a specific entity or load some (just enough) for a list of entities.

Best Answer

You don't want your afterLoad event to fire and trigger your method, because you would be in a similar situation to calling load for each customer. I see that your method calls load on the address model. For a 1k-item collection you will get 1k load calls. Not cool.
This may not be the best idea, but it should work with 1 query and a few loops through the customer collection.

Why not collect all the address IDs you need, then retrieve an address collection, then match that collection to the customers from the customer collection?

Let's say that your customer collection is $collection.
You applied filters and everything to it.

$addressIds = array();
$customersByDefaultShipping = array();
foreach ($collection as $item) {
    $defaultShipping = $item->getDefaultShipping();
    if ($defaultShipping) {
         $addressIds[] = $defaultShipping;
         $customersByDefaultShipping[$defaultShipping] = $item;
    }
}
//get all the addresses.  
$addressCollection = Mage::getModel('customer/address')->getCollection()
    ->addAttributeToSelect('*') //or replace with an array with the attributes you need
    ->addAttributeToFilter('entity_id', array('in' => $addressIds));
//now loop through the address collection and attach the addresses to the customers;  

foreach ($addressCollection as $address) {
    $customersByDefaultShipping[$address->getId()]->setData('my_value', $address);
}

The beauty of it is that, since the customers are objects, they are passed by reference, so you can now loop through the customer collection and the ones that have an address will have it attached, even though I used arrays in the code above.
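
A small plain-PHP sketch of that point, using a stdClass object as a stand-in for a customer model (the variable names and values are illustrative only). Strictly speaking, PHP copies the object handle rather than passing a reference, but the effect here is the same: the array element and the original variable point at one object.

```php
<?php
$customer = new stdClass();
$customer->id = 42;

// Storing the object in an array copies the handle, not the object itself
$customersByDefaultShipping = array(7 => $customer);

// Modifying it through the array...
$customersByDefaultShipping[7]->my_value = 'M22';

// ...is visible through the original variable
var_dump($customer->my_value === 'M22'); // bool(true)
```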

Notice: Untested code. Watch out for typos and even logic errors.