PHP Data Structures – When to Use Arrays vs Objects


PHP is a mixed-paradigm language that lets you use and return non-object data types, such as arrays. I am asking this question to clarify some guidelines for choosing between arrays and objects when deciding which programming construct to use in a particular situation.

This is really a question about ways to encode data using PHP language constructs, and about when one way is more likely to be picked over another for data-passing purposes (e.g. Service-Oriented Architecture or web services).

Example

Suppose you have an item type consisting of {cost, name, part_number, item_count}. Your program needs to display several such items, so you decide to use an array as an outer container to hold each of them. [You could also use PHP's ArrayObject for an OO approach, but my question is not about that (outer) array.] My question is about how to encode the item type data, and about which paradigm to use. PHP allows you to use native PHP arrays or PHP objects.

I can encode such data in two ways:

//PHP's associative arrays:
$ret = array(
    0 => array(
        'cost' => 10.00, 
        'name' => 'item1',
        'part_number' => 'zyz-100', 
        'item_count' => 15
        ),
    1 => array(
        'cost' => 34.00, 
        'name' => 'item2', 
        'part_number' => 'abc-230', 
        'item_count' => 42
        ),
  );

vs

//here ItemType is encapsulated into an object
$ret = array(
  0 => new ItemType(10.00, 'item1', 'zyz-100', 15),
  1 => new ItemType(34.00, 'item2', 'abc-230', 42),
);

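class ItemType
{
    private $price;
    private $name;
    private $partNumber;
    private $itemCount;

    public function __construct($price, $name, $partNumber, $itemCount)
    {
        // Store each constructor argument in the matching private property.
        $this->price = $price;
        $this->name = $name;
        $this->partNumber = $partNumber;
        $this->itemCount = $itemCount;
    }
}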

What I am thinking

Array encoding is light-weight, and more JSON-ready, but can be easier to mess up. Misspell one of the associative array keys and you may have an error that is more difficult to catch. But it is also easier to change on a whim. Say I don't want to store item_count anymore, I can use any text-processing software to easily remove all item_count instances in the array and then update other functions that use it accordingly. It may be a more tedious process, but it is simple.

Object-oriented encoding draws on IDE and PHP language facilities and makes it easier to catch errors beforehand, but it is harder to code up in the first place. I say harder because you have to think a bit about your objects and think ahead, and OO coding carries a higher cognitive load than typing up array structures. That said, once it is coded up, some changes may be easier to implement, in the sense that removing item_count, for example, will require changing fewer lines of code. But the changes themselves may still require a higher cognitive load than with the array method, since higher-level OO facilities are involved.
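To make the trade-off in the two paragraphs above concrete, here is a minimal sketch of how the same kind of typo behaves in each encoding. It assumes PHP 7 or later. The getter getItemCount() and the misspelled names are made up for illustration and are not part of the code in the question.

```php
<?php
// Array encoding: a misspelled key combined with ?? silently yields the default.
$item = [
    'cost' => 10.00,
    'name' => 'item1',
    'part_number' => 'zyz-100',
    'item_count' => 15,
];

// Typo: 'item_cout' instead of 'item_count'. Nothing is reported here.
$countFromArray = $item['item_cout'] ?? 0;

// Object encoding: the same kind of typo fails loudly.
class ItemType
{
    private $price;
    private $name;
    private $partNumber;
    private $itemCount;

    public function __construct($price, $name, $partNumber, $itemCount)
    {
        $this->price = $price;
        $this->name = $name;
        $this->partNumber = $partNumber;
        $this->itemCount = $itemCount;
    }

    // Hypothetical getter, added only for this sketch.
    public function getItemCount()
    {
        return $this->itemCount;
    }
}

$object = new ItemType(10.00, 'item1', 'zyz-100', 15);

$typoCaught = false;
try {
    $object->getItemCout(); // typo: this method does not exist
} catch (Error $e) {
    // PHP throws an Error for a call to an undefined method.
    $typoCaught = true;
}
```

Without the ?? operator, PHP does report the undefined key (a warning in PHP 8, a notice in PHP 7), but the expression still evaluates to null and execution continues, so the mistake can surface far from where it was made.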

Question

In some cases it is clear, like cases where I will need to perform manipulations on the data. But in some cases, where I need to just store a few lines of "Item Type" data, I don't have clear guidelines or considerations to lean on when trying to decide whether to use arrays or whether to construct objects. It seems I can just toss a coin and pick one. Is that the case here?

Best Answer

The way I see this, it depends on what you intend to do with the data afterwards. Based on a few simple checks you can determine which of the two data structures is better for you:

  1. Does this data have any logic associated with it?

    For example, is $price stored as an integer number of cents, so that a product with a price of $9.99 would have price = 999 and not price = 9.99? (Probably, yes.) Does partNumber need to match a specific regex? Do you need to be able to easily check whether the itemCount is available in your inventory? Will you need any of these functions in the future? If so, then your best bet is to create a class now. This means that you can build constraints and logic into the data structure: private $myPrice is set to 999 but $item->getPriceString() returns $9.99, and $item->inStock() is available to be called in your application.

  2. Are you going to be passing this data to multiple PHP functions?

    If so, then use a class. If you're generating this data once to perform some transformations on it, or just to send as JSON data to another application (JavaScript or otherwise), then an array is an easier choice. But if you have more than two PHP functions which accept this data as a parameter, use a class. If nothing else, that lets you define function someFunction(MyProductClass $product) { and it's very clear what your functions expect as input. As you scale out your code and add more functions, it will be much easier to know what type of data each function accepts. Seeing function someFunction($someArrayData) { is not nearly as clear. An array parameter also does not enforce type consistency, which means that (as you said) the flexible structure of the array can cause development pain later on.

  3. Are you building a library or shared code base?

    If so, use a class! Think about some new developer who is using your library, or another developer somewhere else in the company who has never used your code before. It will be much easier for them to look at a class definition and understand what that class does, or to see a number of functions in your library which accept objects of a certain class, than to try to guess what structure they need to build out of a number of arrays. This also touches on the data consistency issues from #1: if you're developing a library or shared code base, be nice to your users and give them classes which enforce data consistency and protect them from making errors with the design of your data.

  4. Is this a small part of an application or just a transformation of data? Do you not fit into any of the above?

    A class might be too much; use an array if it suits you and you find it easier. As mentioned above, if you're just generating structured data to send as JSON or YAML or XML or whatever, don't bother with a class unless there's a need to. If you are writing a small module in a larger application, and no other modules/teams need to interface with your code, maybe an array is sufficient.
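The first two checks can be sketched together as follows. This is one possible shape rather than the only one: it assumes PHP 7.4 or later (typed properties), the part-number pattern is invented for the example, and Product, getName() and describe() are illustrative names, while getPriceString() and inStock() follow the names mentioned in check 1.

```php
<?php
final class Product
{
    private int $priceInCents;
    private string $name;
    private string $partNumber;
    private int $itemCount;

    public function __construct(int $priceInCents, string $name, string $partNumber, int $itemCount)
    {
        // Assumed format for this sketch: three lowercase letters, a dash, three digits.
        if (preg_match('/^[a-z]{3}-\d{3}$/', $partNumber) !== 1) {
            throw new InvalidArgumentException("Invalid part number: $partNumber");
        }
        if ($priceInCents < 0 || $itemCount < 0) {
            throw new InvalidArgumentException('Price and item count must not be negative');
        }
        $this->priceInCents = $priceInCents;
        $this->name = $name;
        $this->partNumber = $partNumber;
        $this->itemCount = $itemCount;
    }

    // The price is stored as integer cents and formatted only for display.
    public function getPriceString(): string
    {
        return '$' . number_format($this->priceInCents / 100, 2);
    }

    public function inStock(): bool
    {
        return $this->itemCount > 0;
    }

    public function getName(): string
    {
        return $this->name;
    }
}

// The type declaration documents and enforces what this function expects.
function describe(Product $product): string
{
    $stock = $product->inStock() ? 'in stock' : 'out of stock';
    return $product->getName() . ' costs ' . $product->getPriceString() . ' (' . $stock . ')';
}

$product = new Product(999, 'item1', 'zyz-100', 15);
echo describe($product), PHP_EOL; // prints "item1 costs $9.99 (in stock)"
```

Passing a plain array to describe() raises a TypeError at the call site, which is the kind of early failure the array encoding cannot give you.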

Ultimately, consider the scaling needs of your code, and consider that a structured array might be a quick fix, but a class is a much more resilient and scalable solution.

Also, consider the following: if you have a class and you want to output JSON, there's no reason you can't define a forJSON() method on your class which returns a JSON-ifiable array of the data in the class. This is what I did in my PHP applications where I needed to send class data as JSON. As an example:

// Product is assumed to be defined elsewhere and to provide its own forJSON() method.
class Order {
    private $my_total = 0;
    private $my_lineitems = array(); // initialized so forJSON() works on an empty order

    public function getItems() { return $this->my_lineitems; }
    public function addItem(Product $p) { $this->my_lineitems[] = $p; }
    public function getTotal() { return $this->my_total; }

    // Returns a plain array that json_encode() can handle directly.
    public function forJSON() {
        $items_json = array();
        foreach ($this->my_lineitems as $item) {
            $items_json[] = $item->forJSON();
        }
        return array(
            'total' => $this->getTotal(),
            'items' => $items_json
        );
    }
}

$o = new Order();
// do some stuff with it
$json = json_encode($o->forJSON());
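As an alternative sketch, PHP's built-in JsonSerializable interface lets json_encode() call the conversion for you, so callers do not have to remember to call forJSON(). This is a swapped-in technique, not what the answer's code does: the answer uses the explicit forJSON() method above. The Product fields and the way the total is computed are assumptions made for this example.

```php
<?php
class Product implements JsonSerializable
{
    private $name;
    private $priceInCents;

    public function __construct(string $name, int $priceInCents)
    {
        $this->name = $name;
        $this->priceInCents = $priceInCents;
    }

    public function getPriceInCents(): int
    {
        return $this->priceInCents;
    }

    // Called automatically by json_encode().
    public function jsonSerialize(): array
    {
        return ['name' => $this->name, 'price' => $this->priceInCents];
    }
}

class Order implements JsonSerializable
{
    private $lineItems = [];

    public function addItem(Product $p): void
    {
        $this->lineItems[] = $p;
    }

    // Assumption for this sketch: the total is the sum of the item prices.
    public function getTotal(): int
    {
        $total = 0;
        foreach ($this->lineItems as $item) {
            $total += $item->getPriceInCents();
        }
        return $total;
    }

    public function jsonSerialize(): array
    {
        // Nested Product objects are serialized through their own jsonSerialize().
        return ['total' => $this->getTotal(), 'items' => $this->lineItems];
    }
}

$order = new Order();
$order->addItem(new Product('item1', 999));
$json = json_encode($order);
```

The design choice is mostly about who owns the conversion: with forJSON() the caller decides when to flatten the object, while with JsonSerializable the class itself defines its JSON shape once.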