How Preallocating Cell Arrays Improves Performance in MATLAB

MATLAB, performance

I was reading this article on MathWorks about improving MATLAB performance, and one of the first suggestions is to preallocate arrays, which makes sense. But it also says that preallocating cell arrays (that is, arrays which may contain different, unknown data types) will improve performance.

But how can that help? The data types are unknown, so even if MATLAB knows the shape of the cell array it doesn't know how much contiguous memory the contents will require, and therefore surely it can't preallocate the memory. So where does the improvement in performance come from?

I apologise if this question is better suited to StackOverflow than Programmers, but it isn't asking about a specific problem, so I thought it fit better here. Please let me know if I am mistaken.

Any explanation would be greatly appreciated 🙂

Best Answer

I don't know the details of MATLAB's memory handling, so this is just an educated guess: cell arrays are implemented so that the array itself contains only references (pointers) to the cells, which actually live on the heap. MATLAB definitely can't allocate memory for the actual cells in advance because, as you wrote, their size is unknown. However, it can preallocate the pointer array itself, since the size of a pointer is known. That way the pointer array doesn't have to be reallocated and copied every time the cell array grows.
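To make that guess concrete, here is a minimal sketch in C of what such a layout could look like. It is an illustration only, not MATLAB's actual implementation: `CellArray`, `cells_prealloc`, `cells_set`, `cells_append` and `cells_free` are hypothetical names made up for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a cell array: a block of fixed-size slots,
   each holding a pointer to contents that live elsewhere on the heap. */
typedef struct {
    void  **slots;  /* the pointer array, one pointer per cell */
    size_t  count;  /* number of slots */
} CellArray;

/* Preallocate: reserve all n slots in one allocation (n assumed > 0).
   Every slot starts as NULL, i.e. an empty cell; no memory for the
   contents is reserved, because their sizes are not known yet. */
int cells_prealloc(CellArray *c, size_t n) {
    c->slots = malloc(n * sizeof(void *));
    if (c->slots == NULL) { c->count = 0; return -1; }
    for (size_t i = 0; i < n; i++) c->slots[i] = NULL;
    c->count = n;
    return 0;
}

/* Store a copy of `size` bytes in slot i. The contents are allocated
   here, at the moment their size is finally known. */
int cells_set(CellArray *c, size_t i, const void *data, size_t size) {
    if (i >= c->count || data == NULL || size == 0) return -1;
    void *copy = malloc(size);
    if (copy == NULL) return -1;
    memcpy(copy, data, size);
    free(c->slots[i]);          /* release any previous contents */
    c->slots[i] = copy;
    return 0;
}

/* Growing without preallocation: every append resizes the pointer
   array, and realloc may have to move all existing pointers.
   The array must start out as {NULL, 0}. */
int cells_append(CellArray *c, const void *data, size_t size) {
    void **grown = realloc(c->slots, (c->count + 1) * sizeof(void *));
    if (grown == NULL) return -1;
    c->slots = grown;
    c->slots[c->count] = NULL;
    c->count += 1;
    return cells_set(c, c->count - 1, data, size);
}

/* Free the contents of every cell, then the pointer array itself. */
void cells_free(CellArray *c) {
    for (size_t i = 0; i < c->count; i++) free(c->slots[i]);
    free(c->slots);
    c->slots = NULL;
    c->count = 0;
}
```

In this sketch, one `cells_prealloc(&a, 1000)` followed by a thousand `cells_set` calls allocates the pointer array once. A thousand `cells_append` calls ask `realloc` to resize it every time, and each resize may copy all the pointers stored so far. Either way, the contents of each cell are allocated separately, only when they are assigned.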

When you think about it, it would be quite difficult to implement an array whose element size isn't constant: how would you know where in memory X[1234] lives if each element can have a different size? Therefore a layer of indirection (storing constant-sized pointers that point to the actual data) is quite useful. An alternative would be some sort of linked list, which is a different kind of trade-off.
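The indexing problem can be shown with a small C sketch. The function names `fixed_offset` and `packed_offset` are hypothetical, and the code only illustrates the arithmetic, not how MATLAB works internally.

```c
#include <stddef.h>

/* Fixed-size slots (e.g. pointers): the byte offset of element i
   follows from a single multiplication, whatever i is. */
size_t fixed_offset(size_t i, size_t slot_size) {
    return i * slot_size;
}

/* Variable-size elements packed back to back: to find element i,
   the sizes of all earlier elements have to be added up first. */
size_t packed_offset(const size_t *sizes, size_t i) {
    size_t offset = 0;
    for (size_t k = 0; k < i; k++) {
        offset += sizes[k];
    }
    return offset;
}
```

With 8-byte slots, `fixed_offset(1234, 8)` is simply 1234 * 8 = 9872 bytes from the start of the array. `packed_offset` has to loop over every preceding element, so a lookup gets slower the further into the array it goes, and changing the size of one element shifts everything after it.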
