How to Design Performance Comparison Between Data Structures

data structuresperformance

I want to compare the performance of two search trees of integers (an AVL tree and a RedBlack tree). How should I design/engineer the tests to accomplish this? For example, let's consider the insert operation; what steps should I follow in order to state that on average this operation is faster in the RB case? What considerations should I take to correctly measure CPU time? Should both implementations be optimized, or may I compare an optimized implementation of AVL vs a straightforward implementation of RB?

Any links or papers would be very helpful.

Best Answer

It highly depends on what you plan to do with the data structure. If you will end up filling it with a certain structured input, then you should also test it that way.

If you don't know anything about your future inputs and want to measure average performance, then remember that complexity theory calculates average performance based on randomized inputs (using a normal distribution). Hence, average-case performance test should include many runs with varying random inputs.

Depending on the data structures themselves, you may also be interested in comparing certain input structures that are known to be very good/bad for one of the data structures. Nevertheless, your future application of the data structure may almost never create such inputs, in which case you may well ignore the performance comparison of these cases. (Intuitive Example: comparing sort algorithms in a context, where you often try to sort an already sorted sequence could throw many quicksort implementations off.)

As for your optimization point, the answer again is: It depends. Are you aiming for using this data structure in exactly that one project right now? Then I'd go for optimized versions. Are you aiming at a comparison to get a general idea on which one might be more suitable for a planned project? Then try to compare reference implementations, but do not waste time on creating super-efficient implementations. Of course, the context in which you execute the tests has to be comparable, so don't try to f.ex. compare implementations in different programming languages to each other. Probably obvious, but I just thought I'd make sure to mention it.

Related Topic