Algorithm Performance – Expected Running Time vs. Average Running Time

algorithms, complexity, performance, terminology

Let's say we want to analyze the running time of algorithms. Sometimes we say we want to find the running time of an algorithm for an input of size n in the worst possible case, and we denote it by O(n). Sometimes, though, I see books/papers saying that we need to find the expected time of an algorithm, and sometimes the average running time is used.

What is "expected time"? In which cases is it useful to find expected time instead of worst case time?

Edit: I think there is a subtle difference between expected running time and average running time, but I am not sure. Through this post I want to know the exact difference, if any.

Best Answer

Expected time is just the average (i.e. expected) running time of the algorithm on the intended input.

Say you've got a few million user records and want to sort them. You might want to use the algorithm that is most suitable for your input and therefore gives the best expected running time, as opposed to an algorithm with a better worst-case running time but a worse expected running time.

Sometimes, for example, the constant factors hidden in an algorithm's time complexity are so high that it makes sense to use an algorithm with worse time complexity but smaller constant factors, because it gives you a better expected running time on small inputs, even though it would be horribly outperformed on bigger inputs.
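As a rough illustration (not part of the original answer), here is a small Python sketch comparing a hypothetical `insertion_sort` with a `merge_sort` on a tiny input; the helper names, the input size, and the timing setup are all assumptions, but on inputs this small the asymptotically worse insertion sort typically wins thanks to its smaller constant factors.

```python
# Sketch only: insertion sort is O(n^2) worst case but has tiny constant
# factors, so on small inputs it often beats an O(n log n) merge sort.
import random
import timeit

def insertion_sort(a):
    a = list(a)
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:   # shift larger elements right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

def merge_sort(a):
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # merge the two sorted halves
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

if __name__ == "__main__":
    small = [random.random() for _ in range(32)]   # illustrative small input
    print("insertion:", timeit.timeit(lambda: insertion_sort(small), number=10000))
    print("merge:    ", timeit.timeit(lambda: merge_sort(small), number=10000))
```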

Perhaps a better example is the classic quicksort algorithm, which has a worst-case running time of O(n²) but an expected running time of O(n log n), regardless of input. This is because the algorithm uses (or rather, may use, depending on the implementation) randomization, making it a so-called randomized algorithm: it runs a bit differently on every invocation, even with the same input. As a result, there's no universal worst-case input for the implementation, because the worst case depends on how the algorithm randomly chooses the pivot for partitioning the given input, so one can't just supply some pre-defined input that forces the worst-case running time. This is often the point of randomized algorithms, which aim for a good expected running time regardless of the input.
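To make that concrete, here is a minimal randomized quicksort sketch in Python (the function name and the list-comprehension partitioning are illustrative, not a reference implementation). Because the pivot is chosen at random, the O(n log n) bound holds in expectation for every input, while any single run can still be unlucky.

```python
import random

def randomized_quicksort(a):
    """Sort a list. The random pivot choice gives O(n log n) expected time
    for every input, though an unlucky run can still degrade toward O(n^2)."""
    if len(a) <= 1:
        return list(a)
    pivot = random.choice(a)                       # randomization: no fixed worst-case input
    less    = [x for x in a if x < pivot]
    equal   = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return randomized_quicksort(less) + equal + randomized_quicksort(greater)

print(randomized_quicksort([5, 3, 8, 1, 9, 2]))    # [1, 2, 3, 5, 8, 9]
```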

It's all about using the right algorithm for the input at hand.