Merge sort versus quick sort performance

algorithms, complexity, performance, sorting

I have implemented merge sort and quick sort using C (GCC 4.4.3 on Ubuntu 10.04 running on a 4 GB RAM laptop with an Intel DUO CPU at 2GHz) and I wanted to compare the performance of the two algorithms.

The prototypes of the sorting functions are:

void merge_sort(const char **lines, int start, int end);

void quick_sort(const char **lines, int start, int end);

i.e. both take an array of pointers to strings and sort the elements with index i such that start <= i <= end.
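
So a call on a whole array of n lines looks like this (the sample array below is just for illustration, not my actual test data):

const char *sample[] = { "pear", "apple", "fig", "kiwi", "date" };
int n = sizeof(sample) / sizeof(sample[0]);

merge_sort(sample, 0, n - 1);   /* sorts indices 0 through n-1 inclusive */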

I have produced some files containing random strings with length on average 4.5 characters. The test files range from 100 lines to 10000000 lines.

I was a bit surprised by the results because, even though I know that merge sort has complexity O(n log(n)) while quick sort is O(n^2) in the worst case, I have often read that on average quick sort should be as fast as merge sort. However, my results are the following.

  • Up to 10000 strings, both algorithms perform equally well. For 10000 strings, both require about 0.007 seconds.
  • For 100000 strings, merge sort is slightly faster with 0.095 s against 0.121 s.
  • For 1000000 strings merge sort takes 1.287 s against 5.233 s of quick sort.
  • For 5000000 strings merge sort takes 7.582 s against 118.240 s of quick sort.
  • For 10000000 strings merge sort takes 16.305 s against 1202.918 s of quick sort.

So my question is: are my results as expected, meaning that quick sort is comparable in speed to merge sort for small inputs but, as the size of the input data grows, the fact that its complexity is quadratic will become evident?

Here is a sketch of what I did.
In the merge sort implementation, the divide step consists of calling merge sort recursively on the two halves, i.e.

merge_sort(lines, start, (start + end) / 2);
merge_sort(lines, 1 + (start + end) / 2, end);

Merging of the two sorted sub-arrays is performed by reading the data from the array lines and writing it to a global temporary array of pointers (this global array is allocated only once). After each merge the pointers are copied back to the original array. So the strings themselves are stored only once, but I need twice as much memory for the pointers.
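
For concreteness, the merge step can be sketched roughly like this (simplified; the names tmp and merge are just for illustration and may differ from my actual code, and tmp is assumed to have been allocated once with room for all the pointers):

#include <string.h>

static const char **tmp;   /* global temporary array of pointers, allocated once */

static void merge(const char **lines, int start, int mid, int end)
{
    int i = start, j = mid + 1, k = start;

    /* Repeatedly take the smaller head of the two sorted halves. */
    while (i <= mid && j <= end) {
        if (strcmp(lines[i], lines[j]) <= 0)
            tmp[k++] = lines[i++];
        else
            tmp[k++] = lines[j++];
    }
    while (i <= mid)
        tmp[k++] = lines[i++];
    while (j <= end)
        tmp[k++] = lines[j++];

    /* Copy the merged pointers back into the original array. */
    for (k = start; k <= end; k++)
        lines[k] = tmp[k];
}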

For quick sort, the partition function chooses the last element of the subarray being sorted as the pivot and scans the preceding elements in a single loop. After it has produced a partition of the form

start ... {elements <= pivot} ... pivotIndex ... {elements > pivot} ... end

it calls itself recursively:

quick_sort(lines, start,          pivotIndex - 1);
quick_sort(lines, pivotIndex + 1, end);

Note that this quick sort implementation sorts the array in place and does not require additional memory, so it is more memory efficient than the merge sort implementation.
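
A rough sketch of that partition step (essentially the Lomuto scheme with the last element as pivot; the names swap_ptr and partition are illustrative and details may differ slightly from my actual code):

#include <string.h>

static void swap_ptr(const char **a, const char **b)
{
    const char *t = *a;
    *a = *b;
    *b = t;
}

/* Partitions lines[start..end] around lines[end] and returns the
   final position of the pivot. */
static int partition(const char **lines, int start, int end)
{
    const char *pivot = lines[end];
    int store_index = start;
    int i;

    for (i = start; i < end; i++) {
        if (strcmp(lines[i], pivot) <= 0) {
            swap_ptr(&lines[i], &lines[store_index]);
            store_index++;
        }
    }
    swap_ptr(&lines[store_index], &lines[end]);
    return store_index;
}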

So my question is: is there a better way to implement quick sort that is worth trying out? If I improve the quick sort implementation and perform more tests on different data sets (averaging the running times over the data sets), can I expect quick sort to perform better with respect to merge sort?

EDIT

Thank you for your answers.

My implementation is in-place and is based on the pseudo-code I found on Wikipedia in the section "In-place version":

function partition(array, 'left', 'right', 'pivotIndex')

where I choose the last element in the range to be sorted as the pivot, i.e. pivotIndex := right. I have checked the code over and over again and it seems correct to me. To rule out the possibility that I am using a wrong implementation, I have uploaded the source code to GitHub (in case you would like to take a look at it).

Your answers seem to suggest that I am using the wrong test data. I will look into it and try out different test data sets. I will report as soon as I have some results.

Best Answer

If you look at your code for swapping, you have:

// If current element is lower than pivot
// then swap it with the element at store_index
// and move the store_index to the right.

But ~50% of the time the string you just swapped needs to be moved back again later, which is why faster quick sort partition schemes work from both ends at the same time.

Next, if you check whether each sub-range still contains more than one element before making the recursive call, you avoid wasting time calling a function only to have it exit immediately. This happens on the order of 10,000,000 times in your final test, which adds a noticeable amount of time.

Use:

if (pivot_index - 1 > start) quick_sort(lines, start, pivot_index - 1);

if (pivot_index + 1 < end) quick_sort(lines, pivot_index + 1, end);

You still want an outer function to do an initial if (start < end), but that only needs to happen once, so the outer function can simply call an unguarded version of your code that skips that comparison.
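
Something along these lines, assuming a partition function like the one sketched in the question (the names are illustrative):

/* Inner version: assumes start < end, so it never checks it. */
static void quick_sort_unguarded(const char **lines, int start, int end)
{
    int pivot_index = partition(lines, start, end);

    /* Recurse only into sub-ranges that still hold at least two elements. */
    if (pivot_index - 1 > start)
        quick_sort_unguarded(lines, start, pivot_index - 1);
    if (pivot_index + 1 < end)
        quick_sort_unguarded(lines, pivot_index + 1, end);
}

/* Public entry point: the start < end check happens exactly once. */
void quick_sort(const char **lines, int start, int end)
{
    if (start < end)
        quick_sort_unguarded(lines, start, end);
}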

Also, picking a random pivot tends to avoid the O(n^2) worst case, but it's probably not a big deal with your random data set.
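
If you want to try it, one cheap way that leaves the rest of the partition code untouched is to swap a randomly chosen element into the last slot before partitioning (a sketch; rand() should be seeded once with srand() at startup):

#include <stdlib.h>

/* Move a random element of lines[start..end] into the pivot slot
   (the last position), then partition as before. */
static void choose_random_pivot(const char **lines, int start, int end)
{
    int r = start + rand() % (end - start + 1);
    const char *t = lines[r];

    lines[r] = lines[end];
    lines[end] = t;
}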

Finally, the hidden problem is that quick sort ends up comparing strings in ever smaller buckets whose contents are ever closer together in sorted order.

(Edit: so AAAAA, AAAAB, AAAAC, AAAAD, and then AAAAA, AAAAB. strcmp needs to step through a lot of A's before reaching the useful parts of the strings.)

With merge sort, by contrast, you work on the smallest buckets first, while their contents are still fairly random. Merge sort's final passes do compare a lot of strings that are close to each other, but it's less of an issue by then. One way to make quick sort faster for strings is to compare the first characters of the strings at the ends of the range and, if they are the same, skip them in the inner comparisons, but you have to be careful that all the strings are long enough that you're not skipping past the null terminator.
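
One safe way to realize that idea is to compute, before recursing into a sub-range, the longest prefix shared by every string in it, and start all comparisons within that sub-range at that offset; the shared prefix can never be longer than the shortest string in the range, so you cannot run past a null terminator. A sketch (whether the extra scan pays for itself is something you would have to measure):

#include <string.h>

/* Length of the prefix shared by every string in lines[start..end]. */
static int common_prefix(const char **lines, int start, int end)
{
    int len = (int) strlen(lines[start]);
    int i;

    for (i = start + 1; i <= end && len > 0; i++) {
        int k = 0;
        while (k < len && lines[i][k] != '\0' && lines[i][k] == lines[start][k])
            k++;
        len = k;
    }
    return len;
}

/* Comparisons inside that sub-range can then start past the prefix, e.g.
   strcmp(lines[i] + prefix_len, pivot + prefix_len)                       */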