Merge Sort Recursion – Understanding Its Use

algorithms recursion

I don't want to post too much code, so I'll just post the part that involves recursion. This algorithm is fairly well known, so I think everybody knows the basic code.

void mergeSort(int array[], int l, int r) 
{
    if (l < r) 
    {
        int m = l + (r - l) / 2;
        mergeSort(array,l,m);
        mergeSort(array,m+1,r);

        merge(array,l,m,r);
    }
}

where the merge function is the typical function which compares two sets and organises them.

Now, I only understand the recursion here by "brute force". By this I mean that I have to actually compute each step along the way to see that it works. But I'm completely, 100% sure it would never have occurred to me to code things up this way.

I was hoping that someone could provide me with insight as to how I should think about these two recursive calls, and why it's obvious it had to be this way. I know this might seem broad, but I was hoping for some insight so that I don't have to use the "brute force" approach I mentioned above. Perhaps you could tell me how you think about these recursive calls (I don't think you compute each step like me, since that is extremely tedious).

Best Answer

The recursive MergeSort is a "divide and conquer" algorithm. The code you provided in your example basically does this (in plain English):

Find the middle point.
Sort the left half.
Sort the right half.
Merge the two halves back together.

The key to understanding how this works recursively is that, when it goes to sort the left half, it divides that into two parts again, just like it did with the first two halves, and so on. Visually, you can think of it as a tree-like structure, with the root at the top and progressively smaller branches extending downward.

This process of dividing continues until the code encounters an exit condition. Once that exit condition is encountered, the code begins returning back up the branches of the tree, merging the branches together as each recursive call returns up the tree.
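To watch the descent and the return happen without tracing every step by hand, here is a minimal sketch in C that wraps the question's mergeSort with two counters. The names traced_merge_sort, max_depth and merge_count are made up for this illustration, and merge is only one plausible version of the "typical" merge function (the question does not show its own). The 64-element temporary buffer is an assumption to keep the demo short.

```c
#include <string.h>

/* Hypothetical counters, added only to observe the recursion. */
static int max_depth = 0;   /* deepest level of the call tree reached */
static int merge_count = 0; /* merges performed while returning back up */

/* Merge the sorted runs array[l..m] and array[m+1..r] (inclusive bounds). */
static void merge(int array[], int l, int m, int r)
{
    int tmp[64];            /* assumption: at most 64 elements in this demo */
    int i = l, j = m + 1, k = 0;

    while (i <= m && j <= r)
        tmp[k++] = (array[i] <= array[j]) ? array[i++] : array[j++];
    while (i <= m) tmp[k++] = array[i++];   /* leftovers from the left run */
    while (j <= r) tmp[k++] = array[j++];   /* leftovers from the right run */

    memcpy(&array[l], tmp, (size_t)k * sizeof tmp[0]);
    merge_count++;
}

/* Same shape as the question's mergeSort, plus a depth argument. */
static void traced_merge_sort(int array[], int l, int r, int depth)
{
    if (depth > max_depth)
        max_depth = depth;

    if (l < r) {                                       /* exit condition fails */
        int m = l + (r - l) / 2;
        traced_merge_sort(array, l, m, depth + 1);     /* go down the left branch */
        traced_merge_sort(array, m + 1, r, depth + 1); /* go down the right branch */
        merge(array, l, m, r);                         /* work done on the way up */
    }
}
```

For an 8-element array, calling traced_merge_sort(a, 0, 7, 0) goes three levels down before the first merge and performs seven merges on the way back up, one for each internal node of the tree.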

Recursive functions always have three elements in common (usually in this order):

The exit condition,
The recursive call, and
The work that is done in this recursion.

This is a better pseudocode representation from Wikipedia. See if you can spot the three elements:

function merge_sort(list m)
    // Base case. A list of zero or one elements is sorted, by definition.
    if length of m ≤ 1 then
        return m

    // Recursive case. First, divide the list into two sublists of
    // (nearly) equal size, holding the odd- and even-indexed elements.
    var left := empty list
    var right := empty list
    for each x with index i in m do
        if i is odd then
            add x to left
        else
            add x to right

    // Recursively sort both sublists.
    left := merge_sort(left)
    right := merge_sort(right)

    // Then merge the now-sorted sublists.
    return merge(left, right)

The merge function merges the lists together while returning up the call tree:

function merge(left, right)
    var result := empty list

    while left is not empty and right is not empty do
        if first(left) ≤ first(right) then
            append first(left) to result
            left := rest(left)
        else
            append first(right) to result
            right := rest(right)

    // Either left or right may have elements left; consume them.
    // (Only one of the following loops will actually be entered.)
    while left is not empty do
        append first(left) to result
        left := rest(left)
    while right is not empty do
        append first(right) to result
        right := rest(right)
    return result
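For readers who prefer C to pseudocode, the merge step above could be sketched roughly like this. It is an illustrative translation, not Wikipedia's code: the name merge_lists and the convention of writing into a caller-supplied out buffer are assumptions made for this example.

```c
#include <stddef.h>

/*
 * Merge two already-sorted arrays into out[].
 * Assumption: out has room for nl + nr elements and does not overlap the inputs.
 */
static void merge_lists(const int left[], size_t nl,
                        const int right[], size_t nr,
                        int out[])
{
    size_t i = 0, j = 0, k = 0;

    /* Take the smaller front element until one input runs out.
       Using <= keeps equal elements in their original order. */
    while (i < nl && j < nr) {
        if (left[i] <= right[j])
            out[k++] = left[i++];
        else
            out[k++] = right[j++];
    }

    /* Only one of these loops will actually copy anything. */
    while (i < nl) out[k++] = left[i++];
    while (j < nr) out[k++] = right[j++];
}
```

Merging {1, 4, 9} with {2, 3, 10, 11} this way fills out with 1, 2, 3, 4, 9, 10, 11.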

Penjee's blog has some excellent animations that help you visualize this process, including one that actually draws a tree like the one I mentioned.

Related Solutions

Recursion – Is it ‘Divide and Conquer’ or ‘Code Reuse’

What are the "problem patterns" that call for the solution of recursion?

I wouldn't say there's such a thing like a problem pattern for the use of recursion. Every function that can be implemented with recursion can also be implemented iteratively, often by pushing and popping a stack.
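As a rough illustration of the "pushing and popping a stack" idea, here is a sketch in C of a quicksort that keeps its pending sub-ranges on an explicit stack instead of making recursive calls. The function names, the choice of the last element as pivot, and the heap-allocated stack are all assumptions made for this example.

```c
#include <stdlib.h>

/* Partition a[lo..hi] around the last element; return the pivot's final index. */
static int partition(int a[], int lo, int hi)
{
    int pivot = a[hi];
    int store = lo;

    for (int i = lo; i < hi; i++) {
        if (a[i] < pivot) {
            int t = a[i]; a[i] = a[store]; a[store] = t;
            store++;
        }
    }
    int t = a[hi]; a[hi] = a[store]; a[store] = t;
    return store;
}

/* Sort a[0..n-1] without recursion. Returns 0 on success, -1 if malloc fails. */
static int quick_sort_iterative(int a[], int n)
{
    if (n < 2)
        return 0;

    /* Pending ranges are disjoint and hold at least two elements each,
       so n (lo, hi) pairs is more than enough room. */
    int *stack = malloc(2 * (size_t)n * sizeof *stack);
    if (stack == NULL)
        return -1;

    int top = 0;
    stack[top++] = 0;          /* push the whole range */
    stack[top++] = n - 1;

    while (top > 0) {
        int hi = stack[--top]; /* pop in reverse order of the pushes */
        int lo = stack[--top];
        int p = partition(a, lo, hi);

        /* Where the recursive version would call itself, push instead. */
        if (p - 1 > lo) { stack[top++] = lo;    stack[top++] = p - 1; }
        if (p + 1 < hi) { stack[top++] = p + 1; stack[top++] = hi;    }
    }

    free(stack);
    return 0;
}
```

The loop body does exactly what one recursive call would do; the stack merely remembers the work that the call stack would otherwise have remembered.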

It's a matter of expression and also of performance. Iterative algorithms often perform better and are easier to optimize. However, recursive algorithms benefit from clearer expression and are thus often easier to read, understand and implement.

Some things cannot even be expressed without recursion, infinite lists for example. The so-called functional languages rely heavily on recursion, as it's their natural way of expression. The saying is: "Recursive programming is functional programming done right".

Is recursion a form of "divide & conquer" strategy or a form of "code reuse", or is it a design pattern in its own right?

I would not call it a design pattern. It's a matter of expression. Sometimes a recursive expression is simply more powerful and more expressive and thus leads to better and cleaner code.

Can you give us an example of a real-world problem where recursion comes to mind as an immediate solution?

Anything that needs to traverse trees will be properly expressed by a recursive algorithm.
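A small sketch in C of what that looks like for a binary tree: an in-order traversal that copies the keys into an array. The struct node layout and the in_order signature are invented for this example.

```c
#include <stddef.h>

/* Hypothetical binary tree node used only for this illustration. */
struct node {
    int key;
    struct node *left;
    struct node *right;
};

/*
 * Append the keys of the subtree rooted at n to out[] in in-order,
 * starting at position count. Returns the new number of keys in out[].
 * Assumption: out[] is large enough to hold every key in the tree.
 */
static size_t in_order(const struct node *n, int out[], size_t count)
{
    if (n == NULL)                          /* exit condition: empty subtree */
        return count;

    count = in_order(n->left, out, count);  /* everything smaller first */
    out[count++] = n->key;                  /* then this node */
    return in_order(n->right, out, count);  /* then everything larger */
}
```

The function is three lines long because a tree is itself defined recursively: a node plus two smaller trees. For a binary search tree, the keys come out in sorted order.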

Merge sort versus quick sort performance

If you look at your code for swapping, you have:

// If current element is lower than pivot
// then swap it with the element at store_index
// and move the store_index to the right.

But about 50% of the time, the string you just swapped needs to be moved back, which is why faster quicksorts work from both ends at the same time.
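Partitioning that works from both ends at the same time is commonly known as the Hoare partition scheme. The following is a minimal sketch of it in C on plain ints rather than strings; the function names and the choice of the middle element as pivot are assumptions for this example, not the asker's code.

```c
/*
 * Hoare-style partition of a[lo..hi]: two indices walk toward each other
 * and swap only pairs that are both on the wrong side of the pivot.
 * Returns j such that every element of a[lo..j] is <= every element
 * of a[j+1..hi].
 */
static int hoare_partition(int a[], int lo, int hi)
{
    int pivot = a[lo + (hi - lo) / 2];
    int i = lo - 1;
    int j = hi + 1;

    for (;;) {
        do { i++; } while (a[i] < pivot);   /* scan from the left end */
        do { j--; } while (a[j] > pivot);   /* scan from the right end */
        if (i >= j)
            return j;
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}

static void quick_sort_hoare(int a[], int lo, int hi)
{
    if (lo < hi) {
        int p = hoare_partition(a, lo, hi);
        quick_sort_hoare(a, lo, p);         /* note: p, not p - 1 */
        quick_sort_hoare(a, p + 1, hi);
    }
}
```

Because an element is only swapped when it is paired with another misplaced element from the opposite end, nothing gets moved and then moved back.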

Next, if you check that a sub-range holds more than one element before making each recursive call, you avoid wasting time calling a function only to quickly exit it. This happens 10000000 times in your final test, which does add a noticeable amount of time.

Use:

if (pivot_index -1 > start) quick_sort(lines, start, pivot_index - 1);

if (pivot_index + 1 < end) quick_sort(lines, pivot_index + 1, end);

You still want an outer function to do an initial if (start < end), but that only needs to happen once, so that function can just call an unsafe version of your code without that outer comparison.
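Put together, the guarded calls and the outer wrapper might look roughly like the sketch below. The partition shown is a guess at the asker's code, reconstructed from the quoted comments about store_index, with the last element assumed as pivot; quick_sort_unsafe and swap_lines are names invented here.

```c
#include <string.h>

static void swap_lines(const char **x, const char **y)
{
    const char *t = *x; *x = *y; *y = t;
}

/* Assumed partition: pivot is the last element, smaller strings move left. */
static int partition(const char *lines[], int start, int end)
{
    const char *pivot = lines[end];
    int store_index = start;

    for (int i = start; i < end; i++) {
        if (strcmp(lines[i], pivot) < 0) {
            swap_lines(&lines[i], &lines[store_index]);
            store_index++;
        }
    }
    swap_lines(&lines[store_index], &lines[end]);
    return store_index;
}

/* "Unsafe" version: the caller must guarantee start < end. */
static void quick_sort_unsafe(const char *lines[], int start, int end)
{
    int pivot_index = partition(lines, start, end);

    /* Recurse only into sub-ranges that hold at least two elements. */
    if (pivot_index - 1 > start) quick_sort_unsafe(lines, start, pivot_index - 1);
    if (pivot_index + 1 < end)   quick_sort_unsafe(lines, pivot_index + 1, end);
}

/* Outer function: does the range check exactly once. */
static void quick_sort(const char *lines[], int start, int end)
{
    if (start < end)
        quick_sort_unsafe(lines, start, end);
}
```

Callers only ever use quick_sort; the unsafe version stays internal so its precondition cannot be violated from outside.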

Also, picking a random pivot tends to avoid N^2 worst case results, but it's probably not a big deal with your random data set.
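A minimal sketch of random pivot selection in C, shown on ints for brevity. The function names are invented for this example, and rand() with a modulo is used only for simplicity; it is slightly biased and would normally be seeded with srand() first.

```c
#include <stdlib.h>

static void swap_int(int *x, int *y)
{
    int t = *x; *x = *y; *y = t;
}

/* Pick a random element of a[start..end] as pivot, then partition as usual. */
static int random_pivot_partition(int a[], int start, int end)
{
    int pivot_index = start + rand() % (end - start + 1);
    swap_int(&a[pivot_index], &a[end]);     /* park the pivot at the end */

    int pivot = a[end];
    int store = start;
    for (int i = start; i < end; i++) {
        if (a[i] < pivot) {
            swap_int(&a[i], &a[store]);
            store++;
        }
    }
    swap_int(&a[store], &a[end]);           /* move the pivot into place */
    return store;
}

static void quick_sort_random(int a[], int start, int end)
{
    if (start < end) {
        int p = random_pivot_partition(a, start, end);
        quick_sort_random(a, start, p - 1);
        quick_sort_random(a, p + 1, end);
    }
}
```

With a fixed pivot position, already sorted or reverse-sorted input is the classic way to hit the N^2 case; a random pivot makes that outcome depend on bad luck rather than on the input order.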

Finally, the hidden problem is that quicksort compares strings in ever smaller buckets that are ever closer together,

(Edit: So, AAAAA, AAAAB, AAAAC, AAAAD, then AAAAA, AAAAB. So strcmp needs to step through a lot of A's before looking at the useful parts of the strings.)

but with merge sort you look at the smallest buckets first, while they are still very random. Merge sort's final passes do compare a lot of strings that are close to each other, but it's less of an issue by then. One way to make quicksort faster for strings is to compare the first characters of the outer strings and, if they're the same, ignore them when doing the inner comparisons, but you have to be careful that all strings have enough characters that you're not skipping past the null terminator.
