Merge Sort Recursion – Understanding Its Use

algorithmsrecursion

I dont want to put too much code so I'll just put the code that involves recursion. This algorithm is fairly well known so I think everybody knows the basic code.

void mergeSort(int array[], int l, int r) 
{
    if (l < r) 
    {
        int m = l + (r - l) / 2;
        mergeSort(array,l,m);
        mergeSort(array,m+1,r);

        merge(array,l,m,r);
    }
}

where the merge function is the typical function which compares two sets and organises them.

Now, I only understand the recursion here by "brute force". With this I mean that I have to actually compute each step along the way to see that it works. But I'm completely 100% sure it would never had occurred to me to code things up this way.

I was hoping that someone could provide me with insight as to how I should think about these two recursion calls, and how its obvious it had to be this way. I know this might seem broad, but I was hoping anyone could provide me with insight so that I shouldn't have to use the "brute force" approach I mentioned above. Perhaps you could tell me how you think about these recursive calls (i dont think you compute each step like me since that is extremely tedious).

Best Answer

The recursive MergeSort is a "divide and conquer" algorithm. The code you provided in your example basically does this (in plain English):

  1. Find the middle point
  2. Sort the left half,
  3. Sort the right half.
  4. Merge the two halves back together.

The key to understanding how this works recursively is that, when it goes to sort the left half, it divides that into two parts again, just like it did with the first two halves, and so on. Visually, you can think of it as a tree-like structure, with the root at the top and progressively smaller branches extending downward.

This process of dividing continues until the code encounters an exit condition. Once that exit condition is encountered, the code begins returning back up the branches of the tree, merging the branches together as each recursive call returns up the tree.

Recursive functions always have three elements in common (usually in this order):

  1. The exit condition,
  2. The recursive call, and
  3. The work that is done in this recursion.

This is a better pseudocode representation from Wikipedia. See if you can spot the three elements:

function merge_sort(list m)
    // Base case. A list of zero or one elements is sorted, by definition.
    if length of m ≤ 1 then
        return m

    // Recursive case. First, divide the list into equal-sized sublists
    // consisting of the even and odd-indexed elements.
    var left := empty list
    var right := empty list
    for each x with index i in m do
        if i is odd then
            add x to left
        else
            add x to right

    // Recursively sort both sublists.
    left := merge_sort(left)
    right := merge_sort(right)

    // Then merge the now-sorted sublists.
    return merge(left, right)

The merge function merges the lists together while returning up the call tree:

function merge(left, right)
    var result := empty list

    while left is not empty and right is not empty do
        if first(left) ≤ first(right) then
            append first(left) to result
            left := rest(left)
        else
            append first(right) to result
            right := rest(right)

    // Either left or right may have elements left; consume them.
    // (Only one of the following loops will actually be entered.)
    while left is not empty do
        append first(left) to result
        left := rest(left)
    while right is not empty do
        append first(right) to result
        right := rest(right)
    return result

Penjee's blog has some excellent animations that help you visualize this process. Here is one that actually draws a tree like the one I mentioned:

enter image description here

Related Topic