Merge sort and O(n log n) thestery

algorithmsarraybig osorting

I read every explanation here but still not convinced. I think mergesort is n * n and I know I am wrong but not sure where. Here is what I think:

Suppose we are sorting 8 elements and this is the algorithm (assuming I have the right idea for it):

doSort(int begin, int end, int[] arr) {
    if(end != begin) {
        int mid = begin + (end - begin)/2;
        doSort(begin,mid);
        doSort(mid + 1, end);
        merge(arr, begin, mid, mid + 1, end);
    }  
}
merge(int[] arr,int i_begin, i_end, j_begin, j_end) {
// Do the comparison and all that
}

Now as per my understanding merge() itself has complexity O(n). Now let's see how many times doSort() and hence merge() is called. doSort() is called for the following indices:

  1. 0-7
  2. 0-3
  3. 4-7
  4. 0-1
  5. 2-3
  6. 4-5
  7. 6-7

Which is 7 times which is O(n) given we are sorting 8 elements. Similarly for 16 elements merge() is called 15 times and so on. So although we divide the array into half each time, we are not eliminating the other half by any means. Compare this to a BST search where I eliminate one half of the tree at each step because I know the other half is useless to me, which according to me is true log(n). And in case of merge sort we are calling merge n-1 times and each time merge needs o(n) operations, so O(n*n).

Where have I gone wrong? Any suggestion would be appreciated.

Best Answer

Sorting say 1,024 numbers, you perform one merge of two 512 element arrays. You perform 2 merges of two 256 element arrays, 4 merges of two 128 element arrays, 8 tims 64 elements, ..., 512 merges of two 1 element arrays.

You just checked the time for the single largest merge (1 x 512 elements), and the total number of merges (1,023 merges), but you didn't realise that half the merges were just 1 element arrays, half the remaining merges were just 2 element arrays, and so on. Each set of merges takes n = 1024 steps (one merge with 1024 element result, 2 merges with 512 element results, ..., 512 merges with 2 element results), and there are log2 (n) = 10 sets of merges, for a total of 10 x 1024 or (n log2 n) steps.

Related Topic