Java Sorting – Insertion Sort vs Insert at Sorted Position for Mostly Sorted Arrays

datadata structuresjavasorting

I'm trying to figure out the most efficient way of maintaining a sorted ArrayList. I'm inserting items into the array one at a time. I'm deciding between two methods: inserting the item at the sorted position or inserting the item at the end of the end of the list and using insertion sort to sort the data because of its superior performance on mostly sorted arrays. I would think that inserting the item at the sorted position is more efficient, but I may be wrong.

Which method would be more efficient for this particular situation?

Best Answer

When inserting an item in the middle of the sorted array, you first need to find the correct location (e.g. via binary search), and you need to shift all the elements on the right of the insertion position one place to the right in order to make space.

  • Comparisons: O(log n) per insertion, O(n log n) for all elements
  • Movements: O(n) per insertion, O(n²) for all elements

When inserting an item at the end and then using an insertion sort pass, you do not need to know the insertion position up front. However, the insertion sort pass will compare all larger elements, and move all larger elements one step to the right.

  • Comparisons: O(n) per insertion, O(n²) for all elements
  • Movements: O(n) per insertion, O(n²) for all elements

Insertion pseudocode:

elements.add(newElement);  // at the end

// Move the new element forward in the array
// until it is at the correct position.
// This loop has O(n) average time complexity.
//
// Example ordering for ints:
//    boolean isOrdered(int a, int b) {
//      return a <= b;
//    }
int i = elements.size() - 1;
while (i > 0 && !isOrdered(elements[i - 1], elements[i])) {
  swap(elements, i - 1, i);
  i--;
}

This means using insertion sort instead of inserting the element at a position found through binary search involves the same number of movements, but requires far more comparisons. So inserting the new element directly at the sorted position is more efficient if the necessary comparisons are expensive, but both algorithms are essentially O(n²) (i.e. quadratic) over the course of all insertions.

If you want to be more efficient, you have to think about how you are using the data.

  • If you don't need the items to be always sorted: First insert all elements, then sort them with an O(n log n) sorting algorithm.

  • If you need to access the sorted items before all elements are available:

    • If you only need access to the maximal or minimal element, use a Heap.
    • If you need access to all current items in their order, use a self-balancing binary search tree, e.g. as implemented in Java's TreeMap.

Inserting elements into an ArrayList might still be the correct option, e.g. if the number of elements is very small.

Related Topic