Algorithms – Generating Sort Keys When Reordering Items

algorithms

We have a number of items which the end user will be able to organize into a desired order. The set of items is unordered, but each item contains a sort key which can be modified.

We're looking for an algorithm that would allow generating a new sort key for an item that is added or moved to be either the first item, last item, or between any two items. We're hoping to only have to modify the sort key of the item being moved.

An example algorithm would be to have each sort key be a floating point number, and when placing an item between two items, set the sort key to be their average. Placing an item first or last would take the outermost value +- 1.

The problem here is that floating point precision could cause the sorting to fail. Using two integers to represent a fractional number could similarly have the numbers become so large that they couldn't be accurately represented in regular numerical types (e.g. when transferring as JSON). We wouldn't want to use BigInts.

Is there a suitable algorithm for this that would work, for example, using strings, which wouldn't be affected by these shortcomings?

We're not looking to support huge numbers of moves, but the algorithm described above could fail on a double-precision floating points number after about 50 moves.

Best Answer

As a summary of all comments and answers:

TL;DR - Using double-precision floating point numbers with the initially proposed algorithm should be sufficient for most practical (at least manually-ordered) needs. Maintaining a separate ordered list of the elements should be considered as well. Other sort key solutions are somewhat cumbersome.

The two problematic operations are inserting items at the beginning / end over and over again, and repeatedly inserting or moving items to the same spot (e.g. with three elements repeatedly moving the third element between the first two, or repeatedly adding new elements as the second element).

From a theoretical point of view (i.e. allowing infinite reordering), the only solution I can think of is using two unlimited size integers as a fractional a/b. This allows infinite precision for the mid-inserts, but the numbers can become increasingly large.

Strings may be able to support a large number of updates (though I'm still having trouble figuring out the algorithm for both operations), but not infinite, because you can not add infinitely many at the first position (at least using regular string sort comparison).

Integers would require choosing an initial spacing for the sort keys, which limits how many mid-inserts you can perform. If you initially space sort keys 1024 apart, you can only perform 10 worst-case mid-inserts before you have adjacent numbers. Choosing a larger initial spacing limits how many first / last inserts you can perform. Using a 64-bit integer, you are limited to ~63 operations either way, which you need to split up between mid-inserts and first/last inserts a priori.

Using floating point values removes the need for selecting spacing a priori. The algorithm is simple:

  1. The first element inserted has a sort key 0.0
  2. An element inserted (or moved) first or last has the sort key of the first element - 1.0 or last element + 1.0, respectively.
  3. An element inserted (or moved) between two element has a sort key equal to the average of the two.

Using a double-precision float allows 52 worst-case mid-inserts and effectively infinite (about 1e15) first/last inserts. In practice when moving items around the algorithm should self-correct itself, as every time you move an item first or last it expands the range that can be used.

Double-precision floats also have the benefit that they are supported by all platforms and easily stored and transported by practically all transport formats and libraries. This was what we ended up using.

Related Topic