Detect Order Changes in a List

algorithms

Let's say you have a list of values that are:

The order on the list might be changed.

How would you detect this change? We needed something like this when we were building a revision history page for our online form building tool. We found a quick and dirty solution, but I am looking for a more elegant solution for this. Here is what the revision descriptions look like:

jotform revisions page

Here is the tough part: Elements can be removed/added from/to anywhere on the list. So, the algorithm should be able to ignore those changes.

Best Answer

There are two approaches:

Try to find a way to apply the diff algorithm to your model. This gets easier when there is a underlying textual representation (e.g. HTML source or another DSL).
Don't have your form builder emit complete states, only have it emit differences which you can easily log. Instead of having the before and after lists, you'd log the “move field from x to y” operation.

Related Solutions

C++ – How to extract the active code path from a complex algorithm

I think it depends on what you want to achieve... Do you want to improve the code? parallelize the code? clean it? just understand it?

Besides the great comment given by @Calphool, what I've done in similar cases (but not with 100K lines code to be honest) is this:

Look for whoever wrote the code. Or has use it. Asked them what I needed to know, that saves you a lot of time. This may seem stupid, but is not.
I made a graph of the dependencies. Take a look at this for an example.
Depending on what you need to do, you could measure the execution time of some (or all) function.
Start playing with it... but with modern tools, like git. If possible, started to add some tests.

If you want to see what functions get called, you could just print the functions that are called (take a look at this question). You could add a printf at each function using a script, but I don't think that is a good idea. Also, you have to think how are you going to go through the generated output.

After you know what you want, and before making my implementations, I try to isolate the part I need to work on. Meaning, I cleanup the code a little, put in a different file if needed, compile it and test it. Only then I proceed to actually modify the code adding functionality or whatever needs to be done. This may also include port the code to use modern building tools if needed.

My two cents.

Efficient algorithm to merge n successive sorted arrays in place

If you repeatedly merge the 2 first sequences you will get a runtime that is much worse than merge sort as you compare the first elements far too many times (n log(n)^2???) if you merge the 2 (or more) smallest (adjacent) sequences each time you should approach merge sorts efficiency.

Finding the smallest adjacent sequences might be done by building a tree with roughly half the sequence in each branch recursively and then merging the lowest branches first.

---- Edit

First version:
Essential a mergesort where the separators are the implicit partition of the mergesort algorithm.

Order separators according to their index in the array

MergeIt(first, last) {
    if (only one or zero separator)
       return first;

    split = separator containing the middle separator (first, last)

    return inplace_merge(MergeIt(first, split), MergeIt(split, end));
}

This ensures that you only do the minimum number of merges, but not the minimum number of compares as larger sequences might be merge with the current minimum sequence.

Version two:
Still basically a mergesort where we now take into consideration the length of the sequences.

Order separators according to their index in the array

MergeIt(first, last) {
    if (only one or zero separator)
       return first;

    split = separator containing the middle element of array(first, last) // not the middle separator

    return inplace_merge(MergeIt(first, split), MergeIt(split, end));
}

The split insures that the smaller sequences merge first as the larger stop higher up in the call tree. This still does not insure the least number of compares as there still is the chance that larger sequences are merge with the current minimum length, though this is still better than version one as the large sequences here are smaller.

Version three:
Merge the adjacent sequences which span the least number of elements

Make heap of pairs of adjacent sequences, sort after minimum length of the pairs, for each sequence only add the shortest of its prev and next.
// each sequence will appear max twice except the first and last.

while (heap.size() > 1) {
    min = heap.pop

    remove the possible other occurrence of min.first and min.second

    sequence = inplace_merge(min.first, min.second)

    insert the minimum of pair(prev(sequence), sequence) and pair(sequence, next(sequence)) in heap.
}

The overhead might make this slower than the second version but the compares in made by inplace_merge should now be the minimum.

Best Answer

Related Solutions

C++ – How to extract the active code path from a complex algorithm

Efficient algorithm to merge n successive sorted arrays in place

Related Topic