Java Design Patterns – How to Handle Sorting of Complex Objects

design-patterns java sorting

How would one sort a list of objects that have more than one sortable element?

Suppose you have a simple class Car, defined as follows:

class Car {
    public String make;
    public String model;
    public int year;
    public String color;
    // ... No methods, just a structure / container
}

I designed a simple framework that would allow for multiple SortOptions to be provided to a Sorter that would then sort the list.

interface ISorter<T> {
    List<T> sort(List<T> items);
    void addSortOption(ISortOption<T> option);
    ISortOption<T>[] getSortOptions();
    void setSortOption(ISortOption<T> option);
}

interface ISortOption<T> {
    String getLabel();
    int compare(T t1, T t2);
}

Example use

class SimpleStringSorter extends MergeSorter<String> {
    {
        addSortOption(new AlphaSorter());
    }

    private static final class AlphaSorter implements ISortOption<String> {
        // ... implementation of alpha compare and get label
    }
}

The issue with this solution is that it is not easily extensible. If Car were ever to receive a new field, say currentOwner, I would need to add the field, track down the sorter class file, implement a new sort option class, and then recompile the application for redistribution.

Is there an easier, more extensible and practical way to sort data like this?

Best Answer

You can use a Comparator, an interface with a single method compare(a, b) that you implement yourself.

You then pass it to the sort routine, which uses it for the comparison step (the standard libraries of most languages support this).

For example, in Java you can call:

Collections.sort(fooList, new Comparator<Car>(){
    public int compare(Car a,Car b){
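        // Car exposes public fields and has no getters, so compare the fields directly.
        return a.model.compareTo(b.model);
        // Or compare any other field: return a negative number, zero, or a
        // positive number for less than, equal to, and greater than, respectively.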
    }
});

This sorts the list according to the custom comparator.

Java 8 adds lambda syntax that lets you create the Comparator in a single line.

This means there is only one sorting algorithm to maintain, plus a set of comparators that can live in the same class as the type they compare (or near the place where the comparison takes place).
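As a rough sketch of the Java 8 style, assuming the Car structure from the question: the constructor and the constant names BY_MODEL and BY_YEAR are additions made for this example and are not part of the original code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class CarSortExample {
    // Same fields as the Car in the question. The constructor is an
    // assumption, added only to keep the example short.
    static class Car {
        public String make;
        public String model;
        public int year;
        public String color;

        Car(String make, String model, int year, String color) {
            this.make = make;
            this.model = model;
            this.year = year;
            this.color = color;
        }
    }

    // One comparator per sortable field, kept next to the type it compares.
    static final Comparator<Car> BY_MODEL = Comparator.comparing((Car car) -> car.model);
    static final Comparator<Car> BY_YEAR = Comparator.comparingInt((Car car) -> car.year);

    public static void main(String[] args) {
        List<Car> cars = new ArrayList<>();
        cars.add(new Car("Toyota", "Corolla", 2015, "red"));
        cars.add(new Car("Honda", "Accord", 2018, "blue"));
        cars.add(new Car("Ford", "Focus", 2012, "white"));

        // The library sort is reused unchanged; only the comparator varies.
        cars.sort(BY_MODEL);
        for (Car car : cars) {
            System.out.println(car.model + " " + car.year);
        }
    }
}
```

With this layout, adding a field such as currentOwner would only require one more comparator constant next to the others; the sorting code itself stays untouched.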

This also allows for a "tiered" sort, you can implement something like:

public static <T> Comparator<T> createTieredComparator(final Comparator<? super T> comp1, final Comparator<? super T> comp2){
    return new Comparator<T>(){
        public int compare(T a,T b){
            int res = comp1.compare(a,b);
            if(res!=0)
                return res;
            else
                return comp2.compare(a,b);
        }
    };
}

This will prefer the comparison made by comp1 and only return the result of comp2 when they would be considered equal according to comp1.
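On Java 8 and later, the same tiered behaviour is available through Comparator.thenComparing, so a hand-written helper may not be needed. The following is a minimal sketch, assuming a cut-down Car with a convenience constructor; the constant name BY_MAKE_MODEL_YEAR is made up for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TieredSortExample {
    // Reduced version of the Car from the question; the constructor is an assumption.
    static class Car {
        public String make;
        public String model;
        public int year;

        Car(String make, String model, int year) {
            this.make = make;
            this.model = model;
            this.year = year;
        }
    }

    // Compare by make first; model and then year are consulted only on ties.
    static final Comparator<Car> BY_MAKE_MODEL_YEAR =
            Comparator.comparing((Car car) -> car.make)
                    .thenComparing(car -> car.model)
                    .thenComparingInt(car -> car.year);

    public static void main(String[] args) {
        List<Car> cars = new ArrayList<>();
        cars.add(new Car("Honda", "Civic", 2016));
        cars.add(new Car("Ford", "Focus", 2012));
        cars.add(new Car("Honda", "Accord", 2018));
        cars.add(new Car("Honda", "Accord", 2010));

        cars.sort(BY_MAKE_MODEL_YEAR);
        for (Car car : cars) {
            System.out.println(car.make + " " + car.model + " " + car.year);
        }
    }
}
```

Each thenComparing call plays the role of comp2 in the helper above: it is evaluated only when every earlier tier returned 0.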

Related Solutions

PHP – Best Methods for Function Parameter Validation

The general solutions to this problem are type safety (so that values are valid by construction) and encapsulation (so that values cannot be invalidated after construction). If your inputs and outputs have meaningful types, then the constructors of those types can enforce the properties you want. If validation is centralised, you don’t have to repeat it.

Let’s talk in pseudocode for a moment. As a contrived example, consider a function area(w, h) that computes the area of a rectangle. If you type the function as:

int area(int w, int h)

Then there is no guarantee that any of the invariants hold:

w and h are lengths
Being lengths, they must be non-negative
The result is an area

To enforce input constraints, you can always add validation to the function body:

int area(int w, int h) {
    assert(w >= 0);
    assert(h >= 0);
    return w * h;
}

Not only is this cumbersome, but it remains the responsibility of the caller to validate the result. If we use types that represent our units:

Area area(Length w, Length h) {
    return w * h;
}

Then nobody can give us a Volume when we expected a Length, and since a length cannot be negative, we don’t need to check for that.

PHP doesn’t enforce types statically, but you can prevent the construction of invalid objects by throwing exceptions from constructors that receive invalid inputs, and using immutable objects or accessors to prevent later invalidation.
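The same idea can be sketched in Java, the language used elsewhere on this page. This is only an illustration under assumptions: the Length and Area classes, their int representation, and the times method are invented here and do not come from any library.

```java
public class UnitsExample {
    // A non-negative length. Invalid values cannot be constructed, and the
    // field is final, so a Length cannot be invalidated later.
    static final class Length {
        private final int value;

        Length(int value) {
            if (value < 0) {
                throw new IllegalArgumentException("length must be non-negative: " + value);
            }
            this.value = value;
        }

        int value() {
            return value;
        }

        // Multiplying two lengths yields an Area, not a plain int.
        Area times(Length other) {
            return new Area(Math.multiplyExact(value, other.value));
        }
    }

    // A non-negative area, validated once in its constructor.
    static final class Area {
        private final int value;

        Area(int value) {
            if (value < 0) {
                throw new IllegalArgumentException("area must be non-negative: " + value);
            }
            this.value = value;
        }

        int value() {
            return value;
        }
    }

    // No checks are needed here: the parameter types already guarantee validity.
    static Area area(Length w, Length h) {
        return w.times(h);
    }

    public static void main(String[] args) {
        System.out.println(area(new Length(3), new Length(4)).value());
    }
}
```

The validation lives in one place per type, so every function that accepts a Length can rely on it without repeating the check.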

Java MVC – How to Handle Basic Objects with Interfaces

Having over-engineered one or two systems in my career, I will try to answer this from a pragmatic point of view.

My first approach would be to establish the convention that data must not be modified in the presentation layer. If you work alone, you know your own conventions anyway. In a small team you should be able to communicate them easily. Yes, this does not prevent violations at compile time, but unintentional violations should be rare and easy to spot (by checking the call hierarchy of a certain setter, for example).

If you absolutely need or want to enforce this rule in code, putting each controller and its respective data objects into a single package is the next best thing. Also, I would not use protected, but default/package scope (i.e. omit the visibility keyword). Of the three options you presented, this is the most lightweight (and therefore the most maintainable and scalable) approach. You shouldn't be too worried about "...making it too difficult to expand the program."; you merely have to live with potentially a lot of classes in one package. Maybe you can mitigate this by cleverly grouping certain controllers with some objects into their own package.

If this approach also fails, immutable objects are what you want. You could implement them plainly (constructor and copy constructor, no setters), with factories/factory methods, or even with the Builder Pattern. Just keep in mind that the overhead (i.e. boilerplate code) increases significantly with this solution, and you should ask yourself whether the gain (less time spent hunting and fixing bugs) is really greater than the cost (writing, testing and maintaining boilerplate code). In fact, I would only recommend this solution if you can guarantee that the objects never need to be modified, so that you can ditch the copy code. This does not seem to be the case for your example, though.
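For reference, the plain variant (constructor and copy constructor, no setters) might look roughly like the sketch below. The Customer class and its fields are hypothetical, chosen only for illustration; the withEmail method is the kind of copy code mentioned above.

```java
public class ImmutableExample {
    // Hypothetical data object: all fields are final and there are no setters.
    static final class Customer {
        private final String name;
        private final String email;

        Customer(String name, String email) {
            this.name = name;
            this.email = email;
        }

        // Copy constructor.
        Customer(Customer other) {
            this(other.name, other.email);
        }

        String getName() {
            return name;
        }

        String getEmail() {
            return email;
        }

        // A "modification" returns a new object and leaves the original untouched.
        Customer withEmail(String newEmail) {
            return new Customer(name, newEmail);
        }
    }

    public static void main(String[] args) {
        Customer original = new Customer("Ada", "ada@example.com");
        Customer updated = original.withEmail("ada@example.org");
        System.out.println(original.getEmail() + " -> " + updated.getEmail());
    }
}
```

Every additional field needs a constructor parameter, a getter, and possibly another with-method, which is the boilerplate cost weighed above.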

Introducing a special getter-interface for each(!) data-class means more overhead than protected/default setters and is only half as strict, as you mentioned yourself: One could always downcast to the actual data-class. To improve this, you could give your data-classes protected or package visibility and put them in the same package as the controllers, but then you are back at square one, with additional interfaces and complexity.

Keeping the controllers and the data-model separated is a good idea. For most applications the model is much more likely to change than the controller logic, and you want to keep that changing part contained, since every change potentially introduces a bug and means additional adaptations in the test and client code. By moving and/or wrapping the getters directly into the controller (with or without special interfaces) you also increase the scope of the change: you would suddenly need to maintain the test and client code of the controller, instead of just the model, whenever you add or change one field. Another, simpler problem arises if you need a collection of non-primitive data in your model.

Any software design decision boils down to the question of which solution lets you, your colleagues and users work most efficiently. These three groups will likely have contradictory needs and so it is hard to find the absolutely best solution and trade-offs are almost always a part of it.
