R – Critique this late night, noob Haskell code

functional programminghaskellsorting

I'm working through Real World Haskell, and at the moment doing the exercises at the end of Chapter 3.

I'm taking an unusual approach: even though I know there are some language features they haven't covered yet that would help me, I am trying to do these exercises using only things they have explicitly covered. Why? Just kind of for fun really. It feels like it is forcing me to give my brain some extra practice with recursion.

So I just completed the exercise posed as follows: "Create a function that sorts a list of lists based on the length of each sublist. (You may want to look at the sortBy function from the Data.List module.)"

Now, they threw in the hint about the Data.List module. But they didn't say a word about where reference doc can be found, about how to import stuff, etc. So I decided to roll my own sort just to see if I could do it. I used Bubble Sort since it's the simplest algorithm.

The result is below. I'd like to get you Haskell gurus to critique it please … but with the following caveat in mind: If you suggest improvements, please base them on language features covered through chapter 3 of Real World Haskell (or what you guess those features might be without going to the trouble of looking it up). I know there are tons of awesome language features waiting for me that will let me make this code better, but right now the specific challenge was to do it with the "primitive" features covered so far.

I'm sure there are cases where I'm reaching around my shoulder to scratch my elbow, cases where I'm using explicit control flow when recursion and pattern matching could do more for me, etc. I'm sure the code could be made much shorter and more readable too. I bet there are good idioms I don't know about that can be used with the primitive language features I'm limiting myself to. Those are the kinds of tips I'd love to receive.

This is probably the ugliest code I'm proud of in any language (at least, that I can remember). My first stab, in a functional language, at something beyond "Hello, world" type stuff. And now you are going to beat the crap out of it 🙂 . Be gentle, please, but I'm looking forward to some meaty insight. Thanks.

areListsEqual :: (Eq a) => [a] -> [a] -> Bool

areListsEqual [] [] = True
areListsEqual [] _  = False
areListsEqual _ []  = False

areListsEqual xs ys = (head xs == head ys)  && (areListsEqual (tail xs) (tail ys))

charlieSort :: (Eq a) => [[a]] -> [[a]]

charlieSort [] = []
charlieSort (x:xs) | null xs = [x]
charlieSort xs | (length xs) >= 2 = if(not (areListsEqual xs wip))
                    then charlieSort wip 
                    else wip
                    where
                      first = head xs
                      second = head (tail xs)
                      theRest = drop 2 xs
                      swapPairIfNeeded a b = if(length a >= length b) 
                          then [second, first]
                          else [first, second]
                      modifiedPair = swapPairIfNeeded first second
                      wip = (take 1 modifiedPair) ++ charlieSort ( (drop 1 modifiedPair) ++ theRest)

Best Answer

I would first and foremost start using pattern-matching.

areListsEqual :: Eq a => [a] -> [a] -> Bool
areListsEqual [    ] [    ] = True
areListsEqual [    ] _      = False
areListsEqual _      [    ] = False
areListsEqual (x:xs) (y:ys) = x == y && areListsEqual xs ys

Note how much more readable this is, when head and tail is avoided.

charlieSort :: Eq a => [[a]] -> [[a]]
charlieSort    [                    ] = []
charlieSort    [x                   ] = [x]
charlieSort xs@(first:second:theRest)
  | areListsEqual xs wip              = wip
  | otherwise                         = charlieSort wip
  where
  swapPairIfNeeded a b
    | length a >= length b = [second,first]
    | otherwise            = [first,second]
  modifiedPair = swapPairIfNeeded first second
  wip = take 1 modifiedPair ++ charlieSort (drop 1 modifiedPair ++ theRest)

I changed the if-then-else to a guard for slightly improved readability (YMMV). Instead of checking that the list has at least two elements with a call to length we use pattern-matching, which also allows us to name first,second,theRest directly. The name @ pattern pattern both matches the input against pattern and names the whole input as name.

Now, I want to avoid using take and drop for extracting the two elements of modifiedPair, so the last two lines are changed into

  [shorter,longer] = swapPairIfNeeded first second
  wip = [shorter] ++ charlieSort ([longer] ++ theRest)

where you could write the last line as

  wip = shorter : charlieSort (longer : theRest)

if you preferred. But why should swapPairIfNeeded return the shorter and the longer of the first and second list in a list ? Why not use a pair like

  swapPairIfNeeded a b
    | length a >= length b = (second,first)
    | otherwise            = (first,second)
  (shorter,longer) = swapPairIfNeeded first second

? In most circumstances, it is better to use tuples for fixed number of values (possibly of differing types) and to use lists for variable number of values (necessarily of same type). But it seems strange that swapPairIfNeeded compares its arguments a and b, but then returns first and second anyway. In this case, instead of letting it return a and b in a pair, I will remove swapPairIfNeeded altogether instead :

  (shorter,longer)
    | length first >= length second = (second,first)
    | otherwise                     = (first,second)

"unfolding" the body of swapPairIfNeeded into the definition of (shorter,longer).

So now the code of charlieSort looks like

charlieSort :: Eq a => [[a]] -> [[a]]
charlieSort    [                    ] = []
charlieSort    [x                   ] = [x]
charlieSort xs@(first:second:theRest)
  | areListsEqual xs wip              = wip
  | otherwise                         = charlieSort wip
  where
  (shorter,longer)
    | length first >= length second = (second,first)
    | otherwise                     = (first,second)
  wip = shorter : charlieSort (longer : theRest)

Finally, I should remark that charlieSort doesn't really implement bubble-sort, because the recursive call to charlieSort will not only make one "bubble-up" pass along the list, but also fully sort the list longer : theRest, so that all that has to be done after this recursive call (before returning one "level up") is to possibly percolate shorter to its rightful place.

Related Topic