Object-oriented – What are some techniques I can use to refactor Object Oriented code into Functional code

functional programminggame developmentobject-oriented

I've spent about 20-40 hours developing part of a game using JavaScript and HTML5 canvas. When I started I had no idea what I was doing. So it started as a proof of concept and is coming along nicely now, but it has no automated tests. The game is starting to become complex enough that it could benefit from some automated testing, but it seems tough to do because the code depends on mutating global state.

I'd like to refactor the whole thing using Underscore.js, a functional programming library for JavaScript. Part of me thinks I should just start from scratch using a Functional Programming style and testing. But, I think refactoring the imperative code into declarative code might be a better learning experience and a safer way to get to my current state of functionality.

Problem is, I know what I want my code to look like in the end, but I don't know how to turn my current code into it. I'm hoping some people here could give me some tips a la the Refactoring book and Working Effectively With Legacy Code.

For example, as a first step I'm thinking about "banning" global state. Take every function that uses a global variable and pass it in as a parameter instead.

Next step may be to "ban" mutation, and to always return a new object.

Any advice would be appreciated. I've never taken OO code and refactored it into Functional code before.

Best Answer

Having authored several projects in functional style JavaScript, allow me to share my experience. I will focus on high level considerations, not specific implementation concerns. Remember, there are no silver bullets, do what works best for your specific situation.

Pure Functional vs Functional Style

Consider what you want to gain from this functional refactoring. Do you really want a pure functional implementation, or just some of the benefits a functional programming style can bring, such as referential transparency.

I have found the latter approach, what I call functional style, to mesh well with JavaScript and is usually what I want. It also allows selective non functional concepts to be carefully used in code where they make sense.

Writing code in a more purely functional style is also possible in JavaScript, but is not always the most appropriate solution. And Javascript can never be purely functional.

Functional Style Interfaces

In functional style Javascript, the most important consideration is the boundary between functional and the imperative code. Clearly understanding and defining this boundary is essential.

I organize my code around functional style interfaces. These are collections of logic that behave in a functional style, ranging from individual functions to entire modules. The conceptual separation of interface and implementation is important, a functional style interface may be implemented imperatively, as long as the imperative implementation does not leak across the boundary.

Unfortunately in Javascript, the burden of enforcing functional style interfaces falls entirely on the developer. Also, language features, such as exceptions, make a truly clean separation impossible.

Since you will be refactoring an existing codebase, start by identifying existing logical interfaces in your code that can be cleanly moved to a functional style. A surprising amount of code may already be functional style. Good starting points are small interfaces with few imperative dependencies. Everytime you move over code, existing imperative dependencies are eliminated and more code can be moved over.

Keep iterating, gradually working outwards to push large parts of your project onto the functional side of the boundary, until you are happy with the results. This incremental approach allows testing individual changes and makes identifying and enforcing the boundary easier.

Rethink Algorithms, Data Structures, and Design

Imperative solutions and design patterns often don't make sense in functional style. Especially for data structures, direct ports of imperative code to functional style can be ugly and slow. A simple example: would you rather copy every element of a list for a prepend or cons the element onto the existing list?

Research and evaluate existing functional approaches to problems. Functional programming is more than just a programming technique, it is a different way of thinking about and solving problems. Projects written in the more traditional functional programming languages, such as lisp or haskell, are great resources in this regard.

Understand the Language and Builtins

This is an important part of understanding the functional imperative boundary and will effect interface design. Understand what features can be safely used in functional style code and which can not. Everything from which builtins may throw exceptions and which, like [].push, mutate objects, to objects being passed by reference and how prototype chains work. A similar understanding of any other libraries you consume in your project is also required.

Such knowledge allows making informed design decisions, like how to handle bad input and exceptions, or what happens if a callback function does something unexpected? Some of this can be enforced in code, but some just requires documentation on the proper usage.

Another point is how strict your implementation should be. Do you really want to try enforcing the correct behavior on maximum call stack errors or when the user redefines some builtin function?

Using Non Functional Concepts Where Appropriate

Functional approaches are not always appropriate, especially in a language like JavaScript. The recursive solution may be elegant until you start getting maximum call stack exceptions for larger inputs, so perhaps an iterative approach would be better. Similarly, I use inheritance for code organization, assignment in constructors, and immutable objects where they make sense.

As long as the functional interfaces are not violated, do what makes sense and what will be easiest to test and maintain.

Example

Here is an example of converting some very simple real code to functional style. I don't have a very good large example of the process, but this small one touches on a number of interesting points.

This code builds a trie from a string. Ill start with some code based on the top section of John Resig's trie-js build-trie module. Many simplifications / formatting changes have been made and my intent is not to comment on the quality of original code (There are much clearer and cleaner ways to build a trie, so why this c style code comes up first in google is interesting. Here's a quick implementation that uses reduce).

I like this example because I don't have to take it all the way to a true functional implementation, just to functional interfaces. Here is the starting point:

var txt = SOME_USER_STRING,
    words = txt.replace(/\n/g, "").split(" "),
    trie = {};

for (var i = 0, l = words.length; i < l; i++) {
    var word = words[i], letters = word.split(""), cur = trie;
    for (var j = 0; j < letters.length; j++) {
        var letter = letters[j];
        if (!cur.hasOwnProperty(letter))
            cur[letter] = {};
        cur = cur[ letter ];
    }
    cur['$'] = true;
}

// Output for text = 'a ab f abc abf'
trie = {
    "a": {
        "$":true,
        "b":{
            "$":true,
            "c":{"$":true}
        "f":{"$":true}}},
    "f":{"$":true}};

Let's pretend this is the entire program. This program does two things: convert a string to a list of words and build a trie from the list of words. These are the existing interfaces in the program. Here is our program with the interfaces more clearly expressed:

var _get_words = function(txt) {
    return txt.replace(/\n/g, "").split(" ");
};

var _build_trie_from_list = function(words, trie) {
    for (var i = 0, l = words.length; i < l; i++) {
        var word = words[i], letters = word.split(""), cur = trie;
        for (var j = 0; j < letters.length; j++) {
            var letter = letters[j];
            if (!cur.hasOwnProperty(letter))
                cur[letter] = {};
            cur = cur[ letter ];
        }
        cur['$'] = true;
    }
};

// The 'public' interface
var build_trie = function(txt) { 
    var words = _get_words(txt), trie = {};
    _build_trie_from_list(words, trie);
    return trie;
};

Just by defining our interfaces, we can move _get_words to the functional style side of the boundary. Neither replace or split modifies the original string. And our public interface build_trie is functional style too, even though it interacts with some very imperative code. In fact, this is a fine stopping point in most cases. More code would get overwhelming, so let me overview a few other changes.

First, make all the interfaces functional style. This is trivial in this case, just have _build_trie_from_list return an object instead of mutating one.

Handling Bad input

Consider what happens if we call build_trie with an array of characters, build_trie(['a', ' ', 'a', 'b', 'c', ' ', 'f']). The caller assumed this would behave in a functional style, but instead it throws an exception when .replace is called on the array. This may be the intended behavior. Or we could do explicit type checks and make sure the inputs are as we expected. But I prefer duck typing.

Strings are just arrays of characters and arrays are just objects with a length property and non-negative integers as keys. So if we refactored and wrote new replace and split methods that operates on array-like objects of strings, they don't even have to be strings, our code could just do the right thing. (String.prototype.* will not work here, it converts the output to a string). Duck typing is a totally separate from functional programming, but the larger point is that bad input should always be considered.

Fundamental redesign

There are also more fundamental considerations. Assume that we want to build the trie in functional style as well. In the imperative code, the approach was to construct the trie one word at a time. A direct port would have us copy the entire trie every time we need to make an insertion to avoid mutating objects. Clearly that is not going to work. Instead, the trie could be constructed node by node, from the bottom up, so that once a node is completed, it never has to be touched again. Or another approach entirely may be better, a quick search shows many existing functional implementations of tries.

Hope the example clarified things a bit.

Overall, I find writing functional style code in Javascript is a worthwhile and fun endeavor. Javascript is a surprisingly capable functional language, albeit too verbose in its default state.

Good luck with your project.

A few personal projects written in functional style Javascript:

parse.js - Parser Combinators
Nu - Lazy streams
Atum - Javascript interpreter written in functional style Javascript.
Khepri - Prototype programming language I use for functional Javascript development. Implemented in functional style Javascript.

Related Solutions

Programming Paradigms – Is Functional Programming a Superset of OOP?

Functional programming isn't a layer above OOP; it's a completely different paradigm. It's possible to do OOP in a functional style (F# was written for exactly this purpose), and on the other end of the spectrum you have stuff like Haskell, which explicitly rejects the principles of object orientation.

You can do encapsulation and abstraction in any language advanced enough to support modules and functions. OO provides special mechanisms for encapsulation, but it's not something inherent to OO. The point of OO is the second pair you mentioned: inheritance and polymorphism. The concept is formally known as Liskov substitution, and you can't get it without language-level support for object-oriented programming. (Yes, it's possible to fake it in some cases, but you lose a lot of the advantages that OO brings to the table.)

Functional programming doesn't focus on Liskov substitution. It focuses on increasing the level of abstraction, and on minimizing the use of mutable state and routines with "side effects", which is a term functional programmers like to use to make routines that actually do something (as opposed to simply calculating something) sound scary. But again, they're completely separate paradigms, that can be used together, or not, depending on the language and the skill of the programmer.

Ring of numbers where adjacent entries sum up to a prime

I just wrote an implementation in Clojure, a functional Lisp-I, which you may find below. Because of the controversy in the commments to this answer, I will try to explain the code and the techniques used to achieve the result as good as possible, especially for those who are not Clojurists. Forgive me if I don't touch on everything in great detail and length.

This version does not mutate state. All of Clojures values are immutable, e. g. a vector [1 2] can not be altered once it has been constructed. When you want to concat that vector with another vector [3 4] you may invoke a function like concat like this (concat [1 2] [3 4]). Concat will return a new vector [1 2 3 4] and the two input vectors will be forgotten. The great thing is that behind the scenes, structural sharing is happening: The new vector [1 2 3 4] consists of the same data [1 2] and [3 4] consisted of. Also, if we had bound the vector [1 2] to a symbol a instead, after (concat a [3 4]) a would still be [1 2]. So while you are always generating new immutable values, in memory only the delta will be allocated.

One other feature of Clojure I am taking advantage of is laziness. E. g. you can build a sequence of all number 0 ... n with (range). The call to range returns an infinite sequence like (0 1 2 3 4 5...), but it is not realized until elements of it are consumed. Finally, many transformation functions return new lazy-sequences from lazy input sequences, so (filter even? (range)) would give me a lazy sequence of all numbers >= 0 for which the predicate even? (a boolean function) returns true - however, the filtering only happens when I actually consume elements from that sequence, e. g. (take 3 (filter even? (range))) will return (0 2 4): Until here, filter has been invoked five times (as many times as necessary to produce three elements or have the input sequence consumed).

The prime ring generator below utilizes this laziness. If you invoke (rings-of-primes 100) it immediately returns a lazy sequence of all possible prime rings with 100 elements from 1 to 100. Nothing is calculated, so if I wanted to have only three rings, I could do (take 3 (rings-of-primes 100)) and calculation would only happen for the production of three rings.

The core to the solution below is the function helper-fn which is defined within rings-of-primes. It takes two immutable values as arguments and refers to two other immutable values, length (the argument to rings-of-primes) and n-range (defined in rings-of-primes as seq [1 .. length]) in its body.

The two arguments to helper-fn are used, an immutable set of numbers that have been used in the possible ring passed as the second arg, acc (short for accumulator).

One of the great advantages of Clojure is that when you are dealing with immutable values, it's quite easy to write pure functions. helper-fn is one of those pure functions. It is defined as follows: It takes the input of a set of used numbers, and a vector of numbers which is supposed to be a partial ring of primes, and always returns a sequence of possible rings of primes that begin with the numbers in acc and have a size of length. If no rings are possible, helper-fn returns (), an empty list (not the same as nil in Clojure).

          (let [last-elem (peek acc)]
            (if (and (= length (count acc))
                     (is-prime?
                      (+ last-elem
                         (first acc))))
              [acc]

In the above code-snippet (the first lines of helper-fn), the symbol last-elem is bound to the last value of acc. Then within the conditional statement if two conditions are tested: 1. Has accumulator the user-desired length ((= length (count acc)))?, 2. Do the last and the first element add up to a prime number. If both conditions are matched, helper-fn returns [acc], a vector with exactly one element: The accumulator, a valid prime-ring. The nesting in an additional vector is needed because our function is supposed to return a sequence of valid prime-rings, not just a sequence of numbers.

Now what happens if the passed value is not a prime-ring, like when the function is invoked the first time within rings-of-primes: (helper-fn #{1} [1]). As you might have noticed, #{1} is the notation for a set, and within helper-fn, used will now be bound to #{1} and acc will be bound to [1].

Given that we have asked for a length of 4, the test-clause in the code snippet above will fail. What happens then?

              (let [parity-filter (if (even? last-elem)
                                    odd?
                                    even?)]
                (->> n-range
                     (filter #(and (parity-filter %)
                                   (not (used %))
                                   (is-prime? (+ %
                                                 last-elem))))
                     (lazy-mapcat #(helper-fn 
                                    (conj used %)
                                    (conj acc %))))))))]

The parity-filter symbol will be bound to either the predicate function odd? or to the predicate function even?, depending on whether the last-element of acc is even or odd. The reason is that you can divide even numbers by two, and adding two even numbers or two odd numbers always results in an even number which can't be a prime number then. So we only want the numbers with the opposite parity of the last element of acc, last-elem.

->> is an awesome macro. It does not do anything except for reordering the code you put into it. That is one of the advantages of Clojure being a lisp, that all your code is data that can be manipulated before evaluation. ->> reorders your code so (->> x (doY) (doZ baz)) => (doZ baz (doY x)). In other words, the first argument is appended to the end of the second form, which is appended to the end of the third form and so on.

So in this case, n-range, which was generated at the beginning of rings-of-primes, another immutable value, will be threaded with ->>. n-range will always be a sequence [1 .. length]. Now this sequence must be filtered. We define a small function, a lambda, on the fly as the first argument to filter:

                             #(and (parity-filter %)
                                   (not (used %))
                                   (is-prime? (+ %
                                                 last-elem)))

In this case #(...) creates a function with one argument to which we refer as %. E. g. #(+ % %) would create a function that takes one argument and doubles it. Our lambda is a predicate function that tests the argument for three conditions: 1. It must pass the parity-filter, 2. It may not be in the set used (you can invoke sets like functions and they return whether they contain the argument), 3. Added to the last-elem of acc, it must be a prime-number. If this three conditions match, it is a possible candidate for the next iteration of the prime ring we are building, and it will be part of the lazy sequence returned by the filter expression.

After that, a function lazy-mapcat that comes with this source, is invoked. It takes a transformation function and a sequence of elements (our filtered candidates), transforms each input element with the transform function and concats the returned value (a sequence) to the other returned values. The transformation function may return any kind of sequence, also the empty sequence. Let's have a closer look at the transformation function:

                                 #(helper-fn 
                                    (conj used %)
                                    (conj acc %))

Yes! It's a call to helper-fn itself. conj takes a collection and an argument and returns a new collection with that argument in it. Here we put the value taken from our filtered sequence into a new version of used, also create a new version of acc with that number in it, and pass them to helper-fn. Now go back to line 1. helper-fn will then (after many more recursive steps) return a sequence of possible prime-rings and because of mapcat we do that for each candidate that may be the next valid number in the current prime-ring. All the resulting sequences of rings will be concatenated to one sequence of rings. Of course, in some cases helper-fn will return an empty list, and on a higher level, that empty list will be concataned with other lists - also within a mapcat expression.

As you can see, no state is mutated. The resulting function is pure and can also be invoked with odd lengths or negative input values, in which case it will (mathematically correctly) return an empty list as a result.

You could say that the backtracking happens with acc passed to helper-fn, but much more I'd like to think of this implementation of a pure function call that results in the self assembly of possible prime-rings through a series of recursive function calls.

;; Clojure implementation of http://coj.uci.cu/24h/problem.xhtml?abb=1196
;; This version generates all possible rings of primes for any given n
;; Leon Grapenthin, Nov. 14 2013

(def is-prime?
  "Returns true if n is a prime number. Memoized."
  (memoize              ;; remember all results returned by this function
                        ;; so that a second call with the same n does not
                        ;; require calculation
   (fn is-prime?
     [n]
     (cond
      (even? n) (= 2 n) ;; if n is even, return whether
                        ;; its 2, the only even prime number
      (= 1 n) false     ;; if n is one, return false
      :otherwise
      (->> (range 2 n)  ;; build a sequence from 2 to n
           (filter (comp zero?
                         (partial mod n))) ;; filter those were dividing
                                           ;; n by them has a rest of 0
           empty?))))) ;; none could be found? => It's a prime number.

(defn lazy-mapcat
  ;; stolen from http://clojurian.blogspot.de/2012/11/beware-of-mapcat.html
  [f coll]
  (lazy-seq
   (if (not-empty coll)
     (concat
      (f (first coll))
      (lazy-mapcat f (rest coll))))))

(defn rings-of-primes
  "Returns a lazy sequence of possible prime rings for length."
  [length]
  (let [n-range (range 1 (inc length))
        helper-fn
        (fn helper-fn
          ;; tries to build the seq until acc has length and the ring
          ;; condition is fulfilled, otherwise returns nil
          [used acc]
          (let [last-elem (peek acc)]
            (if (and (= length (count acc))
                     (is-prime?
                      (+ last-elem
                         (first acc))))
              [acc]
              (let [;; slight optimization: a prime number (+ x y) with
                    ;; (not= x y) can only be built when the parities of
                    ;; x and y differ:
                    parity-filter (if (even? last-elem)
                                    odd?
                                    even?)]
                (->> n-range
                     (filter #(and (parity-filter %)
                                   (not (used %))
                                   (is-prime? (+ %
                                                 last-elem))))
                     (lazy-mapcat #(helper-fn 
                                    (conj used %)
                                    (conj acc %))))))))]
    (helper-fn #{1} [1])))

;; (time
;;  (first (rings-of-primes 100)))
;; "Elapsed time: 57.316798 msecs"
;; [1 2 3 4 7 6 5 8 9 10 13 16 15 14 17 12 11 18 19 22
;;  21 20 23 24 29 30 31 28 25 34 27 26 33 38 35 32 39
;;  40 43 36 37 42 41 48 49 52 45 44 53 50 47 54 55 46
;;  51 56 57 70 61 66 65 62 69 58 73 64 63 68 59 72 67
;;  60 71 78 79 84 83 74 75 76 81 82 85 88 91 90 89 92
;;  87 80 77 86 93 98 95 96 97 94 99 100]