Java to Clojure rewrite

clojurejavavaadin

I have just been asked by my company to rewrite a largish (50,000 single lines of code) Java application (a web app using JSP and servlets) in Clojure. Has anyone else got tips as to what I should watch out for?

Please bear in mind that I know both Java AND Clojure quite well.

Update

I did the rewrite and it went into production. It's quite strange as the rewrite ended up going so fast that it was done in about 6 weeks. Because a lot of functionality wasn't needed still it ended up more like 3000 lines of Clojure.

I hear they are happy with the system and its doing exactly what they wanted. The only downside is that the guy maintaining the system had to learn Clojure from scratch, and he was dragged into it kicking and screaming. I did get a call from him the other day saying he loved Lisp now though.. funny 🙂

Also, I should give a good mention to Vaadin. Using Vaadin probably accounted for as much of the time saved and shortness of the code as Clojure did.. Vaadin is still the top web framework I have ever used, although now I'm learning ClojureScript in anger! (Note that both Vaadin and ClojureScript use Google's GUI frameworks underneath the hood.)

Best Answer

The biggest "translational issue" will probably be going from a Java / OOP methodology to a Clojure / functional programming paradigm.

In particular, instead of having mutable state within objects, the "Clojure way" is to clearly separate out mutable state and develop pure (side-effect free) functions. You probably know all this already :-)

Anyway, this philosophy tends to lead towards something of a "bottom up" development style where you focus the initial efforts on building the right set of tools to solve your problem, then finally plug them together at the end. This might look something like this

Identify key data structures and transform them to immutable Clojure map or record definitions. Don't be afraid to nest lots of immutable maps - they are very efficient thanks to Clojure's persistent data structures. Worth watching this video to learn more.
Develop small libraries of pure, business logic oriented functions that operate on these immutable structures (e.g. "add an item to shopping cart"). You don't need to do all of these at once since it is easy to add more later, but it helps to do a few early on to facilitate testing and prove that your data structures are working..... either way at this point you can actually start writing useful stuff interactively at the REPL
Separately develop data access routines that can persist these structures to/from the database or network or legacy Java code as needed. The reason to keep this very separate is that you don't want persistence logic tied up with your "business logic" functions. You might want to look at ClojureQL for this, though it's also pretty easy to wrap any Java persistence code that you like.
Write unit tests (e.g. with clojure.test) that cover all the above. This is especially important in a dynamic language like Clojure since a) you don't have as much of a safety net from static type checking and b) it helps to be sure that your lower level constructs are working well before you build too much on top of them
Decide how you want to use Clojure's reference types (vars, refs, agents and atoms) to manage each part mutable application-level state. They all work in a similar way but have different transactional/concurrency semantics depending on what you are trying to do. Refs are probably going to be your default choice - they allow you to implement "normal" STM transactional behaviour by wrapping any code in a (dosync ...) block.
Select the right overall web framework - Clojure has quite a few already but I'd strongly recommend Ring - see this excellent video "One Ring To Bind Them" plus either Fleet or Enlive or Hiccup depending on your templating philosophy. Then use this to write your presentation layer (with functions like "translate this shopping cart into an appropriate HTML fragment")
Finally, write your application using the above tools. If you've done the above steps properly, then this will actually be the easy bit because you will be able to build the entire application by appropriate composition of the various components with very little boilerplate.

This is roughly the sequence that I would attack the problem since it broadly represents the order of dependencies in your code, and hence is suitable for a "bottom up" development effort. Though of course in good agile / iterative style you'd probably find yourself pushing forward early to a demonstrable end product and then jumping back to earlier steps quite frequently to extend functionality or refactor as needed.

p.s. If you do follow the above approach, I'd be fascinated to hear how many lines of Clojure it takes to match the functionality of 50,000 lines of Java

Update: Since this post was originally written a couple of extra tools/libraries have emerged that are in the "must check out" category:

Noir - web framework that builds on top of Ring.
Korma - a very nice DSL for accessing SQL databases.

Related Solutions

Java – What are the differences between a HashMap and a Hashtable in Java

There are several differences between HashMap and Hashtable in Java:

Hashtable is synchronized, whereas HashMap is not. This makes HashMap better for non-threaded applications, as unsynchronized Objects typically perform better than synchronized ones.
Hashtable does not allow null keys or values. HashMap allows one null key and any number of null values.
One of HashMap's subclasses is LinkedHashMap, so in the event that you'd want predictable iteration order (which is insertion order by default), you could easily swap out the HashMap for a LinkedHashMap. This wouldn't be as easy if you were using Hashtable.

Since synchronization is not an issue for you, I'd recommend HashMap. If synchronization becomes an issue, you may also look at ConcurrentHashMap.

Java – Is Java “pass-by-reference” or “pass-by-value”

Java is always pass-by-value. Unfortunately, when we deal with objects we are really dealing with object-handles called references which are passed-by-value as well. This terminology and semantics easily confuse many beginners.

It goes like this:

public static void main(String[] args) {
    Dog aDog = new Dog("Max");
    Dog oldDog = aDog;

    // we pass the object to foo
    foo(aDog);
    // aDog variable is still pointing to the "Max" dog when foo(...) returns
    aDog.getName().equals("Max"); // true
    aDog.getName().equals("Fifi"); // false
    aDog == oldDog; // true
}

public static void foo(Dog d) {
    d.getName().equals("Max"); // true
    // change d inside of foo() to point to a new Dog instance "Fifi"
    d = new Dog("Fifi");
    d.getName().equals("Fifi"); // true
}

In the example above aDog.getName() will still return "Max". The value aDog within main is not changed in the function foo with the Dog "Fifi" as the object reference is passed by value. If it were passed by reference, then the aDog.getName() in main would return "Fifi" after the call to foo.

Likewise:

public static void main(String[] args) {
    Dog aDog = new Dog("Max");
    Dog oldDog = aDog;

    foo(aDog);
    // when foo(...) returns, the name of the dog has been changed to "Fifi"
    aDog.getName().equals("Fifi"); // true
    // but it is still the same dog:
    aDog == oldDog; // true
}

public static void foo(Dog d) {
    d.getName().equals("Max"); // true
    // this changes the name of d to be "Fifi"
    d.setName("Fifi");
}

In the above example, Fifi is the dog's name after call to foo(aDog) because the object's name was set inside of foo(...). Any operations that foo performs on d are such that, for all practical purposes, they are performed on aDog, but it is not possible to change the value of the variable aDog itself.

For more information on pass by reference and pass by value, consult the following SO answer: https://stackoverflow.com/a/430958/6005228. This explains more thoroughly the semantics and history behind the two and also explains why Java and many other modern languages appear to do both in certain cases.

Update

Best Answer

Related Solutions

Java – What are the differences between a HashMap and a Hashtable in Java

Java – Is Java “pass-by-reference” or “pass-by-value”

Related Topic