Java – Why is Erlang slower than Java on all these small math benchmarks

compareerlangjavaperformance

While considering alternatives for Java for a distributed/concurrent/failover/scalable backend environment I discovered Erlang. I've spent some time on books and articles where nearly all of them (even Java addicted guys) says that Erlang is a better choice in such environments, as many useful things are out of the box in a less error prone way.

I was sure that Erlang is faster in most cases mainly because of a different garbage collection strategy (per process), absence of shared state (b/w threads and processes) and more compact data types. But I was very surprised when I found comparisons of Erlang vs Java math samples where Erlang is slower by several orders, e.g. from x10 to x100.

Even on concurrent tasks, both on several cores and a single one.

What's the reasons for that? These answers came to mind:

Usage of Java primitives (=> no heap/gc) on most of the tasks
Same number of threads in Java code and Erlang processes so the actor model has no advantage here
Or just that Java is statically typed, while Erlang is not
Something else?

If that's because these are very specific math algorithms, can anybody show more real/practice performance tests?

UPDATE: I've got the answers so far summarizing that Erlang is not the right tool for such specific "fast Java case", but the thing that is unclear to me – what's the main reason for such Erlang inefficiency here: dynamic typing, GC or poor native compiling?

Best Answer

Erlang was not built for math. It was built with communication, parallel processing and scalability in mind, so testing it for math tasks is a bit like testing if your jackhammer gives you refreshing massage experience.

That said, let's offtop a little:
If you want Erlang-style programming in JVM, take a look at Scala Actors or Akka framework or Vert.x.

Some test results

I've gotten a lot of good answers to this question--thanks folks--so I decided to run some tests and figure out which method is actually fastest. The five methods I tested are these:

the "ContainsKey" method that I presented in the question
the "TestForNull" method suggested by Aleksandar Dimitrov
the "AtomicLong" method suggested by Hank Gay
the "Trove" method suggested by jrudolph
the "MutableInt" method suggested by phax.myopenid.com

Method

Here's what I did...

created five classes that were identical except for the differences shown below. Each class had to perform an operation typical of the scenario I presented: opening a 10MB file and reading it in, then performing a frequency count of all the word tokens in the file. Since this took an average of only 3 seconds, I had it perform the frequency count (not the I/O) 10 times.
timed the loop of 10 iterations but not the I/O operation and recorded the total time taken (in clock seconds) essentially using Ian Darwin's method in the Java Cookbook.
performed all five tests in series, and then did this another three times.
averaged the four results for each method.

Results

I'll present the results first and the code below for those who are interested.

The ContainsKey method was, as expected, the slowest, so I'll give the speed of each method in comparison to the speed of that method.

ContainsKey: 30.654 seconds (baseline)
AtomicLong: 29.780 seconds (1.03 times as fast)
TestForNull: 28.804 seconds (1.06 times as fast)
Trove: 26.313 seconds (1.16 times as fast)
MutableInt: 25.747 seconds (1.19 times as fast)

Conclusions

It would appear that only the MutableInt method and the Trove method are significantly faster, in that only they give a performance boost of more than 10%. However, if threading is an issue, AtomicLong might be more attractive than the others (I'm not really sure). I also ran TestForNull with final variables, but the difference was negligible.

Note that I haven't profiled memory usage in the different scenarios. I'd be happy to hear from anybody who has good insights into how the MutableInt and Trove methods would be likely to affect memory usage.

Personally, I find the MutableInt method the most attractive, since it doesn't require loading any third-party classes. So unless I discover problems with it, that's the way I'm most likely to go.

The code

Here is the crucial code from each method.

ContainsKey

import java.util.HashMap;
import java.util.Map;
...
Map<String, Integer> freq = new HashMap<String, Integer>();
...
int count = freq.containsKey(word) ? freq.get(word) : 0;
freq.put(word, count + 1);

TestForNull

import java.util.HashMap;
import java.util.Map;
...
Map<String, Integer> freq = new HashMap<String, Integer>();
...
Integer count = freq.get(word);
if (count == null) {
    freq.put(word, 1);
}
else {
    freq.put(word, count + 1);
}

AtomicLong

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;
...
final ConcurrentMap<String, AtomicLong> map = 
    new ConcurrentHashMap<String, AtomicLong>();
...
map.putIfAbsent(word, new AtomicLong(0));
map.get(word).incrementAndGet();

Trove

import gnu.trove.TObjectIntHashMap;
...
TObjectIntHashMap<String> freq = new TObjectIntHashMap<String>();
...
freq.adjustOrPutValue(word, 1, 1);

MutableInt

import java.util.HashMap;
import java.util.Map;
...
class MutableInt {
  int value = 1; // note that we start at 1 since we're counting
  public void increment () { ++value;      }
  public int  get ()       { return value; }
}
...
Map<String, MutableInt> freq = new HashMap<String, MutableInt>();
...
MutableInt count = freq.get(word);
if (count == null) {
    freq.put(word, new MutableInt());
}
else {
    count.increment();
}

Java – Efficiency of Java “Double Brace Initialization”

Here's the problem when I get too carried away with anonymous inner classes:

2009/05/27  16:35             1,602 DemoApp2$1.class
2009/05/27  16:35             1,976 DemoApp2$10.class
2009/05/27  16:35             1,919 DemoApp2$11.class
2009/05/27  16:35             2,404 DemoApp2$12.class
2009/05/27  16:35             1,197 DemoApp2$13.class

/* snip */

2009/05/27  16:35             1,953 DemoApp2$30.class
2009/05/27  16:35             1,910 DemoApp2$31.class
2009/05/27  16:35             2,007 DemoApp2$32.class
2009/05/27  16:35               926 DemoApp2$33$1$1.class
2009/05/27  16:35             4,104 DemoApp2$33$1.class
2009/05/27  16:35             2,849 DemoApp2$33.class
2009/05/27  16:35               926 DemoApp2$34$1$1.class
2009/05/27  16:35             4,234 DemoApp2$34$1.class
2009/05/27  16:35             2,849 DemoApp2$34.class

/* snip */

2009/05/27  16:35               614 DemoApp2$40.class
2009/05/27  16:35             2,344 DemoApp2$5.class
2009/05/27  16:35             1,551 DemoApp2$6.class
2009/05/27  16:35             1,604 DemoApp2$7.class
2009/05/27  16:35             1,809 DemoApp2$8.class
2009/05/27  16:35             2,022 DemoApp2$9.class

These are all classes which were generated when I was making a simple application, and used copious amounts of anonymous inner classes -- each class will be compiled into a separate class file.

The "double brace initialization", as already mentioned, is an anonymous inner class with an instance initialization block, which means that a new class is created for each "initialization", all for the purpose of usually making a single object.

Considering that the Java Virtual Machine will need to read all those classes when using them, that can lead to some time in the bytecode verfication process and such. Not to mention the increase in the needed disk space in order to store all those class files.

It seems as if there is a bit of overhead when utilizing double-brace initialization, so it's probably not such a good idea to go too overboard with it. But as Eddie has noted in the comments, it's not possible to be absolutely sure of the impact.

Just for reference, double brace initialization is the following:

List<String> list = new ArrayList<String>() {{
    add("Hello");
    add("World!");
}};

It looks like a "hidden" feature of Java, but it is just a rewrite of:

List<String> list = new ArrayList<String>() {

    // Instance initialization block
    {
        add("Hello");
        add("World!");
    }
};

So it's basically a instance initialization block that is part of an anonymous inner class.

Joshua Bloch's Collection Literals proposal for Project Coin was along the lines of:

List<Integer> intList = [1, 2, 3, 4];

Set<String> strSet = {"Apple", "Banana", "Cactus"};

Map<String, Integer> truthMap = { "answer" : 42 };

Sadly, it didn't make its way into neither Java 7 nor 8 and was shelved indefinitely.

Experiment

Here's the simple experiment I've tested -- make 1000 ArrayLists with the elements "Hello" and "World!" added to them via the add method, using the two methods:

Method 1: Double Brace Initialization

List<String> l = new ArrayList<String>() {{
  add("Hello");
  add("World!");
}};

Method 2: Instantiate an ArrayList and add

List<String> l = new ArrayList<String>();
l.add("Hello");
l.add("World!");

I created a simple program to write out a Java source file to perform 1000 initializations using the two methods:

Test 1:

class Test1 {
  public static void main(String[] s) {
    long st = System.currentTimeMillis();

    List<String> l0 = new ArrayList<String>() {{
      add("Hello");
      add("World!");
    }};

    List<String> l1 = new ArrayList<String>() {{
      add("Hello");
      add("World!");
    }};

    /* snip */

    List<String> l999 = new ArrayList<String>() {{
      add("Hello");
      add("World!");
    }};

    System.out.println(System.currentTimeMillis() - st);
  }
}

Test 2:

class Test2 {
  public static void main(String[] s) {
    long st = System.currentTimeMillis();

    List<String> l0 = new ArrayList<String>();
    l0.add("Hello");
    l0.add("World!");

    List<String> l1 = new ArrayList<String>();
    l1.add("Hello");
    l1.add("World!");

    /* snip */

    List<String> l999 = new ArrayList<String>();
    l999.add("Hello");
    l999.add("World!");

    System.out.println(System.currentTimeMillis() - st);
  }
}

Please note, that the elapsed time to initialize the 1000 ArrayLists and the 1000 anonymous inner classes extending ArrayList is checked using the System.currentTimeMillis, so the timer does not have a very high resolution. On my Windows system, the resolution is around 15-16 milliseconds.

The results for 10 runs of the two tests were the following:

Test1 Times (ms)           Test2 Times (ms)
----------------           ----------------
           187                          0
           203                          0
           203                          0
           188                          0
           188                          0
           187                          0
           203                          0
           188                          0
           188                          0
           203                          0

As can be seen, the double brace initialization has a noticeable execution time of around 190 ms.

Meanwhile, the ArrayList initialization execution time came out to be 0 ms. Of course, the timer resolution should be taken into account, but it is likely to be under 15 ms.

So, there seems to be a noticeable difference in the execution time of the two methods. It does appear that there is indeed some overhead in the two initialization methods.

And yes, there were 1000 .class files generated by compiling the Test1 double brace initialization test program.