Here's the problem when I get too carried away with anonymous inner classes:
2009/05/27 16:35 1,602 DemoApp2$1.class
2009/05/27 16:35 1,976 DemoApp2$10.class
2009/05/27 16:35 1,919 DemoApp2$11.class
2009/05/27 16:35 2,404 DemoApp2$12.class
2009/05/27 16:35 1,197 DemoApp2$13.class
/* snip */
2009/05/27 16:35 1,953 DemoApp2$30.class
2009/05/27 16:35 1,910 DemoApp2$31.class
2009/05/27 16:35 2,007 DemoApp2$32.class
2009/05/27 16:35 926 DemoApp2$33$1$1.class
2009/05/27 16:35 4,104 DemoApp2$33$1.class
2009/05/27 16:35 2,849 DemoApp2$33.class
2009/05/27 16:35 926 DemoApp2$34$1$1.class
2009/05/27 16:35 4,234 DemoApp2$34$1.class
2009/05/27 16:35 2,849 DemoApp2$34.class
/* snip */
2009/05/27 16:35 614 DemoApp2$40.class
2009/05/27 16:35 2,344 DemoApp2$5.class
2009/05/27 16:35 1,551 DemoApp2$6.class
2009/05/27 16:35 1,604 DemoApp2$7.class
2009/05/27 16:35 1,809 DemoApp2$8.class
2009/05/27 16:35 2,022 DemoApp2$9.class
These are all classes which were generated when I was making a simple application, and used copious amounts of anonymous inner classes -- each class will be compiled into a separate class
file.
The "double brace initialization", as already mentioned, is an anonymous inner class with an instance initialization block, which means that a new class is created for each "initialization", all for the purpose of usually making a single object.
Considering that the Java Virtual Machine will need to read all those classes when using them, that can lead to some time in the bytecode verfication process and such. Not to mention the increase in the needed disk space in order to store all those class
files.
It seems as if there is a bit of overhead when utilizing double-brace initialization, so it's probably not such a good idea to go too overboard with it. But as Eddie has noted in the comments, it's not possible to be absolutely sure of the impact.
Just for reference, double brace initialization is the following:
List<String> list = new ArrayList<String>() {{
add("Hello");
add("World!");
}};
It looks like a "hidden" feature of Java, but it is just a rewrite of:
List<String> list = new ArrayList<String>() {
// Instance initialization block
{
add("Hello");
add("World!");
}
};
So it's basically a instance initialization block that is part of an anonymous inner class.
Joshua Bloch's Collection Literals proposal for Project Coin was along the lines of:
List<Integer> intList = [1, 2, 3, 4];
Set<String> strSet = {"Apple", "Banana", "Cactus"};
Map<String, Integer> truthMap = { "answer" : 42 };
Sadly, it didn't make its way into neither Java 7 nor 8 and was shelved indefinitely.
Experiment
Here's the simple experiment I've tested -- make 1000 ArrayList
s with the elements "Hello"
and "World!"
added to them via the add
method, using the two methods:
Method 1: Double Brace Initialization
List<String> l = new ArrayList<String>() {{
add("Hello");
add("World!");
}};
Method 2: Instantiate an ArrayList
and add
List<String> l = new ArrayList<String>();
l.add("Hello");
l.add("World!");
I created a simple program to write out a Java source file to perform 1000 initializations using the two methods:
Test 1:
class Test1 {
public static void main(String[] s) {
long st = System.currentTimeMillis();
List<String> l0 = new ArrayList<String>() {{
add("Hello");
add("World!");
}};
List<String> l1 = new ArrayList<String>() {{
add("Hello");
add("World!");
}};
/* snip */
List<String> l999 = new ArrayList<String>() {{
add("Hello");
add("World!");
}};
System.out.println(System.currentTimeMillis() - st);
}
}
Test 2:
class Test2 {
public static void main(String[] s) {
long st = System.currentTimeMillis();
List<String> l0 = new ArrayList<String>();
l0.add("Hello");
l0.add("World!");
List<String> l1 = new ArrayList<String>();
l1.add("Hello");
l1.add("World!");
/* snip */
List<String> l999 = new ArrayList<String>();
l999.add("Hello");
l999.add("World!");
System.out.println(System.currentTimeMillis() - st);
}
}
Please note, that the elapsed time to initialize the 1000 ArrayList
s and the 1000 anonymous inner classes extending ArrayList
is checked using the System.currentTimeMillis
, so the timer does not have a very high resolution. On my Windows system, the resolution is around 15-16 milliseconds.
The results for 10 runs of the two tests were the following:
Test1 Times (ms) Test2 Times (ms)
---------------- ----------------
187 0
203 0
203 0
188 0
188 0
187 0
203 0
188 0
188 0
203 0
As can be seen, the double brace initialization has a noticeable execution time of around 190 ms.
Meanwhile, the ArrayList
initialization execution time came out to be 0 ms. Of course, the timer resolution should be taken into account, but it is likely to be under 15 ms.
So, there seems to be a noticeable difference in the execution time of the two methods. It does appear that there is indeed some overhead in the two initialization methods.
And yes, there were 1000 .class
files generated by compiling the Test1
double brace initialization test program.
Best Answer
Let's discuss a fallacy in the question:
This is presumed, with absolutely no evidence backing it up. If false, the question is moot.
Is there evidence to the contrary? Well, let's consider the question itself -- it doesn't prove anything, but shows things are not that clear.
Of the four claims in the first sentence, the first three are true, and the fourth, as shown by user unknown, is false! And why it is false? Because, contrary to what the second sentence states, it traverses the list more than once.
The code calls the following methods on it:
and
Let's consider first
append
.append
takes a vararg parameter. That meansmax
andx
are first encapsulated into a sequence of some type (aWrappedArray
, in fact), and then passed as parameter. A better method would have been+=
.Ok,
append
calls++=
, which delegates to+=
. But, first, it callsensureSize
, which is the second mistake (+=
calls that too --++=
just optimizes that for multiple elements). Because anArray
is a fixed size collection, which means that, at each resize, the wholeArray
must be copied!So let's consider this. When you resize, Java first clears the memory by storing 0 in each element, then Scala copies each element of the previous array over to the new array. Since size doubles each time, this happens log(n) times, with the number of elements being copied increasing each time it happens.
Take for example n = 16. It does this four times, copying 1, 2, 4 and 8 elements respectively. Since Java has to clear each of these arrays, and each element must be read and written, each element copied represents 4 traversals of an element. Adding all we have (n - 1) * 4, or, roughly, 4 traversals of the complete list. If you count read and write as a single pass, as people often erroneously do, then it's still three traversals.
One can improve on this by initializing the
ArrayBuffer
with an initial size equal to the list that will be read, minus one, since we'll be discarding one element. To get this size, we need to traverse the list once, though.Now let's consider
toList
. To put it simply, it traverses the whole list to create a new list.So, we have 1 traversal for the algorithm, 3 or 4 traversals for resize, and 1 additional traversal for
toList
. That's 4 or 5 traversals.The original algorithm is a bit difficult to analyse, because
take
,drop
and:::
traverse a variable number of elements. Adding all together, however, it does the equivalent of 3 traversals. IfsplitAt
was used, it would be reduced to 2 traversals. With 2 more traversals to get the maximum, we get 5 traversals -- the same number as the non-functional, non-concise algorithm!So, let's consider improvements.
On the imperative algorithm, if one uses
ListBuffer
and+=
, then all methods are constant-time, which reduces it to a single traversal.On the functional algorithm, it could be rewritten as:
That reduces it to a worst case of three traversals. Of course, there are other alternatives presented, based on recursion or fold, that solve it in one traversal.
And, most interesting of all, all of these algorithms are
O(n)
, and the only one which almost incurred (accidentally) in worst complexity was the imperative one (because of array copying). On the other hand, the cache characteristics of the imperative one might well make it faster, because the data is contiguous in memory. That, however, is unrelated to either big-Oh or functional vs imperative, and it is just a matter of the data structures that were chosen.So, if we actually go to the trouble of benchmarking, analyzing the results, considering performance of methods, and looking into ways of optimizing it, then we can find faster ways to do this in an imperative manner than in a functional manner.
But all this effort is very different from saying the average Java programmer code will be faster than the average Scala programmer code -- if the question is an example, that is simply false. And even discounting the question, we have seen no evidence that the fundamental premise of the question is true.
EDIT
First, let me restate my point, because it seems I wasn't clear. My point is that the code the average Java programmer writes may seem to be more efficient, but actually isn't. Or, put another way, traditional Java style doesn't gain you performance -- only hard work does, be it Java or Scala.
Next, I have a benchmark and results too, including almost all solutions suggested. Two interesting points about it:
Depending on list size, the creation of objects can have a bigger impact than multiple traversals of the list. The original functional code by Adrian takes advantage of the fact that lists are persistent data structures by not copying the elements right of the maximum element at all. If a
Vector
was used instead, both left and right sides would be mostly unchanged, which might lead to even better performance.Even though user unknown and paradigmatic have similar recursive solutions, paradigmatic's is way faster. The reason for that is that he avoids pattern matching. Pattern matching can be really slow.
The benchmark code is here, and the results are here.