String Creation in Java During Concatenation

Tags: java, object, strings

I was asked about immutable strings in Java. I was tasked with writing a function that concatenated a number of "a"s to a string.

What I wrote:

public String foo(int n) {
    String s = "";
    for (int i = 0; i < n; i++) {
        s = s + "a";
    }
    return s;
}

I was then asked how many strings this program would generate, assuming garbage collection does not happen.
My answer for n=3 was:

  1. ""
  2. "a"
  3. "a"
  4. "aa"
  5. "a"
  6. "aaa"
  7. "a"

Essentially, two strings are created in each iteration of the loop. However, the answer I was given was n². What strings will this function create in memory, and why?

Best Answer

I was then asked how many strings this program would generate, assuming garbage collection does not happen. My answer for n=3 was (7)

Strings 1 ("") and 2 ("a") are constants in the program. They are not created by the loop at run time; because they are constants the compiler knows about, they are 'interned', so each literal exists once and is reused. Read more about this at String interning on Wikipedia.

This also removes strings 5 and 7 from the count, as they are the same "a" as string #2. This leaves strings #3, #4, and #6, that is, one new String per loop iteration. The answer is "3 strings are created for n = 3" using your code.
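To see the difference between the interned constants and the strings built by the loop, here is a minimal sketch. The class and method names (InternDemo, literalsShareObject, concatCreatesNewObject) are made up for illustration. It relies on reference comparison with ==, which is normally a bug but is exactly what we want here to tell objects apart.

```java
class InternDemo {
    // Two occurrences of the same literal refer to one interned String object.
    static boolean literalsShareObject() {
        String first = "a";
        String second = "a";
        return first == second;
    }

    // s is not a compile-time constant, so s + "a" is evaluated at run time
    // and yields a new String object that is equal to, but not the same as, "a".
    static boolean concatCreatesNewObject() {
        String s = "";
        String built = s + "a";
        return built != "a" && built.equals("a");
    }

    public static void main(String[] args) {
        System.out.println(literalsShareObject());    // prints true
        System.out.println(concatCreatesNewObject()); // prints true
    }
}
```

The second method corresponds to the first loop iteration in the question: the result has the same contents as the constant "a" but is a separate object, which is why it counts as string #3.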

The count of n² is obviously wrong: at n=3 it would be 9, and even your worst-case answer was only 7. If your count that included the interned strings were correct, the answer would have been 2n + 1.

So, how should you do this?

Since the String is immutable, you want a mutable thing - something you can change without creating new objects. That is the StringBuilder.

The first thing to look at is the constructors. In this case we know how long the string will be, and there is a constructor StringBuilder(int capacity), which lets us allocate exactly as much as we need up front.
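As a rough check of what the capacity constructor buys you, the sketch below reads StringBuilder.capacity() before and after appending n characters. CapacityDemo and its methods are hypothetical names used only for this illustration.

```java
class CapacityDemo {
    // Returns the builder's capacity before and after appending n characters.
    static int[] capacities(int n) {
        StringBuilder sb = new StringBuilder(n);
        int before = sb.capacity();
        for (int i = 0; i < n; i++) {
            sb.append('a');
        }
        return new int[] { before, sb.capacity() };
    }

    // Capacity of a builder created with the no-argument constructor.
    static int defaultCapacity() {
        return new StringBuilder().capacity();
    }

    public static void main(String[] args) {
        int[] c = capacities(100);
        System.out.println(c[0] + " -> " + c[1]); // prints 100 -> 100
        System.out.println(defaultCapacity());    // prints 16
    }
}
```

With the capacity given up front, the internal buffer never has to grow. The no-argument constructor starts at 16 characters, so a longer result forces at least one reallocation and copy along the way.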

Next, "a" doesn't need to be a String; it can be the character 'a'. Calling append(char) instead of append(String) gives a minor performance boost: with append(String), the method needs to find out how long the String is and do some work based on that. A char, on the other hand, is always exactly one character long.

The code differences can be seen at StringBuilder.append(String) vs StringBuilder.append(char). It's not something to be too concerned with, but if you're trying to impress an employer, it is best to use the best possible practices.

So, how does this look when you put it together?

public String foo(int n) {
    StringBuilder sb = new StringBuilder(n);
    for (int i = 0; i < n; i++) {
        sb.append('a');
    }
    return sb.toString();
}

One StringBuilder and one String are created. No extra strings need to be interned.
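If you want to convince yourself that the two versions return the same result, a quick side-by-side sketch could look like this. The names FooDemo, concatLoop, and builderLoop are invented here to keep both versions in one class.

```java
class FooDemo {
    // The approach from the question: each pass builds a new String from s and "a".
    static String concatLoop(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s = s + "a";
        }
        return s;
    }

    // The StringBuilder approach: one pre-sized buffer, one final String.
    static String builderLoop(int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append('a');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(builderLoop(3));                       // prints aaa
        System.out.println(concatLoop(5).equals(builderLoop(5))); // prints true
    }
}
```

The output is identical; the difference is only in how many intermediate objects are created along the way.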

Write some other simple programs in Eclipse. Install PMD and run it on the code you write. Note what it complains about and fix those things. It would have found the modification of a String with + in a loop, and if you changed that to a StringBuilder, it might have flagged the missing initial capacity, but it would certainly have caught the difference between .append("a") and .append('a').