In my earlier post I was making a fuss over picking the faster hash algorithm, and then I realised I was using + to concatenate strings.
Should I always use a StringBuilder? Should I care even for small strings? Heck, if I use the StringBuilder I’ll surely create one extra object anyway…
I tried some variations of the test and did not find any performance difference between simple concatenation and an explicit StringBuilder. I even tried bigger strings and other combinations. Still no difference.
That got me curious, so I wrote a very simple class and looked at it in the bytecode outline:
This Java code:
public static void main(String[] args) {
    String cip = "cip";
    String ciop = "ciop";
    String plus = cip + ciop;
    String build = new StringBuilder(cip).append(ciop).toString();
}
Generates this bytecode (see how the two concatenation styles generate almost the same code; the only difference is the extra String.valueOf call in the + version):
L0
LINENUMBER 23 L0
LDC "cip"
ASTORE 1
L1
LINENUMBER 24 L1
LDC "ciop"
ASTORE 2
// cip + ciop
L2
LINENUMBER 25 L2
NEW java/lang/StringBuilder
DUP
ALOAD 1
INVOKESTATIC java/lang/String.valueOf(Ljava/lang/Object;)Ljava/lang/String;
INVOKESPECIAL java/lang/StringBuilder.<init>(Ljava/lang/String;)V
ALOAD 2
INVOKEVIRTUAL java/lang/StringBuilder.append(Ljava/lang/String;)Ljava/lang/StringBuilder;
INVOKEVIRTUAL java/lang/StringBuilder.toString()Ljava/lang/String;
ASTORE 3
// new StringBuilder(cip).append(ciop).toString()
L3
LINENUMBER 26 L3
NEW java/lang/StringBuilder
DUP
ALOAD 1
INVOKESPECIAL java/lang/StringBuilder.<init>(Ljava/lang/String;)V
ALOAD 2
INVOKEVIRTUAL java/lang/StringBuilder.append(Ljava/lang/String;)Ljava/lang/StringBuilder;
INVOKEVIRTUAL java/lang/StringBuilder.toString()Ljava/lang/String;
ASTORE 4
L4
LINENUMBER 27 L4
RETURN
The compiler has transformed “cip + ciop” into “new StringBuilder(cip).append(ciop).toString()”.
In other words, using “+” is shorthand for the more verbose StringBuilder idiom.
The compiler will do the same trick for cip + "ciop" and "cip" + ciop. (In case you wonder, "cip" + "ciop" is a compile-time constant and will simply be compiled as "cipciop".)
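If you want to see the constant folding without opening the bytecode, here is a minimal sketch. The class and method names (ConstantFolding, literalsAreFolded, variableIsFolded) are made up for this illustration. It relies on the fact that compile-time constant strings are interned, so comparing them with == (normally a bug, used here on purpose) tells you whether a new String was built at run time.

```java
// Hypothetical demo class; the names here are not from the original post.
class ConstantFolding {

    // Both operands are literals, so the compiler folds them into the
    // single constant "cipciop" and no concatenation happens at run time.
    static boolean literalsAreFolded() {
        String folded = "cip" + "ciop";
        return folded == "cipciop"; // identity check on purpose: same interned constant
    }

    // cip is an ordinary (non-final) variable, so the expression is not a
    // compile-time constant and a new String is created at run time.
    static boolean variableIsFolded() {
        String cip = "cip";
        String joined = cip + "ciop";
        return joined == "cipciop"; // identity check on purpose: different objects
    }

    public static void main(String[] args) {
        System.out.println("literal + literal folded:  " + literalsAreFolded());
        System.out.println("variable + literal folded: " + variableIsFolded());
    }
}
```

The first line should report true and the second false. Declaring cip as final would turn the second expression into a constant as well, which is a handy way to convince yourself that it is the compiler doing the work.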
This is great, but beware, the compiler will not always do the best thing for you:
This code
String big = "both"; big += cip; big += ciop;
Will be compiled into this:
String big = "both"; big = new StringBuilder(big).append(cip).toString(); big = new StringBuilder(big).append(ciop).toString();
While of course the most efficient way is
String big = new StringBuilder("both").append(cip).append(ciop).toString();
Now of course nobody in his right mind would ever write any of the above (or use those variable names), but here is a pattern that you may have seen before:
String boo = "both";
for (int i=1; i<100; i++)
    boo += cip + ciop;
Now the compiler will do the obvious thing and instantiate one new StringBuilder at each iteration:
String boo = "both";
for (int i=1; i<100; i++)
    boo = new StringBuilder(boo).append(cip).append(ciop).toString();
In this case it is best to use this idiom:
StringBuilder foo = new StringBuilder("both");
for (int i=1; i<100; i++)
    foo.append(cip).append(ciop);
String boo = foo.toString();
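To get a feel for the difference, here is a rough sketch that runs both loop styles and checks they produce the same string. The class name LoopConcat, the method names and the rounds parameter are my own inventions for this example. The timing is a crude System.nanoTime measurement, not a proper benchmark, so treat the numbers as a hint only. Also note that newer compilers (JDK 9 and later) usually emit a different bytecode for + than the StringBuilder chain shown above, but the += loop still builds a brand new String on every pass.

```java
// Hypothetical demo class; names and the rounds parameter are assumptions.
class LoopConcat {

    // The "+=" pattern: every iteration copies the whole string built so far.
    static String withPlus(String cip, String ciop, int rounds) {
        String boo = "both";
        for (int i = 0; i < rounds; i++) {
            boo += cip + ciop;
        }
        return boo;
    }

    // The explicit idiom: one StringBuilder reused across all iterations.
    static String withBuilder(String cip, String ciop, int rounds) {
        StringBuilder foo = new StringBuilder("both");
        for (int i = 0; i < rounds; i++) {
            foo.append(cip).append(ciop);
        }
        return foo.toString();
    }

    public static void main(String[] args) {
        int rounds = 20_000; // arbitrary size, large enough to make the copying visible

        long t0 = System.nanoTime();
        String a = withPlus("cip", "ciop", rounds);
        long t1 = System.nanoTime();
        String b = withBuilder("cip", "ciop", rounds);
        long t2 = System.nanoTime();

        System.out.println("same result:   " + a.equals(b));
        System.out.println("+= loop:       " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("StringBuilder: " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

With only a handful of iterations you will probably not see any difference, which matches what I found for simple concatenation. The gap only shows up once the string gets long and the loop keeps copying it.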
Enjoy