Tuesday, January 21, 2014

Java - speed up high throughput String processing

In one of my projects I had to process huge amounts of textual data. Millions of strings per second were read from input files, processed (split, compared, mapped), then concatenated and written to an output file.

Obviously I used StringBuilder with an appropriate initial size, buffered reading and writing, and all the other standard Java tools for fast string processing (if you are aware of a better way to do this, please do let me know). Still, I was not satisfied with the performance: GC was kicking in too often, even though I had minimized unnecessary object creation.
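For reference, the starting point looked roughly like the sketch below. This is not the actual project code: the class name BaselineProcessor, the comma-to-semicolon transformation and the processText helper are made up for illustration. It only shows the standard ingredients mentioned above: buffered reading and writing, plus a pre-sized StringBuilder that is created anew for every line.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

public class BaselineProcessor {

    // Hypothetical processing step: split each line on ',', trim the fields,
    // and join them again with ';'.
    public static void process(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        BufferedWriter writer = new BufferedWriter(out);
        String line;
        while ((line = reader.readLine()) != null) {
            String[] fields = line.split(",");
            // A new builder for every line, pre-sized so it never has to grow.
            StringBuilder sb = new StringBuilder(line.length() + 1);
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) {
                    sb.append(';');
                }
                sb.append(fields[i].trim());
            }
            sb.append('\n');
            writer.write(sb.toString());
        }
        writer.flush();
    }

    // Convenience wrapper for trying the sketch on an in-memory string.
    public static String processText(String input) {
        StringWriter out = new StringWriter();
        try {
            process(new StringReader(input), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(processText("a, b,c\nd,e\n"));
    }
}
```

Every pass through the loop allocates a fresh StringBuilder together with its internal buffer, and at very high line rates that adds up to a lot of short-lived garbage.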

Then I had an idea: what if I reuse StringBuilder objects instead of creating them hundreds of thousands of times per second just to perform string concatenation before writing output? I searched the Internet first, just to check whether someone else had done it. Naturally, I found many debates about whether it is good programming practice, whether it will confuse the hell out of the JIT, and so on, so I decided I had to try it myself.
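A minimal sketch of the idea, assuming a single processing thread (StringBuilder is not thread-safe, so a shared instance would need one builder per thread). The class name ReusedBuilderDemo, the join method and the initial capacity of 256 are placeholders for illustration, not the original code. The key call is setLength(0), which resets the builder's length but keeps the already allocated internal buffer.

```java
public class ReusedBuilderDemo {

    // Allocated once and reused for every record.
    // The capacity of 256 is an assumed typical record size.
    private final StringBuilder sb = new StringBuilder(256);

    // Joins the fields with the given separator using the shared builder.
    public String join(String[] fields, char separator) {
        sb.setLength(0); // discard the previous content, keep the buffer
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(fields[i]);
        }
        return sb.toString(); // still copies the characters into a new String
    }

    public static void main(String[] args) {
        ReusedBuilderDemo demo = new ReusedBuilderDemo();
        System.out.println(demo.join(new String[] {"a", "b", "c"}, ';'));
        System.out.println(demo.join(new String[] {"x", "y"}, ';'));
    }
}
```

The builder and its buffer are now allocated once instead of once per record. The final toString() call still creates a String each time, so this removes only part of the garbage.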

After this small change, the throughput of my application increased by 30%, GC cycles were shorter, and CPU usage was lower.

Even though it looks like bad programming practice - it helps.
