Tuesday, January 21, 2014

Java - speed up high throughput String processing

In one of my projects I had to process huge amounts of textual data. Millions of strings per second were read from input files, processed (split, compared, mapped), then concatenated and written to an output file.

Obviously I used StringBuilder with an appropriate initial size, buffered reading and writing, and all the other standard Java tools for fast string processing (if you are aware of a better way to do this, please do let me know). Still, I was not satisfied with the performance: GC was kicking in too often, even though I had minimized unnecessary object creation.

Then I had an idea: what if I reused StringBuilder objects instead of creating them hundreds of thousands of times per second just to perform string concatenation before writing output? I searched the Internet first, just to check whether someone else had done it. Naturally, I found many debates about whether it is good programming practice or not, whether it will confuse the hell out of the JIT, etc. I decided I had to try it myself.

After making this small change, the throughput of my application increased by 30%, GC cycles were shorter and CPU usage was lower.

Even though it looks like bad programming practice - it helps.
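
For illustration, here is a minimal sketch of what reusing a builder can look like. It is not the code from my project: the class name ReusedBuilderJoiner, the join method and the initial capacity of 256 are made up for this example. The only trick is setLength(0), which resets the builder's length while keeping its already allocated internal buffer.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: one StringBuilder per instance, reused for every record.
// Not thread-safe; use one instance per worker thread.
class ReusedBuilderJoiner {
    // Allocated once; 256 is an arbitrary starting capacity for this sketch.
    private final StringBuilder builder = new StringBuilder(256);

    String join(List<String> fields, char separator) {
        builder.setLength(0); // discard previous content, keep the backing array
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                builder.append(separator);
            }
            builder.append(fields.get(i));
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        ReusedBuilderJoiner joiner = new ReusedBuilderJoiner();
        System.out.println(joiner.join(Arrays.asList("a", "b", "c"), ';'));
        System.out.println(joiner.join(Arrays.asList("x", "y"), ';'));
    }
}
```

Note that toString() still allocates a new String for every record. If the output goes to a Writer, passing the builder itself to Writer.append(CharSequence) may avoid that copy as well.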

Saturday, January 11, 2014

Infinispan (6.0.0.Final) and putAll performance

In one of my projects I had to load over a million key-value pairs into an Infinispan cache at application startup (local cache, no transactions).

At first I used individual put(K,V) invocations, and populating the Infinispan cache took around 2 minutes. I tried to find out online whether the putAll(Map<K,V>) method should be faster than individual put invocations, but could not find any information in the Infinispan documentation or elsewhere.

After switching to putAll() instead of put() I was able to load the same pairs in about 20 seconds.

So, in case you are interested, putAll() is faster for inserts than individual put() invocations. I guess it would be great if this was clearly documented.
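
Here is a rough sketch of a bulk load done through putAll(). It is written against plain java.util.Map so that it runs without Infinispan on the classpath; since Infinispan's Cache interface extends ConcurrentMap, a cache instance should be usable as the target. The BulkLoader class, the loadInBatches method and the idea of splitting the load into batches are assumptions made for this example. I am not claiming that batching is required, only that it keeps the temporary map small.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Infinispan.
class BulkLoader {

    // Copies all entries from source into target using putAll() on batches
    // of at most batchSize entries. Returns the number of putAll() calls made.
    static <K, V> int loadInBatches(Map<K, V> source, Map<K, V> target, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        Map<K, V> batch = new HashMap<>();
        int calls = 0;
        for (Map.Entry<K, V> entry : source.entrySet()) {
            batch.put(entry.getKey(), entry.getValue());
            if (batch.size() == batchSize) {
                target.putAll(batch); // one bulk call instead of batchSize put() calls
                batch.clear();
                calls++;
            }
        }
        if (!batch.isEmpty()) {
            target.putAll(batch); // flush the last, partially filled batch
            calls++;
        }
        return calls;
    }
}
```

With an Infinispan cache the call would look something like BulkLoader.loadInBatches(pairs, cache, 10000), where the batch size is a value to tune, not a recommendation.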

Monday, January 6, 2014

BoneCP and Oracle JDBC program name (v$session.program)

For some reason the Oracle JDBC driver (thin) does not support client info, so v$session.program cannot be set that way. I struggled for a good half hour to figure out how to set v$session.program on Oracle 11g when using the BoneCP (version 0.8.0.RELEASE) connection pool.

Here is what worked for me:

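// Needs: import com.jolbox.bonecp.BoneCPDataSource; import java.util.Properties;
BoneCPDataSource dataSource = new BoneCPDataSource();
Properties clientInfoProperties = new Properties();
clientInfoProperties.put("v$session.program", "SomeProgramNamePassedToOracle");
// Assumption: the properties reach the Oracle driver as BoneCP driver properties
dataSource.setDriverProperties(clientInfoProperties);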

and now if you do

select program from v$session;

in sqlplus you will be able to identify your session easily.


Tuesday, December 31, 2013

ProGuard and JAXB

I encountered a strange issue today when obfuscating Java code with ProGuard. My code used JAXB (Java 7 + annotations and generics), and after obfuscation I was constantly getting this exception:

java.lang.ClassCastException: com.sun.org.apache.xerces.internal.dom.ElementNSImpl cannot be cast to
I tried many different options:
<option>-keep public class my.jaxb.classes.** { public protected private *; }</option>

<option>-keepattributes *Annotation*</option>
but nothing helped (btw, you do need the two options above, but they are not enough on their own, and it took me almost an hour to figure out what was happening).

Only after I added
 <option>-keepattributes Signature</option>
everything started behaving correctly.

JAXB requires generic type information to be available to perform XML parsing, and without this option ProGuard was not retaining that information after obfuscation. That was causing the exception above.
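
To show what kind of information is involved, here is a small sketch that reads the element type of a generic collection field through reflection, which is roughly what a binding framework has to do. The classes SignatureDemo, Order and Item are made up for this example and have nothing to do with JAXB internals. The generic type comes from the Signature attribute in the class file; when that attribute is stripped, getGenericType() falls back to the raw type and the element class can no longer be discovered.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes, used only to demonstrate reflection on generics.
class SignatureDemo {
    static class Item { }

    static class Order {
        List<Item> items = new ArrayList<>();
    }

    // Returns the first type argument of a generic field (Item for List<Item>),
    // or null when only the raw type is available.
    static Type elementTypeOf(Class<?> owner, String fieldName) {
        Field field;
        try {
            field = owner.getDeclaredField(fieldName);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No such field: " + fieldName, e);
        }
        Type generic = field.getGenericType(); // backed by the Signature attribute
        if (generic instanceof ParameterizedType) {
            return ((ParameterizedType) generic).getActualTypeArguments()[0];
        }
        return null; // raw type only: the element class is unknown
    }

    public static void main(String[] args) {
        System.out.println(elementTypeOf(Order.class, "items"));
    }
}
```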

Hopefully this will save you some time.

ProGuard is a great tool. The only thing I found strange is that there is no official Maven plugin for it, and the latest ProGuard releases cannot be found in Maven Central.

Thursday, March 17, 2011

JBoss 6 and cluster-wide EJB injection

First a bit about CDI: I tested CDI and EJB injection with JBoss 6.0.0. A few things were not clear to me and I spent some time figuring out all the details. Here they are:

CDI (@Inject, @Named, @Produces, @Observes) works only within the scope of a deployment unit (an EAR file, for example). It cannot automagically inject remote resources. What you can try is to create your own @RemoteEjb annotation and use a custom producer method (@Produces) to look the EJB up in JNDI and inject it for you. Yes, you can do this, but you will have to create a producer for every @RemoteEjb type you need. This means that you cannot have only one producer for the following example:

Service1Interface service1;

Service2Interface service2;

Here I want to inject a different remote EJB based on the name passed in the @RemoteEjb annotation. This would make my lookups generic, and I could inject any EJB just by using the appropriate interface and the appropriate JNDI name. And it would require only one CDI producer method.

But if Service1Interface and Service2Interface do not have a common superinterface, you will have to create two different CDI producers, because a CDI producer cannot return java.lang.Object; it has to return the exact type. It would be great if we could create generic producers which would return objects of any type, performing the lookup by name only. Then the type check could be performed while casting and injecting.
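
Outside of CDI, the "look up by name, check the type while casting" idea can be sketched in plain Java. This is not a CDI producer and it does not solve the problem above; it only shows the shape of the generic lookup I would like to have. The NamedLookup class and its bind and lookup methods are made up for this example, and a plain map stands in for JNDI.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry; the map is a stand-in for a JNDI context.
class NamedLookup {
    private final Map<String, Object> registry = new HashMap<>();

    void bind(String name, Object service) {
        registry.put(name, service);
    }

    // Looks the object up by name only and checks the type at cast time.
    // Throws ClassCastException if the bound object is not of the expected type.
    <T> T lookup(String name, Class<T> expectedType) {
        Object found = registry.get(name);
        if (found == null) {
            throw new IllegalArgumentException("Nothing bound under name: " + name);
        }
        return expectedType.cast(found);
    }
}
```

A single method like lookup("service1", Service1Interface.class) then serves any interface, which is exactly what one producer per type cannot do.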

If you want to inject EJBs and go beyond the scope of your deployment unit, you can use the standard @EJB(mappedName="serviceX") annotation.

By default this will use local JNDI and not HA-JNDI. So, even if your JBoss is part of a cluster, @EJB injection will not be able to find anything that is not deployed on the local cluster member. Too bad that @EJB is not smart enough to try local JNDI first and then, if it cannot find the appropriate EJB, try the cluster JNDI.

In order to force @EJB to use HA-JNDI you have to include jndi.properties in the root of your EAR file (the root, not the META-INF folder). In that jndi.properties file you will have URL(s) pointing to at least one of the members of the cluster. You only need one live member to bootstrap a cluster-wide JNDI lookup. InitialContext is smart enough to find out about the other members and update itself. So, your jndi.properties might look like this:


And now, whenever you use @EJB(mappedName="serviceName"), it will first try to find the EJB in local JNDI. If it is not there and your JBoss instance is part of a cluster, it will try to find the EJB anywhere in the cluster. It could also cross cluster boundaries (with appropriate modifications in jndi.properties).

Cool, easy for developers and you do not have to worry about where your EJBs are deployed.