The application does a lot of copying of object graphs using Serialization. Fine - JVMs are pretty good at memory management these days and creating objects, even large numbers of them, is normally OK. In fact, a Sun engineer once told me that you should only start worrying about Garbage Collection when it reaches a hefty 5% of running time. This trading application was spending about 1% of time collecting garbage despite JConsole showing that the Heap Memory was bouncing up and down like a yo-yo.
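For readers who haven't seen the idiom, copying an object graph through serialization usually looks something like the sketch below. This is a minimal illustration only: the class name SerializationCopy and the method deepCopy are hypothetical, not taken from the application in question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical helper, for illustration only.
final class SerializationCopy {
    private SerializationCopy() {}

    /** Deep-copies an object graph by serializing it to a byte array and reading it back. */
    static <T extends Serializable> T deepCopy(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            // Write the whole graph reachable from 'original' into memory.
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            // Read it back; every object in the graph is re-created.
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not copy object graph", e);
        }
    }
}
```

Every copy allocates a fresh set of objects and discards the intermediate byte array, which is consistent with a heap graph that bounces up and down.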
So, time to get more invasive and attach the amazing YourKit (sadly not open source but worth its weight in gold). It showed that when client connections were timing out, the server-side threads servicing them were contending for the monitor of a java.util.Hashtable inside get() (Hashtable, unlike HashMap, has synchronized methods, of course).
What's more, YourKit was telling me that it was java.util.Calendar.getInstance() that was calling this method on Hashtable. Firing up Eclipse and putting a breakpoint in java.util.Hashtable.get() showed exactly the path of execution that led to this:
Thread [main] (Suspended (breakpoint at line 333 in Hashtable))
GregorianCalendar(Calendar).setWeekCountData(Locale) line: 2445
Calendar.createCalendar(TimeZone, Locale) line: 1006
Calendar.getInstance() line: 943
Calendar.getInstance() creates a GregorianCalendar, and its constructor accesses a static field of type java.util.Hashtable. Since the field is static, every thread in the JVM that creates a Calendar contends for the same monitor.
Contention on this monitor may seem trivial, but if you're calling the method hundreds of times a second, it becomes a real bottleneck.
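To make the mechanism concrete, here is a minimal sketch of several threads hammering one static Hashtable. It does not reproduce the JDK's internals: SharedHashtableDemo, CACHE and run are hypothetical names, and the table is just a stand-in for a static cache buried inside a library class.

```java
import java.util.Hashtable;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo, for illustration only.
final class SharedHashtableDemo {
    // Stand-in for a static cache inside a library class: one instance per JVM.
    private static final Hashtable<String, Integer> CACHE = new Hashtable<>();
    static {
        CACHE.put("key", 42);
    }

    private SharedHashtableDemo() {}

    /** Starts the given number of threads, each doing lookupsPerThread gets; returns total successful lookups. */
    static long run(int threads, int lookupsPerThread) {
        AtomicLong completed = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < lookupsPerThread; j++) {
                    // Hashtable.get is a synchronized method, so only one
                    // thread at a time can be inside it for this instance.
                    if (CACHE.get("key") != null) {
                        completed.incrementAndGet();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return completed.get();
    }
}
```

All the lookups succeed, but they are serialized through a single lock, so adding threads adds waiting rather than throughput.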
Of course, it would have been better if the business domain objects being copied via serialization had not implemented java.io.Externalizable with readExternal and writeExternal methods that call Calendar.getInstance(). This was due to a crufty hack to work around an architectural problem (Sybase 12.5 not handling time zones in its timestamp data type, but that's another story).
So, we're faced with a few possible solutions. One that the team mentioned was to use Joda-Time, which is (in one form or another) going to make its way into the standard Java API.
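Another possibility, offered purely as a sketch and not as what the team settled on, is to stop creating a new Calendar on every call and reuse one per thread instead. The names PerThreadCalendar and at are hypothetical, and the code assumes Java 8 or later for ThreadLocal.withInitial.

```java
import java.util.Calendar;

// Hypothetical helper, for illustration only.
final class PerThreadCalendar {
    // Calendar is mutable and not thread-safe, so each thread gets its own
    // instance. Calendar.getInstance() runs once per thread, not once per call.
    private static final ThreadLocal<Calendar> CALENDAR =
        ThreadLocal.withInitial(Calendar::getInstance);

    private PerThreadCalendar() {}

    /** Returns the calling thread's Calendar, reset to the given instant. */
    static Calendar at(long epochMillis) {
        Calendar calendar = CALENDAR.get();
        calendar.setTimeInMillis(epochMillis);
        return calendar;
    }
}
```

The trade-off is that the returned Calendar is shared by all callers on the same thread, so it must not be stored or handed to another thread, and it keeps whatever default time zone and locale were in effect when the thread first asked for it.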
Of course, I'd like to see a more architectural solution - grumble, grumble.