$JAVA_HOME/jmap -dump:file=/tmp/dump_file PID
and then load the file into your profiler of choice and see where all the memory is being used.
I had to do this last week. As it happened, there wasn't a memory leak but a ConcurrentHashMap that was being populated faster than its elements could be consumed.
The interesting thing was that the memory profile looked all wrong. To see what I mean, we have to look closer at ConcurrentHashMap.
What is a ConcurrentHashMap?
ConcurrentHashMap is a map that more than one thread can read and update without any fear of ConcurrentModificationExceptions being thrown.
How does it do this?
ConcurrentHashMap is basically a hash table of hash tables. These inner hash tables are called "segments". Threads that insert elements only lock one of these segments. This is one way that ConcurrentHashMap improves throughput.
Threads that wish to read do not use locking. Instead, you see in the code, snippets like this:
transient volatile HashEntry table;
Without the method-scoped pointer (tab) pointing to the object-scoped pointer (table) on the first line of the method, the second line would have a race condition. That is: the table may change size as the array index was being calculated. This would return the wrong element.
Of course, this means that between the first and the second line, tab may have become stale. The danger is that this method could return an element that another thread had already removed from the "real" table.
[Aside: pointing a reference (tab) to another reference (table) is, of course, atomic ("Writes to and reads of references are always atomic, regardless of whether they are implemented as 32 or 64 bit values." - Java Language Spec 17.7, 3rd Edition).]
The JavaDocs do warn you about this potential "staleness" when it says:
"Retrievals reflect the results of the most recently completed update operations holding upon their onset."
A bug in ConcurrentHashMap
Back to my memory dump. The problem seen in the ourput from jmap was that all of the objects we wanted to store in the ConcurrentHashMap were in just one segment rather than being evenly distributed. The code of ConcurrentHashMap that determines which segment into which an object is to be inserted looks like this:
return segments[(hash >>> segmentShift) & segmentMask];
Where hash is the hashCode of our object and segmentShift is 28 for the default constructor of ConcurrentHashMap. Reducing this significantly means gobbling huge amounts of memory.
Shifting an Integer to the right 28 times means only a very small range of numbers that can be stored in a 32-bit Integer are useful - those greated than 2 e 29 and less than -2 e 29.
This bug is mentioned here but Sun/Oracle appear to say it is fixed in my version of Java when it definitely isn’t (as the source code shows).
This leaves us with the choice of hacking our hashCode() method to achieve these extreme numbers or grin and bear it until we upgrade our Java to a version that has fixed this problem.
It's not a major issue so it looks like the latter.