After playing with the OpenJDK C++ code, I found a good documentation resource here. But this is a log of my own journey through the OpenJDK code, prompted by workmates asking: what exactly is it that triggers a garbage collection?
Since we were using the Parallel Scavenge GC, I had a look in the code at ParallelScavengeHeap::mem_allocate. (We've tried G1 but failed to get good performance out of it. After a few hours, all target collection times were violated, and over time it became slower and slower.)
First, we try to allocate the memory from the young generation:
HeapWord* result = young_gen()->allocate(size);
Hopefully, this is successful. If it's not, then we try it again but before we do, we grab a lock:
MutexLocker is a subclass of StackObj. Indeed, the way it is instantiated at that point in the C++ code (as a plain local variable) shows that it is going to be allocated on the stack. When the stack frame is popped, the lock is released automatically. There is no need for something like Java's try/finally block here, because our thread will execute this object's destructor on leaving the scope, and the destructor does the unlocking.
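To make the stack-allocated lock idea concrete, here is a minimal standalone sketch of the same pattern. It is not HotSpot's code: ScopedLocker and guarded_increment are names I made up for illustration, and std::mutex stands in for HotSpot's own mutex type.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a MutexLocker-style guard, built on std::mutex.
// The constructor takes the lock; the destructor gives it back.
class ScopedLocker {
 public:
  explicit ScopedLocker(std::mutex& m) : mutex_(m) { mutex_.lock(); }
  ~ScopedLocker() { mutex_.unlock(); }

  // A guard owns exactly one lock acquisition, so copying is disabled.
  ScopedLocker(const ScopedLocker&) = delete;
  ScopedLocker& operator=(const ScopedLocker&) = delete;

 private:
  std::mutex& mutex_;
};

// Increments counter under the lock unless it has already reached limit.
// There are two exits, and the guard's destructor runs on both of them,
// so no try/finally equivalent is needed.
int guarded_increment(std::mutex& m, int& counter, int limit) {
  assert(limit >= 0);
  ScopedLocker guard(m);
  if (counter >= limit) {
    return -1;  // early exit: the lock is released here
  }
  return ++counter;  // normal exit: the lock is released here too
}
```

Calling guarded_increment twice in a row on the same mutex works only because the first call released the lock on its way out. If the destructor did not unlock, the second call would block forever.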
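With the lock held, the young generation allocation is retried. If that fails too, we fall back to the old generation:
result = mem_allocate_old_gen(size);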
A "death march" is a series of ultra-slow allocations in which a full gc is done before each allocation, and after the full gc the allocation still cannot be satisfied from the young gen.
If such a death march is taking place, the thread will do the following up to 64 times: try to allocate the memory in the Old Generation, first without expanding the heap, then (if forced) by seizing a lock and expanding the heap through all sorts of tricks (see PSOldGen::expand(size_t bytes) for more details), possibly even putting the thread to sleep for a while.
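The counting logic described above can be sketched as a toy model. This is my own simplification, not the HotSpot implementation: ToyOldGen and DeathMarchPolicy are hypothetical names, heap expansion is left out entirely, and the limit of 64 is simply the figure quoted above.

```cpp
#include <cassert>
#include <cstddef>

// Toy old generation: a fixed capacity and a running total of bytes used.
class ToyOldGen {
 public:
  explicit ToyOldGen(std::size_t capacity) : capacity_(capacity), used_(0) {}

  bool allocate(std::size_t size) {
    assert(used_ <= capacity_);
    if (size > capacity_ - used_) return false;
    used_ += size;
    return true;
  }

  std::size_t used() const { return used_; }

 private:
  std::size_t capacity_;
  std::size_t used_;
};

// Hypothetical policy object: while a death march is in progress, allow a
// bounded number of old-gen allocations before insisting on a GC again.
class DeathMarchPolicy {
 public:
  static constexpr unsigned kLimit = 64;  // assumption: figure from the text

  void begin() { active_ = true; attempts_ = 0; }
  bool in_progress() const { return active_; }

  // Returns true if the request was satisfied from the old gen without a GC.
  bool try_old_gen(ToyOldGen& old_gen, std::size_t size) {
    if (!active_) return false;  // no death march: caller goes the GC route
    if (attempts_ < kLimit) {
      ++attempts_;
      return old_gen.allocate(size);
    }
    active_ = false;  // limit reached: end the march, caller must GC
    attempts_ = 0;
    return false;
  }

 private:
  bool active_ = false;
  unsigned attempts_ = 0;
};
```

The point of the bound is that a death march trades a full GC per allocation for a cheaper old-gen allocation, but only for a while. Once the limit is hit, the policy resets and the caller is pushed back onto the collection path.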
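If a GC is needed but is currently held off by a thread inside a JNI critical section, the allocating thread stalls until that section has been exited and then tries the whole process again. If the allocating thread is itself inside a JNI critical section, the allocation fails instead, since waiting there could deadlock.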
If all this still does not provide enough space, the thread requests a GC. It cannot do this itself, as only the VM thread is allowed to GC, not application threads. So, it asks the VM thread to GC and puts itself to sleep (VMThread::execute). The VM thread then tries to allocate memory in the Young Generation just like the application thread did. Failing that, it tries a full collection (see ParallelScavengeHeap::failed_mem_allocate, which calls ParallelScavengeHeap::do_full_collection), first without clearing the soft references and then, if that doesn't work, with clearing them. A full collection can be expensive. Typically it was OK, but with our 4 GB heaps it was not unusual to see them last a couple of seconds.
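The hand-off to the VM thread described above can be sketched with standard C++ threading. This is only an illustration of the shape of the idea, assuming a single worker thread and a simple queue: ToyVmThread is a made-up class, and its execute method merely borrows the name from VMThread::execute without claiming to behave like it in detail.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for a dedicated VM thread: operations are queued and
// run one at a time on a single worker, while the requester sleeps.
class ToyVmThread {
 public:
  ToyVmThread() : worker_([this] { run(); }) {}

  ~ToyVmThread() {
    {
      std::lock_guard<std::mutex> guard(mu_);
      stopping_ = true;
    }
    cv_.notify_one();
    worker_.join();
  }

  // Blocks the calling thread until op has run on the worker thread,
  // then returns op's result.
  int execute(std::function<int()> op) {
    // Calling this from the worker itself would wait on its own queue.
    assert(std::this_thread::get_id() != worker_.get_id());
    std::packaged_task<int()> task(std::move(op));
    std::future<int> result = task.get_future();
    {
      std::lock_guard<std::mutex> guard(mu_);
      ops_.push(std::move(task));
    }
    cv_.notify_one();
    return result.get();  // the requester sleeps here
  }

 private:
  void run() {
    for (;;) {
      std::packaged_task<int()> task;
      {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return stopping_ || !ops_.empty(); });
        if (ops_.empty()) return;  // stopping and nothing left to run
        task = std::move(ops_.front());
        ops_.pop();
      }
      task();  // run the operation outside the queue lock
    }
  }

  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::packaged_task<int()>> ops_;
  bool stopping_ = false;
  std::thread worker_;  // declared last so the other members exist first
};
```

An application thread would call something like vm.execute(...) with the collection work wrapped in a callable. From its point of view the call looks synchronous, even though the work happens on a different thread.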
If this fails, it's back into the loop until a given number of attempts has been made.
If all this fails, out of memory errors are thrown and the heap may be dumped depending on the JVM's arguments.
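Putting the whole escalation together, here is a toy model of the order in which things are tried. Everything in it is an assumption made for illustration: ToyHeap, its fields, and allocate_or_throw are invented names, the retry loop and locking are omitted, memory freed by a collection is simply credited to the old generation, and std::bad_alloc stands in for the JVM's out of memory error.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Toy heap: free space per generation, plus garbage that only a full
// collection (optionally clearing soft references) would release.
class ToyHeap {
 public:
  ToyHeap(std::size_t young_free, std::size_t old_free,
          std::size_t garbage, std::size_t softly_reachable)
      : young_free_(young_free), old_free_(old_free),
        garbage_(garbage), softly_reachable_(softly_reachable), full_gcs_(0) {}

  bool alloc_young(std::size_t size) { return take(young_free_, size); }
  bool alloc_old(std::size_t size) { return take(old_free_, size); }

  void full_gc(bool clear_soft_refs) {
    ++full_gcs_;
    old_free_ += garbage_;  // simplification: all freed space goes to old gen
    garbage_ = 0;
    if (clear_soft_refs) {
      old_free_ += softly_reachable_;
      softly_reachable_ = 0;
    }
  }

  int full_gcs() const { return full_gcs_; }

 private:
  static bool take(std::size_t& free_space, std::size_t size) {
    if (size > free_space) return false;
    free_space -= size;
    return true;
  }

  std::size_t young_free_, old_free_, garbage_, softly_reachable_;
  int full_gcs_;
};

enum class Source { Young, Old, AfterFullGc, AfterClearingSoftRefs };

// Tries each step in turn and reports which one satisfied the request.
Source allocate_or_throw(ToyHeap& heap, std::size_t size) {
  assert(size > 0);
  if (heap.alloc_young(size)) return Source::Young;
  if (heap.alloc_old(size)) return Source::Old;

  heap.full_gc(/*clear_soft_refs=*/false);
  if (heap.alloc_young(size) || heap.alloc_old(size)) return Source::AfterFullGc;

  heap.full_gc(/*clear_soft_refs=*/true);
  if (heap.alloc_young(size) || heap.alloc_old(size)) {
    return Source::AfterClearingSoftRefs;
  }
  throw std::bad_alloc();  // stand-in for the out of memory error
}
```

The ordering is the interesting part: cheap attempts come first, soft references are sacrificed only as a last resort, and the error is raised only once every step has been exhausted.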