The problem
Recently, we upgraded our boxes from Linux 2.6.16-0.87.1-smp running on 16-processor Xeon E5620s to Linux 2.6.32-358.32.3.el6.x86_64 running on 24-processor Xeon E5-2620s (all processor counts include hyper-threading).
Unfortunately, our GC times became horrendous. What exactly this was due to remains a mystery, but we were seeing events like this in the GC logs:
.
.
19948.548 [GC [PSYoungGen: 1386848K -> 5888K (1390080K)] 3481117K->2105461K(4186624K), 0.0306870 secs] [Times: user=0.18, sys=0.00, real =0.03 secs]
20077.869 [GC [PSYoungGen: 1387264K -> 5984K (1389568K)] 3486837K->2210813K(4186112K), 9.4264420 secs] [Times: user=4.05, sys=52.99, real =9.43 secs]
20131.745 [GC [PSYoungGen: 1387360K -> 5792K (1390080K)] 3492189K->2115622K(4186624K), 0.0271300 secs] [Times: user=0.16, sys=0.01, real =0.03 secs]
.
.
Whoa! What happened there? Nearly 10 seconds of GC pause times when all the others are fractions of a second?
Also, look where the time was spent - in kernel space (sys=52.99).
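To spot events like this without reading the logs by eye, something along the following lines might help. It is only a sketch: GcTimes, parse_gc_times, cpu_to_wall_ratio and is_kernel_bound are names made up for this post, the format string assumes the [Times: ...] suffix looks like the lines quoted above, and "sys greater than user" is an arbitrary rule of thumb rather than anything the JVM defines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// CPU and wall-clock seconds taken from the "[Times: ...]" suffix of a GC log line.
struct GcTimes {
    double user;
    double sys;
    double real;
};

// Fills 'out' from one log line. Returns false if the line has no parsable
// "[Times: user=..., sys=..., real=... secs]" suffix. Whitespace in a scanf
// format matches zero or more blanks, so "real=" and "real =" both parse.
bool parse_gc_times(const std::string& line, GcTimes& out) {
    const std::size_t pos = line.find("[Times:");
    if (pos == std::string::npos) {
        return false;
    }
    const int matched = std::sscanf(line.c_str() + pos,
                                    "[Times: user=%lf, sys=%lf, real =%lf",
                                    &out.user, &out.sys, &out.real);
    return matched == 3;
}

// Average number of busy cores during the pause: CPU seconds per wall-clock second.
double cpu_to_wall_ratio(const GcTimes& t) {
    assert(t.real > 0.0);
    return (t.user + t.sys) / t.real;
}

// Assumed heuristic: flag a pause when more CPU time went to the kernel than to the JVM.
bool is_kernel_bound(const GcTimes& t) {
    return t.sys > t.user;
}
```

For the 9.43-second event above, the ratio works out to (4.05 + 52.99) / 9.43, or roughly 6: about six cores busy on average, almost all of it in the kernel.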
AlwaysPreTouch
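Given that it's spending a lot of time in the kernel, it might be best to look at flags that interact with that. One such flag is -XX:+AlwaysPreTouch. This touches every page of the heap at start-up, so the operating system has to back the whole heap with physical memory immediately rather than page by page as it is first used. That is, you pay the cost up front.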
If you run your JVM with this flag, you'll see the process taking its full amount of memory when you run the top command. That is, without it, you'll see:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6960 henryp 20 0 3312m 19m 10m S 16 0.3 0:00.97 java
but with it:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6923 henryp 20 0 3312m 1.0g 10m S 16 14.0 0:02.40 java
That is, the RESident space has filled to its assigned 1024m immediately (VIRTual space does not correspond to physical memory and can be ignored).
The JVM arg -XX:+AlwaysPreTouch causes this code in mutableSpace.cpp to execute when initializing MutableSpace:
void MutableSpace::pretouch_pages(MemRegion mr) {
  for (volatile char *p = (char*)mr.start(); p < (char*)mr.end(); p += os::vm_page_size()) {
    char t = *p; *p = t;
  }
}
(You can also find something similar in VirtualSpace::expand_by)
I'm not a C++ guru, but this appears to do nothing more than step through the region one page at a time, reading a single byte from each page and writing the same byte straight back. Why does it do this? Well, by touching each memory page, it obliges the kernel to load it. Presumably, the pointer is declared volatile just in case the compiler thinks that this is pointless and optimizes it away.
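The same trick can be tried outside the JVM. What follows is a minimal sketch, not the HotSpot code: pretouch is a name invented here, and the page size is passed in as a parameter (on Linux you would probably get it from sysconf(_SC_PAGESIZE)) so that the example does not depend on any one platform.

```cpp
#include <cassert>
#include <cstddef>

// Touches one byte in every page of the 'len' bytes starting at 'start' and
// returns how many pages were touched. The buffer contents are left unchanged.
std::size_t pretouch(char* start, std::size_t len, std::size_t page_size) {
    assert(page_size > 0);
    // volatile stops the compiler from discarding the seemingly useless accesses.
    volatile char* base = start;
    std::size_t touched = 0;
    for (std::size_t off = 0; off < len; off += page_size) {
        char t = base[off];  // read one byte from the page
        base[off] = t;       // write it back; on a typical Linux setup the write
                             // is what makes the kernel commit a physical page
        ++touched;
    }
    return touched;
}
```

Calling this on a large, freshly allocated buffer should make the RES column in top jump in much the same way as it does for the JVM above, although how much is already resident beforehand depends on the allocator and the kernel.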
This link talks about how lazily loading the pages can be more expensive when a lot of the pages are written to. With AlwaysPreTouch, this hit is taken when the JVM starts up.