Thursday, October 27, 2011

Java Disassembled

In my last post, I talked about how false sharing due to objects being on the same cache line can cause code to run much slower. Now, I'm going to show why

matrix[fixed_index][i] = value + 1;

is slower than

matrix[i][fixed_index] = value + 1;

This slowness is a property of your CPU rather than something innate to Java but we'll use the JDK with the -XX:+PrintOptoAssembly flag and examine the generated assembler code to show the problem.

(Please note that I am not a professional x86 assembler programmer so any input would be gratefully received.)

First, some preliminaries. Most of the following was gleaned from using JDK 1.7.0 on Linux:

1. Java multi-dimensional arrays are actually arrays of arrays.

2. The second location in the memory representing an array (array reference + 1) yields information as to its type. We can show this by looking at the assembly generated by this Java:
       private void myInstanceOf(Object[] array) {
if (array instanceof Integer[]) {
aPublicInt++;
}
}
[Aside: the field aPublicInt is exactly that: an int that is publicly accessible. The reason we may need it is that Hotspot may optimize away our code if it considers that nothing noticeable changes.]

Anyway, the assembler it generates looks a little like this:
#r000 ecx   : parm 0: com/henryp/lang/SimpleClassForDisassembly:NotNull *
#r005 edx : parm 1: java/lang/Object *[int:>=0] *
# -- Old esp -- Framesize: 16 --
#r045 esp+12: return address
#r044 esp+ 8: pad2, in_preserve
#r043 esp+ 4: pad2, in_preserve
#r042 esp+ 0: Fixed slot 0
#
000 N45: # B1 <- BLOCK HEAD IS JUNK Freq: 1
000 CMP EAX,[ECX+4] # Inline cache check
JNE SharedRuntime::handle_ic_miss_stub
NOP
NOP
NOP

000
00c B1: # B5 B2 <- BLOCK HEAD IS JUNK Freq: 1
00c # stack bang
PUSHL EBP
SUB ESP,8 # Create frame
01a MOV EBX,[EDX + #4]
01d NullCheck EDX
01d
01d B2: # B4 B3 <- B1 Freq: 0.999999
01d CMPu EBX,precise klass [Ljava/lang/Integer;: 0xb76eeb68:Constant:exact *
023 Jne,us B4 P=0.100000 C=-1.000000
023
025 B3: # B4 <- B2 Freq: 0.899999
025 INC [ECX + #16] ! Field com/henryp/lang/SimpleClassForDisassembly.aPublicInt
028
028 B4: # N45 <- B3 B2 Freq: 0.999999
028 ADD ESP,8 # Destroy frame
POPL EBP
TEST PollPage,EAX ! Poll Safepoint

032 RET
032
033 B5: # N45 <- B1 Freq: 1.01328e-06
033 MOV EBP,ECX
035 MOV ECX,#-12
03a NOP # 1 bytes pad for loops and calls
03b CALL,static wrapper for: uncommon_trap(reason='null_check' action='make_not_entrant')
# com.henryp.lang.SimpleClassForDisassembly::myInstanceOf @ bci:1 L[0]=EBP L[1]=_ STK[0]=#NULL
# OopMap{ebp=Oop off=64}
040 INT3 ; ShouldNotReachHere
040

Verbose, isn't it?

(Note that this assembler was generated when the Java method was alternately passed an Integer[] and a String[] for many iterations. The JDK will generate other assembler depending on the which paths through the method are most commonly used.)

So what does this assembly code mean? The first line says the ecx register holds a reference to this. But the second line tells us the register edx points to an the argument that is passed to this method - the Object[].

So much for the references. The interesting code starts at the label B1. Here we create the stack for this method. EBP and ESP are special "32-bit registers that are used as pointers [,] the extended base pointer (EBP) and the extended stack pointer (ESP)...

"The EBP register (sometimes called the frame pointer (FP) or local base pointer (LB)) is used to reference variables in the current stack frame. Each stack frame contains the parameters to the function, its local variables, and two pointers that are necessary to put things back the way they were: the saved frame pointer (SFP) and the return address. The stack frame pointer is used to restore EBP to its previous value, and the return address is used to restore EIP [the address of the code currently executing] to the next instruction found after the function call." [1]

Breaking this code down, we see:
    PUSHL  EBP
(Push the old frame pointer value onto the stack).
    SUB    ESP,8    # Create frame
(Allocate 8 bytes on the stack. The stack starts high in the memory space and moves towards zero as more stack space is allocated so we SUBtract 8 bytes from the ESP).
01a       MOV    EBX,[EDX + #4]
01d NullCheck EDX
(Move the contents of the second memory space of the array - remember, EDX points to our array - into the EBX register. The Nullcheck is not an assembly instruction. It appears to be pseudo code as it has the same address - 01d - as the next line.)
01d       CMPu   EBX,precise klass [Ljava/lang/Integer;: 0xb76eeb68:Constant:exact *
023 Jne,us B4 P=0.100000 C=-1.000000
(Is EBX the same as the constant that represent an Integer array? If the unsigned (,us) comparison concludes they are not equal, go to B4. Otherwise...)
025       INC    [ECX + #16] ! Field com/henryp/lang/SimpleClassForDisassembly.aPublicInt
(add one to this.aPublicInt which occupies the memory address 16 bytes after the memory address of this. Remember, the first line tells us that ECX points to this).

Either way, we come to B4 where we undo all the work of allocating stack space and RETurn.

3. The third location in the memory holding an array (array reference + 2) yields information as to its length. We can show this by looking at the assembly generated by this Java:
       private static int myArraySize(Object[] array) {
return array.length;
}
The assembler generated here looks like this (ignoring all the stack frame manipulation etc that we saw in the previous example):
#r000 ecx   : parm 0: int[int:>=0]:exact *
.
.
.
00e MOV EAX,[ECX + #8]
011 NullCheck ECX
Here, ecx is the register that represents the argument (there is no reference to this in the code as you'll notice the Java method is static). We put the contents of the address space 2 memory slots (2 * 4 bytes = 8) after the array reference itself into EAX. This is the register that by convention holds the return value of any call.

Returning to our matrix navigation code, let's start with this method:
       private int[][] myMatrixNavigation() {
int size = 19;
int matrix[][] = new int[size][27];
for (int i = 0 ; i < size ; i++) {
matrix[i][7] = 11;
}
return matrix;
}

And run it enough times for the JDK to generate assembly language. It generates assembler that looks like:
       SUB    ESP,24   # Create frame
XOR EBP,EBP
MOV ECX,precise klass [[I: 0xa062f798:Constant:exact *
MOV EDX,#19
MOV EDI,#27
MOV [ESP + #0],EDI
NOP # 1 bytes pad for loops and calls
CALL,static wrapper for: _multianewarray2_Java
# com.henryp.lang.SimpleClassForDisassembly::myMatrixNavigation @ bci:6 L[0]=_ L[1]=#19 L[2]=_ L[3]=_
# OopMap{off=52}
#checkcastPP of EAX
MOV EDX,#19

B2: # B27 B3 <- B1 B5 Loop: B2-B5 inner stride: not constant pre of N159
CMPu EBP,#19
Jnb,u B27

B3: # B28 B4 <- B2
MOV EDI,[EAX + #12 + EBP << #2]
MOV ECX,[EDI + #8]
NullCheck EDI

B4: # B26 B5 <- B3
CMPu ECX,#7
Jbe,u B26

B5: # B2 B6 <- B4
MOV [EDI + #40],#11
INC EBP
CMP EBP,#1
Jl,s B2 # Loop end

B6: # B16 B7 <- B5
SUB EDX,EBP
AND EDX,#-4
ADD EDX,EBP
CMP EBP,EDX
Jge,s B16
NOP # 6 bytes pad for loops and calls

B7: # B28 B8 <- B6 B15 Loop: B7-B15 inner stride: not constant main of N96
MOV ECX,[EAX + #12 + EBP << #2]
MOV EDI,[ECX + #8]
NullCheck ECX

B8: # B23 B9 <- B7
CMPu EDI,#7
Jbe,u B23

B9: # B28 B10 <- B8
MOV EDI,[EAX + #16 + EBP << #2]
MOV [ECX + #40],#11
.
.
.
Hugely abbreviated, it does something like this:
        XOR    EBP,EBP
(Set EBP to 0. Exclusive-or of anything on itself will be 0. This is an optimisation. EBP will be our loop counter - equivalent to i in the Java code above).
        CALL,static  wrapper for: _multianewarray2_Java
(Call JVM code to create our array. Remember that the register EAX is used to return a value from a call so it's this that points to our array).
        MOV    EDI,[EAX + #12 + EBP << #2]
(Move the ith element of the outer array into EDI. Remember that the second memory location of an array (+4) is its type, the third (+8) its length so +12 is the first element).
        MOV    [EDI + #40],#11
(This element is itself an array. So, bearing in mind that the array reference + 4 is the type, the array reference + 8 is the length, the array reference + 12 the first element and so on, the array reference + 40 must be the 7th ((7 * 4) + 12) element. We set the contents of this memory address to 11 as our Java code says we should).

Note that this memory address is likely to be far away from the reference of the first array (stored in the EAX register). Compare that to the assembly code generated by this Java method:
       private int[][] myMatrixNavigation2ndIndice() {
int size = 19;
int matrix[][] = new int[27][size];
for (int i = 0 ; i < size ; i++) {
matrix[7][i] = 11;
}
return matrix;
}
This is very similar to before but this time we're fixing the 2nd index of the matrix, not the first. I'll spare you all the gory assembler, but the relevant code boils down to this:

        MOV    EDI,[EAX + #40]
(Once more, EAX stores the reference to our array and the +40 is used to access the 7th element).
        MOV    [EDI + #12 + EBP << #2],#11
(The 7th element is itself an array. So, we traverse this array once more with EBP as our index, setting the memory contents to 11, just as our Java code says we should).

Now the thing to note here is that if this code immediately above is multi-threaded, all threads will traverse a contiguous chunk of memory, that is [EDI + 12 + EBP] since EBP is just our integer index.

In the first piece of Java code where the first index is fixed, a thread will be writing to some element in the array that is found at [EDI + 12 + EBP] not the location [EDI + 12 + EBP] itself. It is much less likely that this memory is going to be close to the memory location for any other value of EBP (in fact for a sufficiently large array I think it might be impossible). Since cache lines are typically only 64 bytes, contention for the same cache line is much less likely and so the first piece of Java code is faster.

[1] The Art of Exploitation - Jon Erickson.

Wednesday, October 26, 2011

False Sharing

If multiple threads concurrently change the values in a matrix, which is faster: working through the matrix column-by-column or row-by-row? And why is there a difference?

Let's say we have a square, 2-dimensional matrix of ints that we model like so:
int SIZE    = 1000;
int ROWS = SIZE;
int COLUMNS = SIZE;

int[][] matrix = new int[ROWS][COLUMNS];

Technically, Java doesn't have multi-dimensional arrays. It has arrays of arrays. But for our purposes, let's define the first index as the row and the second as the column.

Now, let's say that at the same time our many threads change all the values in a given column with the following code:
  // change all the values in a fixed column of our matrix
for (int i = 0 ; i < ROWS ; i++) {
int value = matrix[i][fixed_index];
matrix[i][fixed_index] = value++;
}
and after this code has finished running, we change it slightly so that now the many threads change all the values in a given row.
    // change all the values in a fixed row of our matrix
for (int i = 0 ; i < COLUMNS ; i++) {
int value = matrix[fixed_index][i];
matrix[fixed_index][i] = value + 1;
}
This time, we've fixed the "row" index and let the "column" index vary so all values in a given row change.

The results on my MacBook Pro (Intel Core 2 Duo) running 100 threads each performing 1000 iterations look like this:

Run____fixed column____fixed row
1__________209_____________91___
2__________194_____________78___
3__________201_____________92___
4__________199_____________60___
5__________228_____________63___

[mean time in ms. Apologies for the formatting. Blogger keeps trying to eat whitespace].

As you can see, changing all the values in one of our rows is much faster than changing all the values in one of our columns.

Remembering that there are really no rows and columns in Java (we're actually modeling our matrix using arrays of arrays), the easy way to remember this is the 3 Fs: when it comes to changing the values along a give index of a 2-D array, Fixing the First is Fastest.

This phenomena is due to a mechanism called "false sharing" (there is a good podcast by Scott Meyers on the matter here).

False sharing slows concurrent changes to RAM. It occurs when one thread is writing to memory that is close to where another thread is writing. Robert Love defines it as when there are "two or more objects mapping to the same cache line despite existing at different addresses in memory" (Linux Kernel Development 3r Edition).

The maximum size of memory in which these two or more addresses need to be for this to happen is called a cache line. The exact size is dependent on the architecture but it is typically 64 bytes.

What's happening in the above example is that ints in the column of our matrix are evidently stored close together - so close there is some cache line collisions occurring - and that threads are contending to write to that piece of RAM.

I'll write another post as to how we can prove this.

Tuesday, October 4, 2011

Synchronicity

I had a debate over a beer with a friend about volatile arrays. I forgot who was trying to prove what but I did come across an interesting quirk.

We were talking about how adding the volatile keyword to a reference means that threads will always see the same thing once the reference had been set. But what if that reference were an array? Would the threads agree on the elements of the array? (The answer is "not necessarily" as pointed out in my blog last year).

But I had trouble demonstrating this with some simple, multi-threaded code. Annoyingly, my threads kept agreeing on the elements of the array. Maybe it was the chip architecture, I lazily thought.

Then it occurred to me. I was putting some System.out.println statements in my code to help me try to solve the problem. This, of course, was the problem. The reason why is that by introducing this logging, the threads are synchronizing on the same object - and as I mentioned in another blog post, this ensures "all threads see the most up-to-date values of shared mutable variables [when] the reading and writing threads ... synchronize on a common lock" (Java Concurrency in Practice, p37).

So, what was introducing this mysterious common lock? Why, System.out.println itself! The println method is synchronized so both threads synchronize on the common lock of the System.out object.

This reminded me of a time when a colleague was wondering why adding System.out.println appeared to be altering the behaviour of a piece of code. Stumped, I just thought it might have introduced a timing issue to his multi-threaded code. Now I know better.

But it begs the question: how much code is out there that is not properly multi-threaded with only some ad hoc logging holding it together?

Tuesday, August 23, 2011

Optimising Collection Access

I had an interview question recently that asked me to write code to access as efficiently as possible a collection while multiple threads intensively read and write to it.

Giving the caveat that everything should be subject to stress tests anyway, I went ahead and used a java.util.ConcurrentHashMap. I knew this is a very efficient piece of code and I wouldn't have to worry about synchronization.

When I got home, I wondered about my answer and started to put together JMeter tests to establish some empirical data. JMeter provides a lot of the plumbing for stress tests that makes it superior to just writing your own class.

[Incidentally, if you do write your own executable classes for stress tests, don't forget to pause for 4 seconds before running your code. Any object created in the first 4s of the JVM's life is not eligible for an optimization called Biased Locking (see this email from a Sun engineer). Biased Locking is an optimization for the use case where a lock is mostly sought by just one thread. In this case, the favoured thread finds attaining the lock is cheap.]

First, if you wish to plug your code into JMeter, it's helpful to have a superclass that implements the JMeter interface:
package com.henryp.stress;


import org.apache.jmeter.protocol.java.sampler.JavaSamplerClient;
import org.apache.jmeter.protocol.java.sampler.JavaSamplerContext;
import org.apache.jmeter.samplers.SampleResult;

public abstract class AbstractSamplerClient implements JavaSamplerClient {

public SampleResult runTest(JavaSamplerContext javaSamplerContext) {
SampleResult result = new SampleResult();
result.sampleStart();

doSample();

result.sampleEnd();
result.setSuccessful(true);
result.setResponseCodeOK();
result.setResponseMessageOK();

return result;
}

protected abstract void doSample();

}
Then, subclass it. For my ConcurrentHashMap, it looks something like this:
package com.henryp.stress;


import java.util.Map;

import org.apache.jmeter.config.Arguments;

public abstract class AbstractMapSamplerClient extends AbstractSamplerClient {

protected static Map concurrentHashMap;
protected final MapPopulator mapPopulator = new MapPopulator();

@Override
public Arguments getDefaultParameters() {
return mapPopulator.getDefaultParameters();
}

@Override
protected void doSample() {
for (Map.Entry entry :
concurrentHashMap.entrySet()) {
// NoOp while I think of something to do here
}
}

@Override
public synchronized void setupTest(JavaSamplerContext javaSamplerContext) {
if (concurrentHashMap == null) {
System.out.println("setupTest: " + javaSamplerContext);

concurrentHashMap = mapPopulator.makePopulatedMap(javaSamplerContext);
}
}

@Override
public void teardownTest(JavaSamplerContext javaSamplerContext) {
concurrentHashMap = null;
}
}
The MapPopulator is just a utility that instantiates a Map of a user-defined type and populates it with a user-defined number of elements. Setting the default parameters in that class looks something like:
.

.
.
protected static final int DEFAULT_COLLECTION_SIZE = 100;
protected static final String COLLECTION_SIZE_KEY = "COLLECTION_SIZE_KEY";

public Arguments getDefaultParameters() {
System.out.println("getDefaultParameters");

Arguments arguments = new Arguments();

Argument argument = new Argument();
argument.setName(COLLECTION_SIZE_KEY);
argument.setValue("" + DEFAULT_COLLECTION_SIZE);
arguments.addArgument(argument );

return arguments;
}
.
.
.

To make JMeter pick up my classes, I put this in $JMETER_HOME/bin/jmeter.properties:

search_paths=/Users/henryp/Documents/workspace/TestMaven/target/TestMaven-0.0.1-SNAPSHOT.jar

This way, I only had to execute Maven to compile my code and rebuild the JAR. It would be nice to hot-deploy the code but I don't know if JMeter supports this. So, instead I have to restart JMeter each time I change the code :-(

My code that iterates over the entry set of a ConcurrentHashMap was producing a throughput of about 25 000/s when using 100 read-threads while another 10 write-threads added an element to the collection with a constant throughput of 100/s (I've not shown the code for this).

Not bad. But could it be improved easily? Let's try using a java.util.List (perhaps surprisingly, ArrayList and LinkedList gave me similar figures) and use read/write locks to synchronize access. So, the read-code looks something like this:
 static ReadWriteLock readWriteLock;


@Override
protected void doSample() {
Lock readLock = readWriteLock.readLock();
try {
readLock.lock();
for (String aValue : list) {
// no-op
}
} finally {
readLock.unlock();
}
}

and the write-code looks something like this:
 private final AtomicInteger callId = new AtomicInteger();

.
.
.
@Override
protected void doSample() {
Lock writeLock = readWriteLock.writeLock();
try {
writeLock.lock();
String aValue = this.toString() + callId.getAndIncrement();
SingleListReaderWithRWLocksSamplerClient.list.add(aValue);
} finally {
writeLock.unlock();
}
}

The subsequent JMeter test (100 read thread, 10 write threads with a throughput of 100/s) indicated that this was about twice as fast as using ConcurrentHashMap.

What's more, it seemed to suffer a far smaller degradation in performance when I increased the throughput of write-threads to 1000/s.

Conclusion: they're both pretty nifty and perform at the same order of magnitude. Your mileage may vary but it seems java.util.concurrent.locks.ReadWriteLock is a more efficient approach.

Friday, August 12, 2011

Tomcat/Maven plugin

My consultancy contract has come to an end so now I have time to attend my much neglected blog.

In the last few months, I have been Maven-izing and modularizing a legacy application. (Architectural tip: err on the side of being too modularized. It's much easier to fuse modules together than to split them.)

Anyway, one of the jobs was to improve the developer experience by having a web app running via Maven. The advantages include the project working straight out of the box with no awkward manual configuration. But it was only partially working with the Tomcat Maven Plugin. Some parts of the page were not loading because of NullPointerExceptions in our code. And yet, when I pointed a standalone version of Tomcat at the same exploded WAR, everything worked.

Using Firebug's debugger and single-stepping through Javascript code revealed nothing amiss. But stepping through code of the Tomcat embedded in the Maven plugin, I found this in org.apache.catalina.authenticator.AuthenticatorBase:

if (session != null && changeSessionIdOnAuthentication) {

Manager manager = request.getContext().getManager();

manager.changeSessionId(session);

request.changeSessionId(session.getId());

}

So, session IDs were changing on each call. Indeed, Firebug did show that each AJAX call had a different JSESSIONID.

Further investigation shows that the Maven Tomcat Plugin uses Tomcat 6.0.29 (you can tell this by looking at the plugin's pom.xml). Our non-Maven Tomcat was 6.0.14.

What could be different? The change of a patch version number suggests bug-fixes and not new functionality. And what is this changeSessionIdOnAuthentication field?

It happens that we had a bespoke security plugin which extends Tomcat's AuthenticatorBase and authenticates on every HTTP call. With the default value of changeSessionIdOnAuthentication being true, this was causing the session to be discarded on the 6.0.29 Tomcat instance and so all fields in the session object to be uninitialized, ergo the NullPointerExceptions.

The changeSessionIdOnAuthentication field was added some time between releases 6.0.14 and 6.0.29. According to the Tomcat documentation, it is used "to prevent session fixation attacks".

These attacks are when somebody sends you a link to a website for which you have an account. The link includes the JSESSIONID that the attacker has been given by the website. You enter your credentials and the HTTP session associated with this JSESSIONID now carries some flag to say you have logged in. After this, the person who sent you the link can go to the website and as long as he is using the same JSESSIONID as you, he too can see your account since the associated HTTP sessions says you are authenticated.

The changeSessionIdOnAuthentication field being set to true means that as soon as you log in, you get a new JSESSIONID and the attacker is foiled.

Since our developers don't need to worry about this, we turned the functionality off in the server.xml file that the Maven plugin is told to use with:

<host name="localhost"... >
.
.
.
<valve classname="our.implementation.of.AuthenticatorBase" changeSessionIdOnAuthentication="false"...

Thursday, August 4, 2011

In praise of checked exceptions

I am swimming against the tide but I quite like checked exceptions - when done properly. I find them self-documenting and they force engineers to consider the possibility that their request cannot be serviced.

My partner, who is a lawyer but a wannabe geek, laughs when I tell her that lots of the problems I encounter at work are because somebody did not consider a certain set of circumstances. She tells me that in law, it is a very poor contract that does not cover all eventualities.

When I discuss my unpopular view with other engineers, I often hear that if you can't do anything then you have no choice but to re-throw an Exception. This is true but people often underestimate the number of options still open to them upon catching one.

For example, in the event of a database deadlock, one of the threads will immediately be rolled back. The JDBC driver will throw an SQLException but then that thread can try again.

To be fair, Java's JDBC API has left something to be desired. For most JDBC drivers, an update that violates a constraint and is never going to succeed throws the same exception as one that was deadlocked and might work upon a retry. (A class was added to the SQLException hierarchy to address this - java.sql.SQLRecoverableException - but it only appeared as late as JDK 1.6).

I am content to adopt the house-style. If my client prefers RuntimeExceptions then so be it. But an area where I feel checked exceptions should be mandatory is when one team creates an API for another.

A problem I saw recently was when the core team wrote a Java API and published it via Java RMI. A second team called that API but upon encountering a non-deterministic deadlock was surprised when the JVM barfed with an Error. It transpired that DeadlockLoserDataAccessException was not in their classpath - a Spring class that was thrown when Spring converted the SQLException to something a little more descriptive.

With checked exceptions, APIs are not only somewhat self-documenting but better define what the client needs in its classpath.

This proposal is not a panacea - objects passed over the wire might reference non-standard implementations. For instance, objects re-hydrated with Hibernate might have Hibernate implementations of standard Java collection interfaces (you might want to look at Gilead to clean up the object graph in this case).

[Interestingly, the proposal to use Gilead on this particular project also prompted a discussion on what was reasonable tech debt. Not all debt is bad - a mortgage allowed me to buy my house. Only when it becomes unmanageable is it a problem. When the team who wrote the API that passed Hibernate re-hydrated objects across the wire first started coding, they were the only clients of it. Therefore, they considered the use of Gilead as unnecessary. "Do the simplest thing that works," they said and put Hibernate in their client's classpath. The problem became a little more intractable when a second team used the API since they used a different version of Hibernate.]

One last point: if you publish APIs and both you and your clients use Maven, you can have the pom.xml of the API module to pull in any necessary third-party libraries as transitive dependencies. However, do remember that not all teams use Maven. What's more, not all APIs use just Java.


Sunday, April 24, 2011

To Infinispan and Beyond!

I've been playing for the last few weeks with JBoss's Infinispan - a distributed cache and the successor to JBossCache.

It's a nice piece of technology. But for our needs, it may be inappropriate.

It may be OK for storing users' HTTP sessions and the like. But we were thinking about storing data for which integrity is absolutely essential (bond orders that run typically into the billions of dollars). Let me run through a small number of arguments that made me think Infinispan is not for us.

The Infinispan FAQ says it uses MVCC to optimize performance by having write locks not block read locks. However, dig deeper and you see this is not MVCC as most people who are familiar with Oracle and Postgres would know it. In these database implementations, a reading thread works with a snapshot of data. If a writing thread updates that row in the database, then that's OK. The reading thread continues with its snapshot. This is great since nobody is blocking anybody and it can improve performance.

However, with Infinispan, this is not quite what is happening. It's true that write threads do not block read threads. But the data in question may be mutable.

In the Infinispan mailing lists, we read:

"[I]n this case you may well see the change before [a write transaction] commits, we don't explicitly clone or copy mutable objects ... I suppose though we could add the ability to defensively copy mutable objects, but we'd need a way of knowing which are immutable, etc. Also, this would be more expensive, depending on the size of the atomic map."

A quick-and-dirty test I wrote demonstrates this:


package org.infinispan.replication;

import static org.testng.AssertJUnit.assertNull;

import java.util.concurrent.CountDownLatch;

import javax.transaction.HeuristicMixedException;
import javax.transaction.HeuristicRollbackException;
import javax.transaction.NotSupportedException;
import javax.transaction.RollbackException;
import javax.transaction.SystemException;
import javax.transaction.TransactionManager;

import org.infinispan.Cache;
import org.infinispan.config.Configuration;
import org.infinispan.test.MultipleCacheManagersTest;
import org.infinispan.test.TestingUtil;
import org.infinispan.test.AbstractCacheTest.CleanupPhase;
import org.testng.AssertJUnit;
import org.testng.annotations.Test;

@Test(groups = "functional", testName = "replication.SyncReplLockingAtomicityPHTest")
public class SyncReplLockingAtomicityPHTest extends MultipleCacheManagersTest {

private static final String CACHE_NAME = "testcache";

private static final String k = "key", v = "value";

public SyncReplLockingAtomicityPHTest() {
cleanup = CleanupPhase.AFTER_METHOD;
}

protected Configuration.CacheMode getCacheMode() {
return Configuration.CacheMode.REPL_SYNC;
}

protected void createCacheManagers() throws Throwable {
Configuration cfg = getDefaultClusteredConfig(getCacheMode(), true);
cfg.setLockAcquisitionTimeout(500);
createClusteredCaches(2, CACHE_NAME, cfg);
}

public void testUpdateWhileReadLock() throws Exception {
final Cache cache = cache(0, CACHE_NAME);
final CountDownLatch latchAfterRead = new CountDownLatch(1);
final CountDownLatch latchBeforeCommit = new CountDownLatch(1);
final ReaderThread readerRunnable = new ReaderThread(k, cache, latchAfterRead, latchBeforeCommit);
final UpdateThread updateRunnable = new UpdateThread(k, cache);
updateWhileReadTX(latchAfterRead, latchBeforeCommit, readerRunnable, updateRunnable);
}

private void updateWhileReadTX(
CountDownLatch latchAfterRead,
CountDownLatch latchBeforeReadCommit,
ReaderThread readerRunnable,
UpdateThread updateRunnable) throws SecurityException, IllegalStateException, RollbackException, HeuristicMixedException, HeuristicRollbackException, SystemException, NotSupportedException, InterruptedException {
Cache cache1 = cache(0, "testcache");
Cache cache2 = cache(1, "testcache");
assertClusterSize("Should only be 2 caches in the cluster!!!", 2);
assertNull("Should be null", cache1.get(k));
assertNull("Should be null", cache2.get(k));

final StringBuffer initialObj = populateCache(cache1);
final Thread readThread = new Thread(readerRunnable);
readThread.start();
latchAfterRead.await();
final StringBuffer readObject = readerRunnable.getFromCache();

final Thread updateThread = new Thread(updateRunnable);
updateThread.start();
updateThread.join();
final StringBuffer writeObject = updateRunnable.getValue();
final String writeSnapshot = writeObject.toString();
final String readSnapshot = readObject.toString();

latchBeforeReadCommit.countDown();

AssertJUnit.assertSame(readObject, writeObject);
AssertJUnit.assertFalse(readSnapshot.equals(writeSnapshot));
}

private StringBuffer populateCache(Cache cache) throws SecurityException, IllegalStateException, RollbackException, HeuristicMixedException, HeuristicRollbackException, SystemException, NotSupportedException {
StringBuffer mutableObject = new StringBuffer("test");

TransactionManager mgr = TestingUtil.getTransactionManager(cache);
mgr.begin();
cache.getAdvancedCache().lock(k);
cache.put(k, mutableObject);
mgr.commit();
return mutableObject;
}

private class UpdateThread implements Runnable {

protected final String key;
protected final Cache cache;
protected final TransactionManager mgr;
private StringBuffer fromCache;

public UpdateThread(String k, Cache cache) {
this.key = k;
this.cache = cache;
mgr = TestingUtil.getTransactionManager(cache);
}

@Override
public void run() {
try {
mgr.begin();
getValueFromCache();
fromCache.append(System.currentTimeMillis());
cache.put(k, fromCache);
} catch (Exception x) {
x.printStackTrace();
} finally {
finishTx();
}

}

private void finishTx() {
try {
mgr.commit();
} catch (Exception x) {
x.printStackTrace();
}
}

private void getValueFromCache() {
fromCache = (StringBuffer) cache.get(key);
}

public StringBuffer getValue() {
return fromCache;
}

}


private class ReaderThread implements Runnable {

private final String key;
private final Cache cache;
private final CountDownLatch latchAfterRead, latchBeforeCommit;

private StringBuffer fromCache;

public ReaderThread(String k, Cache cache,
CountDownLatch latchAfterRead, CountDownLatch latchBeforeCommit) {
this.key = k;
this.cache = cache;
this.latchAfterRead = latchAfterRead;
this.latchBeforeCommit = latchBeforeCommit;
}

public StringBuffer getFromCache() {
return fromCache;
}

@Override
public void run() {
TransactionManager mgr = TestingUtil.getTransactionManager(cache);
try {
mgr.begin();
fromCache = (StringBuffer) cache.get(key);
latchAfterRead.countDown();
} catch (Exception x) {
x.printStackTrace();
} finally {
try {
latchBeforeCommit.await();
mgr.commit();
} catch (Exception x) {
x.printStackTrace();
}
}
}

}
}



The line in bold is where the test fails. The snapshot the read-thread has is the same as the write-thread, even though it read the object from the cache first. This is simply due to the fact that both read- and write-threads have the same object reference.

Note, if the write operations were rolled back, the read-thread would still see the change it had made. In fact, the whole cache from this point on would hold incorrect data.

This is not dissimilar to Hibernate updating an object (eg, assigning a primary key or a timestamp) even when the DB transaction it was in was rolled back. I've seen this happen when there were DB deadlocks and my transaction was chosen as the victim.

There are many other reasons why I think business critical data should not be put into Infinispan. I hope to present them in upcoming blog entries.

Useful SQL

Imagine an audit table of deal histories where every change to the deal means one more row in the table for a given deal ID.

How do you find the deals with the first audit time within a given time window?

[I'm posting this on my blog because I come across this problem quite often but never seem to remember the solution. By putting it here, I can copy-and-paste it in the future :) ]

So, imagine the audit table for deal_history looks a little like this:

+----+----------------+----------------+
|name|last_update_time|last_update_user|...

+----+----------------+----------------+

Such that the data looks like:

+-------+-------------------+----+--
|deal #1|03/14/2011 14:00:00
|Tom |...
+-------+-------------------+----+--
|deal #1|03/14/2011 15:33:33
|Dick|...
+-------+-------------------+----+--
|deal #2|03/14/2011 16:22:22
|Tom |...
+-------+-------------------+----+--
| ... ... ..
+-------+-------------------+----+--

The trick is to select over the same table twice thus:

select d1.last_update_time, d1.name, d1.last_update_user
from deal_history d1 , deal_history d2

where d1.last_update_time >= '03/14/2011 14:00:00'
and d1.last_update_time <= '03/21/2011 15:00:00'
and d1.deal_id = d2.deal_id
group by d1.last_update_time, d1.name , d1.last_update_user
having d1.last_update_time = min(d2.last_update_time)


Making sure the unique identifier for a give deal is equal in both sets of data (note: in our example, this is not a PK since there are many rows per deal ID).

Also, note that the group by clause has fields in the same order and that the having clause must follow the group by.

Three reasons to avoid Subversion

I've been using CVS for most of the last 15 years. I came to the Subversion party late. Despite all my friends saying how wonderful it was, I didn't use it professionally until 2009.

Initially, I was reasonably happy with it but over time I have come to see the wisdom of Linus Torvalds statement that Subversiom is "the most pointless project ever started."

Reason #1
The integration with Eclipse is patchy. No matter how great the tool, if the integration with your IDE of choice is buggy then avoid it unless absolutely necessary.

Eclipse sometimes says I have checked everything in when I haven't. It says I have sometimes checked everything out when I haven't. It can get confused during merges - especially if you have been moving files around. And this seems to be the case over a number of versions of Eclipse I have used in the last 2 years (most recently Helios) using both Subversive and Subclipse plugins. A few members of my team and I have been forced to avoid Eclipse altogether and use the Tortoise Subversion plugin.

When Subversion gets its knickers in a twist, it's harder to manually edit the metadata files in the .svn directory since they appear to be binaries. With CVS, they were plain text files. Of course, it's always nasty to do this. But sometimes, you just gotta.

Reason #2
Subversion can have old JARs stored in the .svn directories. As an example, I picked a random project from my hard drive (Gilead) and looked in the .svn directory.

phillip:1.3.1-SNAPSHOT henryp$ pwd
/Users/henryp/Documents/Code/gilead/gilead/maven-snapshots-repo/net/sf/gilead/comet4gwt/1.3.1-SNAPSHOT
phillip:1.3.1-SNAPSHOT henryp$ jar -tf .svn/text-base/comet4gwt-1.3.1-20100205.214253-2.jar.svn-base | more
META-INF/
META-INF/MANIFEST.MF
net/
net/sf/
net/sf/gilead/
net/sf/gilead/annotations/
net/sf/gilead/comet/
net/sf/gilead/exception/
net/sf/gilead/annotations/Comet.class
.
.
.


So, a living, breathing JAR in the metadata directory!

You might ask: "so what?". Well, this caused a lot of confusion on a project that loaded all its JARs into the classpath using a shell script that used a wild card. (Yes, I know that is bad practise in these days of OSGi et al but bad practices are unfortunately part of life and probably will continue to be until we are a regulated profession...) For a few hours, we were wondering why classes from a library that had long since been removed were showing up in our code.

Reason #3
When a friend breathlessly told me about Subversion, he said you could move files around and Subversion would understand. Poor CVS simply saw it as one file being deleted and another being created.

Well, this is not the entire story. If you move a file, the change history can move with it and that's about it.

“A lesser-known fact about Subversion is that it lacks “true renames”—the svn move command is nothing more than an aggregation of svn copy and svn delete.”
Version Control with Subversion (For Subversion 1.5, r3305) p106

This may be tolerable for just one or two classes. But, let's say you need to do a major refactoring. I'm sure you've been here before:

"Files and directories are shuffled around and renamed, often causing great disruption to everyone on the project. Sound like the perfect case for a branch, doesn't it? Just create a branch, shuffle things around, then merge the branch back into trunk, right?

"Alas, this scenario doesn't work so well right now and is considered one of Subversion's current weak spots."
Version Control with Subversion

“The moral of this story is that until Subversion improves, be very careful about merging copies and renames from one branch to another.”
Version Control with Subversion (For Subversion 1.5, r3305), p107

I found during a recent mass-refactor that if I moved a file then moved the file again (having not checked in until I was satisfied), Subversion would simply panic.

Conclusion
Subversion will not fulfill the dream of having many branches where you can refactor to your heart's content and merge later. If you want that, try another tool (I'm starting to get interested in Git).

The bottom line is that Subversion is only useful if you have one branch that's in production and one development branch. This is OK for many projects. But if you are working on a project that has teams from New York to London to Bangalore working on different product lines as I am currently, I'd strongly recommend that you avoid it.

Sunday, February 13, 2011

It's the environment, stupid!

During our recent release, an upgraded version of a library now pulled in MINA as a transitive dependency. Although there were no problems in our Linux QA environment, the Solaris QA environment just wouldn't work. There were no error messages, just our application saying that it was constantly trying to connect to another tier.

When the exact same code works in one environment but not in another, the problem simply has to be environmental. But proving it to your colleagues (who think the issue is with MINA) can be hard and finding what the exact problem is even harder.

If it's environmental, then we have to step away from Java (you can never live totally inside the virtual machine...) and look at what the actual machine is doing.

Let's write a test:
package com.henryp.nio;

import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import java.util.concurrent.CountDownLatch;

import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;


public class SocketNioTest {

private static final int SERVER_PORT = 9999;
private static ServerSocket serverSocket;

@BeforeClass
public static void setUp() throws IOException {
serverSocket = new ServerSocket(SERVER_PORT);
Thread thread = new Thread(new Runnable() {

public void run() {
while (true) {
try {
Socket accept = serverSocket.accept();
System.out.println("client connected");
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}

});
}

@AfterClass
public static void tearDown() throws IOException {
serverSocket.close();
}

@Test
public void testNio() throws IOException, InterruptedException {
SocketChannel socketChannel = SocketChannel.open();

System.out.println("getLocalHost...");
InetAddress inetAddress = InetAddress.getLocalHost() ;
System.out.println("binding...");
SocketAddress bindpoint = new InetSocketAddress(inetAddress, SERVER_PORT);
System.out.println("connecting...");
socketChannel.socket().connect(bindpoint );

sendRandomData(socketChannel);

System.out.println("closing...");
socketChannel.socket().close();
}

private void sendRandomData(SocketChannel socketChannel) throws IOException {
ByteBuffer byteBuffer = ByteBuffer.allocate(256);
socketChannel.write(byteBuffer);
}

}

OK, it's not a real unit test (as it can't fail) but JUnit makes a nice test harness. Let's put some break points in and run tcpdump.

[root@vmwareFedoraII henryp]# tcpdump -i lo port 9999

Note that you need to be root to do this. Also note that you need to monitor the loopback interface. This is a very useful feature of Linux as you can then monitor packets that are passed around within the same process - packets that never actually leave the network card.

Now this line:

socketChannel.socket().connect(bindpoint );

causes tcpdump to print:

21:09:59.414367 IP localhost.localdomain.44985 > localhost.localdomain.distinct: Flags [S], seq 625354929, win 32792, options [mss 16396,sackOK,TS val 22409432 ecr 0,nop,wscale 6], length 0

21:09:59.414409 IP localhost.localdomain.distinct > localhost.localdomain.44985: Flags [S.], seq 630583436, ack 625354930, win 32768, options [mss 16396,sackOK,TS val 22409432 ecr 22409432,nop,wscale 6], length 0

21:09:59.414420 IP localhost.localdomain.44985 > localhost.localdomain.distinct: Flags [.], ack 1, win 513, options [nop,nop,TS val 22409432 ecr 22409432], length 0


This is the famous three-way handshake (SYN, SYN/ACK, ACK).

  1. It's saying that our client (that happens to be using port 44985 this is random) sends a SYN packet (Flags [S]) to the server socket with sequence number 625354929.
  2. The server responds with a a SYN/ACK (Flags [S] ... ack) with sequence number 630583436 acknowledging 625354930 (the clients sequence plus 1)
  3. Then the client acknowledges receipt of this with an ACK (ack).

This is a healthy normal exchange. Note that this is only one line of Java code. Even if you single step into the Java source code, you'll see just all this chatter taking place in one native call.

Now, we send some random data and it is this line:

socketChannel.write(byteBuffer);

that produces this output in tcpdump:


21:12:21.380171 IP localhost.localdomain.44985 > localhost.localdomain.distinct: Flags [P.], seq 1:257, ack 1, win 513, options [nop,nop,TS val 22551398 ecr 22409432], length 256

21:12:21.380220 IP localhost.localdomain.distinct > localhost.localdomain.44985: Flags [.], ack 257, win 529, options [nop,nop,TS val 22551398 ecr 22551398], length 0


This is the client sending a PUSH packet (Flags [P.]) of 256 bytes (remember our ByteBuffer.allocate(256)) and the server acknowledging receipt (ack).

seq 1:257 refers to the sequence number for this packet (relative to the first sequence number 625354929). If we drop the stack frame and run the lines in this method again, we see:

21:16:02.165005 IP localhost.localdomain.44985 > localhost.localdomain.distinct: Flags [P.], seq 257:513, ack 1, win 513, options [nop,nop,TS val 22772183 ecr 22551398], length 256

21:16:02.165014 IP localhost.localdomain.distinct > localhost.localdomain.44985: Flags [.], ack 513, win 546, options [nop,nop,TS val 22772183 ecr 22772183], length 0


and the sequence number for the client has moved to seq 257:513.

And finally:

socketChannel.socket().close();

produces:

21:21:48.188898 IP localhost.localdomain.44985 > localhost.localdomain.distinct: Flags [F.], seq 513, ack 1, win 513, options [nop,nop,TS val 23118207 ecr 22772183], length 0

21:21:48.228956 IP localhost.localdomain.distinct > localhost.localdomain.44985: Flags [.], ack 514, win 546, options [nop,nop,TS val 23118247 ecr 23118207], length 0


where the client says it's FINished and the server ACKnowledges this.

Actually, this is not the entire story. Look what happens when the JVM closes down:

21:26:26.173044 IP localhost.localdomain.distinct > localhost.localdomain.44985: Flags [R.], seq 1, ack 514, win 546, options [nop,nop,TS val 23396191 ecr 23118207], length 0

This is the client sending the RST packet. This says: whatever data is currently being processed in the TCP/IP stack, forget it. The JVM/OS is telling the server this to save it from worrying any more about the sockets that have been opened. Apparently, firewalls use these types of packets to stop people from connecting.

So, FINishing the TCP/IP is not sufficient to close the socket. In fact, if we run the test again until after we have closed the socket but before we close the JVM, we see something like this using netstat:


[henryp@vmwareFedoraII ~]$ netstat -nap | grep 9999
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 :::9999 :::* LISTEN 8099/java
tcp 257 0 ::ffff:127.0.0.1:9999 ::ffff:127.0.0.1:52088 CLOSE_WAIT -
tcp 0 0 ::ffff:127.0.0.1:52088 ::ffff:127.0.0.1:9999 FIN_WAIT2 -


And this state can stay around for some time (typically of the order of 10s of seconds, depending on the OS). Note that it's also disconnected from the Java process at this point as it appears to be totally under the control of the OS.

With RST, there is no waiting. And this is what brings me to what we were seeing in our QA environment. But the RST was being sent when the JVM was still alive and during the three-way handshake (SYN, SYN/ACK, RST... WTF??). This infers that the problem could not possibly be Java code since this is far too low a level.

It's easy to blame the new NIO/MINA as that was the only thing new in our stack but since Linux was handling it fine, we set the System Administrators looking at this Solaris box to see what could be wrong. Finally, by comparing all environment files with that of root who could run our code, we found that for the chroot-ed user under which our code runs could not write to /dev/poll. I don't pretend to be a Solaris expert but looking at this link:

"The /dev/poll pseudo-device [is] maintaining the full set of file descriptors being polled in the kernel."

Which sounds a little like (OK, I'm clutching at straws) the opening page on the chapter on Selectors in Java NIO:

"multiplexing make it possible for a single thread to efficiently manage many I/O channels simultaneously. C/C++ coders have had the POSIX select( ) and/or poll( ) system calls in their toolbox for many years."

Now, I know my MINA code is using selectors, so I am assuming that this asks the operating system to add an entry to /dev/poll. Since the thread finds this unexpectedly read-only (a mistake by the admin setting up the chroot jail) things go horribly wrong and a RST is sent just as we're connecting.

Whether this is true or not, chmoding /dev/poll to being read/write for this user solved the problem.

Poisoned Ivy

Stress tests are not just for your code. They're for the full stack.

We recently updated our build to use Ivy. This is no small piece of work and (as we discovered) can be prone to error.

Since we work in an investment bank, we're encouraged to use internal Ivy repositories. The developer tasked with this work could not find the JBossCache library we were using (1.2.3 - yes, the code really is that old) and so used version 1.2.2. He regarded this downgrade as the lesser of the evils as transitive dependency management seemed more important than a small difference in the revision number of a library. What could possibly go wrong?

We passed the QA process and deployed to production. A few days later, the app became unresponsive and kept puking this error:

org.jboss.cache.lock.TimeoutException: write lock for /tranchebook could not be acquired after 10000 ms

Why had this cut of the code passed QA?

I ran a soak test and within half an hour saw the same problem. So, I changed the version of the JBossCache back to 1.2.3 (which is what we have been using these last few years). After an hour, I still had not seen this exception. Cue a hasty, overnight deployment to production.

Looking at the JBossCache code, I saw the problem. If you put a break point in org.jboss.cache.interceptors.UnlockInterceptor in the invoke(MethodCall m) method at line:

List locks=(List)lock_table.get(Thread.currentThread());


With jboss-cache-1.2.3.jar, you’ll see that lock_table is a Collections$SynchronizedMap.

With jboss-cache-1.2.2.jar, you’ll see that lock_table is an implementation of HashMap – bad since neither this method nor the methods calling it hold a lock.

It’s easy to see that if multiple threads are changing a HashMap, we’re going to get non-deterministic behaviour (from the JavaDocs: "Note that this implementation is not synchronized. If multiple threads access this map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally.").


Version Number Methodology

Version numbers are less arbitrary than most developers imagine. Pressman says:

"Configuration object 1.0 undergoes revision and becomes object 1.1. Minor corrections and changes result in versions 1.1.1 and 1.1.2, which is followed by a major update that is object 1.2. The evolution of object 1.0 continues through 1.3 and 1.4, but at the same time, a major modification to the object results in a new evolutionary path, version 2.0."

(Software Engineering: A Practitioner's Approach, 4th Edition, p233.)

Exact convention is moot but a popular one is:

Major.Minor.Revision.Build

This is what Maven uses [1] , is similar to what GNU [2] uses and (apart from the use of an underscore) what Sun uses:

phillip:~ henryp$ java -version
java version "1.6.0_22"


Cleverly, Sun let their marketing people use the minor number in "soft" literature - hence the reason we often call JDK 1.5, Java 5.0 etc. Apparently, they did this for Solaris also.

The beauty of this system is that it helps me with compatibility. For instance, when moving from a wildcard based classpath to an Ivy managed classpath, we noticed that something like Xalan 2.7.1 and Xalan 2.1.0 were both being pulled in as transitive dependencies for other libraries we're using. Since we 're not using OSGi, we can't use both and as it happened Xalan 2.1.0 was appearing first in the classpath. This was causing a problem since another library needed methods in Xalan 2.7.1. What to do?

Since both Xalan's were 2.x, we could be fairly sure that something written for version 2.1.0 will work with version 2.7.1 since they share the same major number. Note: the opposite is not true - something written for 2.7.1 has no guarantee that it will work when running against 2.1.0. So, we removed 2.1.0 by hand confident that 2.7.1 would serve both libraries. So far so good.

This is similar to the JDK convention. Java compiled with JDK 1.0 will play nicely with JDK 1.6. Similarly, I don't expect massive differences between JDK 1.6.0_22 and 1.6.0_5 except bug fixes and performance improvements (see my previous post). Presumably, the JDK will only move to verison 2.x.y_z when deprecated methods like Thread.stop are removed as this constitutes an API change [3].

[1] http://mojo.codehaus.org/versions-maven-plugin/version-rules.html
[2] http://www.advogato.org/article/40.html
[3] http://www.javabeat.net/articles/101-introduction-to-java-module-system-in-java-70-2.html

Sunday, January 23, 2011

Capital Markets Business Terminology

OK, this isn't tech. I'm writing this so I remember in a few years what it is I did. This follows chats with the guys closes to the business.

Equity Capital Markets

Ways to raise money and the stages:

Private placement
  • Pre-IPO
  • Less than 500 investors in accord with SEC rules.
IPO
  • Registering on an exchange.
  • Publicly tradable.
Green shoe option
  • The right of the underwriter(s) to sell additional stock.
  • Stabilizes the price.
Follow on
  • Post-IPO when the company still has equity to sell.
  • Non-publicly traded.
Rights Issue
  • Diluting stock.
  • Current stock-holder have the right to buy more.
  • They can sell this right.
Miscellaneous:

Pot
  • All orders controlled by the syndicate
Retention
  • Orders controlled by any one bank within the syndicate.
Block trader
  • Very secretive.
  • Relies heavily on network of clients.
  • Operates outside of market hours so as not to spook the market.
  • Deals with large amounts (~1% of company).
  • Offers a sizable discount on market price.
  • Essentially selling "warrants". That is to say, an agreement to exchange the equity through the exchange when the market next opens.
  • Doesn't want the salespeople to know anything!

Debt Capital Markets

There are two types of salesperson:

Pot
  • Institutional investors share of deal.
  • ~98% of demand.
PWM (Private Wealth Management)
  • Retail investors.
  • More legal restriction involving selling to these investors (eg, conflicts of interest).
Takedown
  • Investment bank buys the bond at booking time and sells slices of it.
  • This is iff the investment bank is handling the booking and delivering.
Switch
  • Alternative form of payment
Hedge
  • Unrelated bond from a different issuer.
Soft Allocation
  • What a bank would like a syndicate member to get.
  • This is applicable when said bank is not the book runner.
Duration Manager
  • Very desirable role for an investment bank.
  • Can see the hedges and swaps of other counter-parties. This is valuable market data.
  • Role is assigned by the issuer.

Badly tuned Hibernate queries gobbling tempdb space

Due to some poorly performing HQL, our Sybase server (12.5) was running out of tempdb space.

Debugging the issue has been something of a problem because of the way Sybase manages this odd table.

What is tempdb?

"tempdb needs to be big enough to handle the following processes for every concurrent Adaptive Server user:
  • Worktables for merge joins
  • Worktables that are created for distinct, group by, and order by, for reformatting, and for the OR strategy, and for materializing some views and subqueries
.
.
.


From the Sybase manual.

Since our HQL was using the distinct keyword, we were filling up the tempdb table.

A naive way of analyzing which query was taking up all this space is offered by Sybase themselves here. Basically, the idea is to drop the distinct keyword from your query and select the results into a temp table. For example:
select city
into #tempcity
from authors
"You could use this query to create a temporary table, and then use sp_spaceused," they advise. That is:

use tempdb

go

sp_spaceused


What I had difficulty finding was that this is not the whole story. It will only tell you the size of the data retrieved and not the whole amount of tempdb used. For instance, you might only get 10 rows of just a single integer each if you performed a distinct select on an ID. But if the same query has lots of complicated joins, you're actually using up a lot more tempdb space than this as the DB builds worktables with which to process the join relationships.

Our DBA sent me this email:

"Definitely the number of pages read in a single scan of a worktable is an absolute indicator of the size of that table (since the worktable is created on the fly and has n pages that are scanned once). The dataserver will have a 2K page size which means the table is n x 2Kb in size."

So, one way to find the size of these queries is to enable resource limits and see at what point your query breaks. For instance:

sp_configure "allow resource limits"
go
sp_add_resource_limit sa, NULL, 'at all times', tempdb_space, 5, 2, 3, 1

go

This sp_add_resource_limit stored procedure is saying for user sa connecting via an undefined application (NULL), at all times, limit the space in tempdb to 5 pages. In the event of any limit being exceeded (2), abort the transaction (3) for that query (1). See here for the full details of how to execute this SP.

Then, execute your query and see if it fails because of the limited resource. You may see something like this:

com.sybase.jdbc3.jdbc.SybSQLException: Exceeded tempdb space limit of 5 pages.

SQLWarning: ErrorCode: 3618 SQLState: 01ZZZ --- Transaction has been aborted.
Query 1 of 1, Rows read: 0, Elapsed time (seconds) - Total: 0.214, SQL query: 0.214, Building output: 0


Even a query that returns no rows whatsoever but that does a lot of joining may hit this limit.

By the way, when you're finished, run:

sp_drop_resource_limit sa, NULL
go

Piper at the Gates of Dawn

What's the best way to run integration tests for separate, discrete services?

In an attempt to decouple the stack, a service has been pulled out into a separate project. To run the integration tests of this discrete project, the stack needs to be brought up. We'd like to automate this so we used Java's ProcessBuilder thus:

ProcessBuilder processBuilder = new ProcessBuilder(command);

Process process = processBuilder.start();

for each module that needs to start up. We'd wait for some character sequence like "Module XXX started!" before starting the next.

All was going well until somebody changed the logging file in the module that we start up this way. The (RMI) thread that serviced our request was hanging and we saw this in JConsole:

Name: RMI TCP Connection(2)-154.1.225.59

State: RUNNABLE

Total blocked: 0 Total waited: 0

Stack trace:

java.io.FileOutputStream.writeBytes(Native Method)

java.io.FileOutputStream.write(FileOutputStream.java:260)

java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)

- locked java.io.BufferedOutputStream@cf0384

java.io.PrintStream.write(PrintStream.java:430)

- locked java.io.PrintStream@efe617

sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202)

sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:263)

sun.nio.cs.StreamEncoder.write(StreamEncoder.java:106)

- locked java.io.OutputStreamWriter@1eb1670

java.io.OutputStreamWriter.write(OutputStreamWriter.java:190)

java.io.BufferedWriter.flushBuffer(BufferedWriter.java:111)

- locked java.io.OutputStreamWriter@1eb1670

java.io.PrintStream.write(PrintStream.java:476)

- locked java.io.PrintStream@efe617

java.io.PrintStream.print(PrintStream.java:619)

java.io.PrintStream.println(PrintStream.java:756)

- locked java.io.PrintStream@efe617

org.hibernate.jdbc.AbstractBatcher.log(AbstractBatcher.java:396)

In this version of Hibernate (3.2.6) this line was just a System.out.println call. Why on earth would this cause the thread to block forever?

The thread was stuck in native code so was there something wrong with my JVM (1.6.0_20 on Windows)? Would looking at file handles show some sort of contention? Would it be OK on a *Nix machine?

Putting a breakpoint in this Hibernate code only confused me further. The first time Hibernate hits this System.out.println call, it is executed without problem. It's only the second time that the thread hangs forever.

Whenever something works fine for a while then stops working for good suggests a resource is being exhausted. It took me some Googling before realizing that the pipe's buffers were filling up - but not until the second execution of System.out.println.

When you start another process via the JVM, you can't then ignore it. It may send data to an output stream (for example, just printing to the console). Some thread needs to drain this stream even if it does nothing with the data - the notion of a StreamGobbler.

Obvious in retrospect but not immediately apparent when you forget and you're confronted by this behaviour.

Tuesday, January 18, 2011

Java producing native code

I rather like this.

Watch Java convert byte code to native code by downloading JDK 7 from here. Make sure you download the DEBUG file.

Then, run your code with something like:

/home/henryp/Tools/JDK_1_7_0_debug/jdk1.7.0/fastdebug/bin/java -XX:+PrintOptoAssembly -XX:CompileThreshold=5 -server -cp ./bin com.henryp.lang.ThreadWaitingMain

And watch glorious native code being generated.

The magic is in the XX:+PrintOptoAssembly flag. You also need -server but the -XX:CompileThreshold=5 is optional. It just says how many times the code needs to be executed before the JIT compiler kicks in.

You can see the atomic Java classes (eg, AtomicInteger) being reduced to little more than cmpxchg instructions (on x86 architectures at least) since they use compare-and-swap semantics of the underlying hardware to achieve their ends. See this excellent article for more information.

Sunday, January 9, 2011

JVM optimization

Want a quick and easy way to make your Java apps run faster? Upgrade your JVM.

Last year, I was using JDK 1.6.0_05 to run our app. I monitored it with YourKit and saw my application threads being blocked by the Reference Handler thread on the monitor of class java.lang.ref.Reference$Lock.

Upon upgrading to JDK 1.6.0_20, there was none of this particular contention. What's more, the app was running about 25% faster without me doing anything!

So, what was causing this contention?

In one sample, the JVM ran for less than 9 minutes and was only doing something “interesting” for about 2 of those yet the total time spent contending for this lock was about 111 seconds. This time was spread across roughly 50-60 threads and YourKit was saying that there are about 50 different methods that appear to be attempting to attain the lock. Of the handful of methods I looked at, I couldn’t see anything in the code that was obviously suspicious. I mean, it was literally blocking in a simple get method!

Let's see what happens when the JDK blocks. Let's take this code:



package com.henryp.lang;

public class ThreadWaitingMain {

public static void main(String[] args) {
final Object lock = new Object();
System.out.println("\nlock.hashCode = " + Integer.toHexString(lock.hashCode()) + "(" + lock.hashCode() + ")"); //

Thread firstThread = new Thread(createLockContender(lock));
firstThread.setDaemon(false);
firstThread.setName("Thread no. 1");
firstThread.start();


Thread otherThread = new Thread(createLockContender(lock));
otherThread.setDaemon(false);
otherThread.setName("Thread no. 2");
otherThread.start();

System.out.println("Main thread finishing");
}

private static Runnable createLockContender(final Object lock) {
return new Runnable() {

@Override
public void run() {
while (true) {
synchronized (lock) {
try {

System.out.println(Thread.currentThread().getId() + ":" + Thread.currentThread().getName() + ": Have lock. About to sleep.");
Thread.currentThread().sleep(1000L);
lock.notify();
System.out.println(Thread.currentThread().getId() + ":" + Thread.currentThread().getName() + ": Relinquishing lock...");
lock.wait();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
}

};
}

}



Let's run it.

lock.hashCode = 1172e08(18296328)
Main thread finishing

.
.
.


Which PID is it?

[henryp@vmwareFedoraII Test]$ ps aux | grep java
henryp 12148 0.9 0.9 399372 10028 pts/6 Sl+ 22:13 0:00 /usr/java/jdk1.6.0_18/bin/java -server -Dfile.encoding=UTF-8 -classpath /home/henryp/workspace/Test/bin com.henryp.lang.ThreadWaitingMain
henryp 12168 0.0 0.0 4204 704 pts/3 S+ 22:14 0:00 grep java


OK, it's 12148.

By the way, running:

[henryp@vmwareFedoraII Test]$ ps -eLf | grep java

gives us not just the processes but all of the threads:

henryp 12148 11567 12148 0 12 22:13 pts/6 00:00:00 /usr/java/jdk1.6.0_18/bin/java -server -Dfile.encoding=UTF-8 -classpath /home/henryp/workspace/Test/bin com.henryp.lang.ThreadWaitingMain
henryp 12148 11567 12154 0 12 22:13 pts/6 00:00:00 /usr/java/jdk1.6.0_18/bin/java -server -Dfile.encoding=UTF-8 -classpath /home/henryp/workspace/Test/bin com.henryp.lang.ThreadWaitingMain
henryp 12148 11567 12155 0 12 22:13 pts/6 00:00:00 /usr/java/jdk1.6.0_18/bin/java -server -Dfile.encoding=UTF-8 -classpath /home/henryp/workspace/Test/bin com.henryp.lang.ThreadWaitingMain

.
.
.

But which ones are ours?

[henryp@vmwareFedoraII Test]$ jstack 12148 | grep "Thread no."
"Thread no. 2" prio=10 tid=0xb6a7d400 nid=0x2f84 waiting on condition [0x9f388000]
"Thread no. 1" prio=10 tid=0xb6a7bc00 nid=0x2f83 in Object.wait() [0x9f3d9000]


Very useful is our old friend jstack. Right, let's convert those hex values into decimal:

[henryp@vmwareFedoraII Test]$ printf "%d\n" 0x2f84
12164
[henryp@vmwareFedoraII Test]$ printf "%d\n" 0x2f83
12163


Now, let's see which Linux kernel commands are being called by these threads:

[henryp@vmwareFedoraII openjdk7]$ strace -p 12163 > ~/strace_12163.txt 2>&1 &
[henryp@vmwareFedoraII openjdk7]$ strace -p 12164 > ~/strace_12164.txt 2>&1 &


Let's tail one of these:

[henryp@vmwareFedoraII ~]$ tail -f strace_12164.txt
clock_gettime(CLOCK_MONOTONIC, {16852, 857324137}) = 0
clock_gettime(CLOCK_MONOTONIC, {16852, 857391406}) = 0
gettimeofday({1294611927, 576256}, NULL) = 0
clock_gettime(CLOCK_REALTIME, {1294611927, 576322753}) = 0
futex(0xb6a7e9a4, FUTEX_WAIT_PRIVATE, 1, {0, 999933247}) = -1 ETIMEDOUT (Connection timed out)
futex(0xb6a7e228, FUTEX_WAKE_PRIVATE, 1) = 0
clock_gettime(CLOCK_MONOTONIC, {16853, 859389474}) = 0
futex(0x9f80090c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x9f800908, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0xb6a7c828, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xb6a7e96c, FUTEX_WAIT_PRIVATE, 689, NULL) = 0
futex(0xb6a7e028, FUTEX_WAIT_PRIVATE, 2, NULL) = 0
futex(0xb6a7e028, FUTEX_WAKE_PRIVATE, 1) = 0


And it's this futex - "fast userspace mutex" - that we're interested in. It's how Linux does it's locking.

But that's where the story ends as I am not a Linux guru and need to do more research. So, I'm afraid this is an incomplete post. I wrote it to keep a record of what I am working on in my spare time.