More production issues with Jetty: this time we were hit by this bug, a deadlock in the web server.
While diagnosing similar bugs, I wondered when a connection becomes associated with the Java server process. This is not as simple a question as it sounds. A client can complete a TCP connection to a listening server socket, but Linux (or at least netstat) won't show the connection as belonging to the process until the process has accepted it.
Take this code:
SocketChannel channel = ((ServerSocketChannel)key.channel()).accept();
Now put a breakpoint on this line, run the server and connect to it, for example with telnet:
mds@gbl04214[Skye]:~> !telnet
telnet 128.162.27.126 10110
Trying 128.162.27.126...
Connected to 128.162.27.126.
Escape character is '^]'.
This is a load of bunkum
You'll see the breakpoint gets hit. With the debugger still paused there (so accept has not yet been called), netstat shows:
> netstat -nap 2>/dev/null | grep 10110
tcp 0 0 128.162.27.126:10110 :::* LISTEN 5711/java
tcp 26 0 128.162.27.126:10110 128.162.27.125:55658 ESTABLISHED -
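(Note that netstat reports 26 bytes in the receive queue waiting to be read: my message, This is a load of bunkum, plus the two-byte line ending that telnet appends. Note also that the PID column for the connection shows only a dash.)
Step over the breakpoint (i.e. accept has now been called) and we see: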
> netstat -nap 2>/dev/null | grep 10110
tcp 0 0 128.162.27.126:10110 :::* LISTEN 5711/java
tcp 26 0 128.162.27.126:10110 128.162.27.125:55658 ESTABLISHED 5711/java
Now the OS associates the connection with our process.
So, if your server goes mad and stops accepting connections, it may seem as if the connections backing up are not making it to your process, since netstat shows no owning PID for them. This is wrong: the kernel has already completed the handshake on behalf of the listening socket and is queueing the connections (and any data sent on them) until the process calls accept.
As an addendum, if you do the same thing with netcat rather than telnet, the socket state is CLOSE_WAIT rather than ESTABLISHED. Netcat has apparently terminated and sent a FIN packet, but the server-side socket won't leave this state until the application closes its end of the connection.
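To see the same behaviour without a debugger, here is a minimal sketch in plain Java NIO. It assumes a loopback connection on an OS-assigned port, and the class and method names (PendingConnectionDemo, sendBeforeAccept) are made up for illustration; they are not from Jetty or from the server above. The client connects and writes before the server has called accept, and the server then accepts and reads what was queued.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

public class PendingConnectionDemo {

    // Hypothetical helper: connect and write BEFORE accept() is called,
    // then accept and return whatever the kernel had queued for us.
    public static String sendBeforeAccept(String message) throws Exception {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);

        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 asks the OS for any free port on the loopback interface.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));

            // The connect completes even though nobody has accepted yet:
            // the kernel finishes the handshake for the listening socket.
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                ByteBuffer out = ByteBuffer.wrap(payload);
                while (out.hasRemaining()) {
                    client.write(out);
                }

                // Only now does the process take ownership of the connection.
                try (SocketChannel accepted = server.accept()) {
                    ByteBuffer in = ByteBuffer.allocate(payload.length);
                    while (in.hasRemaining() && accepted.read(in) != -1) {
                        // keep reading until the whole message has arrived
                    }
                    return new String(in.array(), 0, in.position(), StandardCharsets.UTF_8);
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.print(sendBeforeAccept("This is a load of bunkum\r\n"));
    }
}
```

If you pause between the write and the accept and run netstat, you should see the same picture as above: an ESTABLISHED connection with bytes in the receive queue and no owning process.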
> echo this is rubbish | netcat 128.162.27.126 10110
> netstat -nap 2>/dev/null | grep 10110
tcp 0 0 128.162.27.126:10110 :::* LISTEN 5711/java
tcp 17 0 128.162.27.126:10110 128.162.27.125:49854 CLOSE_WAIT -
Still, the data is there to be read even though the client no longer exists.
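The netcat case can be sketched the same way. Again this is only an illustration under assumptions: loopback, an OS-assigned port, and hypothetical names (ClosedClientDemo, readAfterClientClosed, Result). The client writes and closes before the server accepts; the server then accepts, reads the queued data, and only afterwards sees end-of-stream.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

public class ClosedClientDemo {

    // Hypothetical holder for what the server observed.
    public static final class Result {
        public final String data;   // bytes read from the already-closed client
        public final int nextRead;  // return value of the read after the data

        Result(String data, int nextRead) {
            this.data = data;
            this.nextRead = nextRead;
        }
    }

    public static Result readAfterClientClosed(String message) throws Exception {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);

        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));

            // The client connects, writes and closes before accept() is called.
            // Closing sends a FIN, which moves the server side to CLOSE_WAIT.
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                ByteBuffer out = ByteBuffer.wrap(payload);
                while (out.hasRemaining()) {
                    client.write(out);
                }
            }

            // The connection is still waiting in the accept queue.
            try (SocketChannel accepted = server.accept()) {
                ByteBuffer in = ByteBuffer.allocate(payload.length);
                while (in.hasRemaining() && accepted.read(in) != -1) {
                    // drain the data the kernel buffered for us
                }
                // With the data consumed, the next read reports end-of-stream.
                int next = accepted.read(ByteBuffer.allocate(1));
                String data = new String(in.array(), 0, in.position(), StandardCharsets.UTF_8);
                return new Result(data, next);
            } // closing our end is what finally takes the socket out of CLOSE_WAIT
        }
    }

    public static void main(String[] args) throws Exception {
        Result r = readAfterClientClosed("this is rubbish\n");
        System.out.print(r.data);
        System.out.println("next read returned " + r.nextRead);
    }
}
```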