Saturday, December 23, 2023

Cloud native

A cloud native approach to writing code assumes that the instance in which the code lives can die at any time.

"Users sometimes explicitly send the SIGKILL signal to a process using kill -KILL or kill -9. However, this is generally a mistake. A well-designed application will have a handler for SIGTERM that causes the application to exit gracefully, cleaning up temporary files and releasing other resources beforehand. Killing a process with SIGKILL bypasses the SIGTERM handler." - The Linux Programming Interface (Michael Kerrisk)
Using docker stop sends SIGTERM.
Using docker kill sends SIGKILL.

The latter does not give the JVM a chance to clean up. In fact, because SIGKILL cannot be caught or ignored, no process in any language has the chance to clean up after it. (SIGTERM, by contrast, causes the whole JVM process to end and its shutdown hooks to execute, whichever thread - not just main - happens to be running.)
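
To see the difference for yourself, here is a minimal sketch of a shutdown hook, assuming Java 9+ (for ProcessHandle). The class name GracefulShutdown and the helper cleanupHook are made up for illustration; the cleanup is just deleting a temporary file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class GracefulShutdown {

    // Hypothetical helper: builds a hook thread that deletes the given temporary file.
    static Thread cleanupHook(Path tempFile) {
        return new Thread(() -> {
            try {
                Files.deleteIfExists(tempFile);
                System.out.println("Cleaned up " + tempFile);
            } catch (IOException e) {
                System.err.println("Cleanup failed: " + e.getMessage());
            }
        }, "cleanup-hook");
    }

    public static void main(String[] args) throws Exception {
        Path tempFile = Files.createTempFile("work-", ".tmp");
        // The JVM runs registered hooks on a normal exit and on SIGTERM, but never on SIGKILL.
        Runtime.getRuntime().addShutdownHook(cleanupHook(tempFile));
        System.out.println("PID " + ProcessHandle.current().pid() + " is holding " + tempFile);
        // Block until a signal arrives.
        Thread.sleep(Long.MAX_VALUE);
    }
}
```

Run it and send kill -TERM to the printed PID: the hook should fire and the file should disappear. Send kill -KILL instead and the temporary file should be left behind.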

A Tini problem...

If a JVM process that has created another process is killed with SIGKILL, the child process carries on living but its parent becomes (on Ubuntu 20.04.6 LTS) systemd, which in turn is owned by init (PID 1).

Running your JVM directly in a Docker container has some issues. These revolve around Linux treating PID 1 as special, and the ENTRYPOINT for any Docker container is PID 1.
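
A quick way to check whether your JVM has ended up as PID 1 is to ask it. This is only a sketch, assuming Java 9+ for the ProcessHandle API; the class name PidCheck and the helper isInit are hypothetical.

```java
class PidCheck {

    // Hypothetical helper: PID 1 is the init process of its PID namespace.
    static boolean isInit(long pid) {
        return pid == 1;
    }

    public static void main(String[] args) {
        ProcessHandle self = ProcessHandle.current();
        long pid = self.pid();
        System.out.println("JVM PID: " + pid);
        // The parent may be absent, for example when the JVM itself is PID 1 in a container.
        self.parent().ifPresent(p -> System.out.println("Parent PID: " + p.pid()));
        if (isInit(pid)) {
            System.out.println("This JVM is PID 1, so signal handling and zombie reaping fall to it.");
        }
    }
}
```

Run directly as a container's ENTRYPOINT, I would expect this to report PID 1; behind an init process it should report some other PID.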

In Linux, PID 1 should be init. On my Linux machine, I see:

$ ps -ef | head -2
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 Oct21 ?        00:18:23 /sbin/init splash

This process serves a special purpose. It handles SIGnals and zombie processes. Java is not built with that in mind so it's best to bootstrap it with a small process called tini. There's a good discussion of why this is important here on GitHub. Basically, Tini runs as PID 1, forwards the signals it receives to the JVM and reaps any zombies that are left behind. This gives the JVM the chance to clean up too.

It also passes the JVM's exit code on so we can know how it failed. Exit codes 0-127 are reserved [SO] and the number of the signal that killed the process (kill -l lists them) is added to 128. If you want to set the exit code in the shutdown hook, note you need to call Runtime.halt rather than Runtime.exit (to which System.exit delegates). The exit method will cause the JVM to hang in this situation [SO].
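
The arithmetic is simple enough to sketch. The class ExitCodes and its methods are made up for illustration, and the signal numbers used in main (15 for SIGTERM, 9 for SIGKILL) are the usual Linux values.

```java
class ExitCodes {

    static final int SIGNAL_BASE = 128;

    // Exit code reported for a process terminated by the given signal number.
    static int exitCodeForSignal(int signalNumber) {
        return SIGNAL_BASE + signalNumber;
    }

    // Hypothetical helper: turns an exit code back into a human-readable reason.
    static String describe(int exitCode) {
        if (exitCode > SIGNAL_BASE) {
            return "killed by signal " + (exitCode - SIGNAL_BASE);
        }
        return "exited with status " + exitCode;
    }

    public static void main(String[] args) {
        System.out.println(exitCodeForSignal(15)); // SIGTERM -> 143
        System.out.println(exitCodeForSignal(9));  // SIGKILL -> 137
        System.out.println(describe(143));         // killed by signal 15
        System.out.println(describe(1));           // exited with status 1
    }
}
```

So a container that reports 143 was most likely stopped with SIGTERM, while 137 suggests SIGKILL.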
