Sunday, July 19, 2020

Cancellation idioms

Java IO interrupt refresher

"The InterruptibleChannel interface is a marker that, when implemented by a channel, indicates that the channel is interruptible... Most, but not all, channels are interruptible.

"Channels introduce some new behaviors related to closing and interrupts. If a channel implements the InterruptibleChannel interface, then it's subject to the following semantics. If a thread is blocked on a channel, and that thread is interrupted (by another thread calling the blocked thread's interrupt() method), the channel will be closed, and the blocked thread will be sent a ClosedByInterruptException.  Additionally, if a thread's interrupt status is set, and that thread attempts to access a channel, the channel will immediately be closed, and the same exception will be thrown." [Java NIO, Hitchens]

Summarising, if a thread is interrupted before or during a blocking call on a channel, the channel is closed and an exception is thrown.
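
A minimal sketch of the "interrupted before the call" case, using only the JDK. The class and method names are mine, and a FileChannel on a temp file stands in for any interruptible channel: the interrupt flag is set first, then the read is attempted.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class InterruptedChannelDemo {

    /** Sets the interrupt flag, then reads from a channel; reports what happened. */
    static String readWithInterruptFlagSet() {
        try {
            Path tmp = Files.createTempFile("interrupt-demo", ".txt");
            try (FileChannel channel = FileChannel.open(tmp, StandardOpenOption.READ)) {
                Thread.currentThread().interrupt(); // flag is set before the channel is touched
                try {
                    channel.read(ByteBuffer.allocate(16));
                    return "read completed";
                } catch (ClosedByInterruptException e) {
                    // Thread.interrupted() reads the flag and clears it in one step.
                    boolean stillFlagged = Thread.interrupted();
                    return "closed=" + !channel.isOpen()
                            + ", interruptFlagStillSet=" + stillFlagged;
                }
            } finally {
                Thread.interrupted();        // make sure the demo leaves no flag behind
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readWithInterruptFlagSet());
    }
}
```

Note that, unlike InterruptedException, the channel exception leaves the thread's interrupt status set.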

"It may seem rather draconian to shut down a channel just because a thread sleeping on that channel was interrupted. But this is an explicit design decision made by the NIO architects.  Experience has shown that it's impossible to reliably handle interrupted I/O operations consistently across all operating systems."

"Interruptible channels are also asynchronously closable. A channel that implements InterruptibleChannel can be closed at any time, even if another thread is blocked waiting for an I/O to complete on that channel. When a channel is closed, any threads sleeping on that channel will be awakened and receive an AsynchronousCloseException. The channel will then be closed and will be no longer usable." [ibid]

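The asynchronous close described in the quote can be sketched like this. It is an illustration rather than anything from the book: the names are mine, a Pipe is used only because its source channel blocks when nothing is written, and the short sleep is a crude way to let the reader block first.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.Pipe;
import java.util.concurrent.atomic.AtomicReference;

final class AsyncCloseDemo {

    /** Returns the simple class name of the exception seen by the reader thread. */
    static String closeWhileReaderBlocked() {
        AtomicReference<String> seen = new AtomicReference<>("none");
        try {
            Pipe pipe = Pipe.open();
            Thread reader = new Thread(() -> {
                try {
                    pipe.source().read(ByteBuffer.allocate(16)); // blocks: nothing is ever written
                } catch (ClosedChannelException e) {
                    // AsynchronousCloseException is a subclass of ClosedChannelException.
                    // If the close wins the race and happens before the read starts,
                    // the plain superclass is thrown instead.
                    seen.set(e.getClass().getSimpleName());
                } catch (IOException e) {
                    seen.set("unexpected: " + e);
                }
            });
            reader.start();
            Thread.sleep(200);          // give the reader time to block (not a guarantee)
            pipe.source().close();      // close from another thread
            reader.join(5_000);
            pipe.sink().close();
            return seen.get();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(closeWhileReaderBlocked());
    }
}
```
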
The problems with Java

Daniel Spiewak @djspiewak Apr 24 19:47, 2020
There are a couple things with thread interruption that are horrible:

There's no way to build "uninterruptible" code. Meaning that you cannot have a critical section which acquires a resource atomically in several steps. Or in other words, there is no analogue to the acquire action in bracket.

The only way to detect self-cancelation for valid purposes (e.g. resource cleanup) is catching the InterruptedException, but doing this immediately flips the interrupted bit on the Thread back to false! 

The only solution to this is to do Thread.currentThread().interrupt() at the end of your exception handler, which almost no one knows to do. To make matters more annoying, even if you do this correctly, you mess up the stack trace on the interruption, because it's technically a new interrupt.
Oh, and exception handlers are not critical regions either, so if someone is just hammering the interrupt() button over and over externally, you could catch the exception, try to clean things up, and then get immediately interrupted again. This actually happens a lot because of the next point.

Catching Exception or Throwable will silently catch InterruptedException, even when that's almost guaranteed to not be what you want to do. This leads to silently ignoring interruption in most code paths, which is why people repeatedly hammer interrupt() in the first place.
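
The flag behaviour described above is easy to reproduce with the plain JDK. A small sketch (class and method names are mine): the thread interrupts itself to simulate an external interrupt, and each method reports the state of the interrupt flag afterwards.

```java
final class InterruptFlagDemo {

    /** Catching InterruptedException clears the flag: returns the flag as seen in the handler. */
    static boolean flagInsideHandler() {
        Thread.currentThread().interrupt();      // simulate an external interrupt
        try {
            Thread.sleep(1_000);                 // throws at once because the flag is set
            return true;                         // not reached in this demo
        } catch (InterruptedException e) {
            return Thread.currentThread().isInterrupted();
        }
    }

    /** Re-asserting the interrupt at the end of the handler makes it visible to callers again. */
    static boolean flagAfterRestoring() {
        Thread.currentThread().interrupt();
        try {
            Thread.sleep(1_000);
        } catch (InterruptedException e) {
            // ... resource cleanup would go here ...
            Thread.currentThread().interrupt();  // technically a new interrupt
        }
        return Thread.interrupted();             // read and clear, so the demo leaves no residue
    }

    /** A broad catch swallows the interruption without anyone noticing. */
    static boolean flagAfterBroadCatch() {
        Thread.currentThread().interrupt();
        try {
            Thread.sleep(1_000);
        } catch (Exception e) {
            // InterruptedException is an Exception, so it lands here and is ignored.
        }
        return Thread.interrupted();
    }

    public static void main(String[] args) {
        System.out.println("inside handler:    " + flagInsideHandler());
        System.out.println("after restoring:   " + flagAfterRestoring());
        System.out.println("after broad catch: " + flagAfterBroadCatch());
    }
}
```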

The problems go deeper than Java, down to the OS level. "The underlying stream may not know it's closed until you attempt to write to it (e.g. if the other end of a socket closes it)" [SO]. "There's no API for determining whether a stream has been closed." [SO]

The semantics of cancelling

This is a complicated area (see the interruption model proposed for Cats Effect 3 at https://github.com/typelevel/cats-effect/issues/681).
Basically interruptible/uninterruptible are not composable. The best way to think about it is that "interruptible means always accept the interrupt, no matter what", while "uninterruptible means always suppress the interrupt no matter what". But uninterruptible(fa >> interruptible(fb) >> fc) breaks one of the two guarantees.
So you have to choose: do you want resource leaks (by biasing in favor of the innermost in that context), or do you want possible deadlocks (by biasing in favor of the outermost)?
And you can't even phrase it as inner/outer, because you can do the same thing in reverse: uninterruptible(interruptible(fa >> uninterruptible(fb) >> fc))
..
as prior art here, Haskell tried all of these and ultimately decided that mask/poll was the sanest solution [Daniel Spiewak, Gitter]

An alternative architecture

"it's a mistake to think of interruptibility as being an attribute of threads or fibers. Instead it should be an attribute of the activities which run on the threads/fibers, and of necessity, that means that any interruptible activity must have its own first class interrupt channel. If we go down that route then an activity is interruptible if it 1) has an interrupt channel and 2) its interrupt channel is accessible. If it doesn't have an interrupt channel, or the channel is hidden somehow, then it's uninterruptible.

"Exposing an explicit interrupt channel on every blocking operation that we want to be interruptible is obviously a lot more laborious than just firing random interrupts at globally visible threads/fibers and hoping for the best, but I think it's the only way to go."
[Miles Sabin]
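
One way to picture this idea in plain Java is below. It is my own hypothetical sketch, not Miles Sabin's design: InterruptChannel and take are invented names, and the polling loop is a simplification chosen to keep the example short. The point is that the activity is cancelled through a handle it owns, and Thread.interrupt() is never called.

```java
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

final class InterruptChannelDemo {

    /** Hypothetical per-activity interrupt channel: only its holders can cancel the activity. */
    static final class InterruptChannel {
        private final AtomicBoolean cancelled = new AtomicBoolean(false);
        void cancel() { cancelled.set(true); }
        boolean isCancelled() { return cancelled.get(); }
    }

    /** A blocking take that gives up once its own channel has been cancelled. */
    static <A> Optional<A> take(BlockingQueue<A> queue, InterruptChannel channel)
            throws InterruptedException {
        while (!channel.isCancelled()) {
            A item = queue.poll(10, TimeUnit.MILLISECONDS); // short waits so cancellation is noticed
            if (item != null) return Optional.of(item);
        }
        return Optional.empty();
    }

    /** Starts a worker on an empty queue and cancels it through its channel. */
    static String cancelViaChannel() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        InterruptChannel channel = new InterruptChannel();
        AtomicReference<String> outcome = new AtomicReference<>("still running");
        Thread worker = new Thread(() -> {
            try {
                outcome.set(take(queue, channel).map(s -> "got " + s).orElse("cancelled"));
            } catch (InterruptedException e) {
                outcome.set("thread interrupted");
            }
        });
        worker.start();
        channel.cancel();                // no Thread.interrupt() involved
        try {
            worker.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return outcome.get();
    }

    public static void main(String[] args) {
        System.out.println(cancelViaChannel());
    }
}
```

An activity that never receives an InterruptChannel, or whose channel is not shared, is uninterruptible by construction.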


How does this affect Effectful Systems?

Integrating Scala code that uses effectful libraries with Java IO code can cause problems.

Gavin Bisesi @Daenyth Feb 05 21:16, 2020
InputStream is always a blocking api
(note you don't need much to make a blocker; Blocker.apply gives a Resource of one)

Daniel Spiewak @djspiewak Feb 05 21:46, 2020
FYI, all things involving files are blocking except on Windows, and even then they're blocking most of the time.
So the "NIO stuff" that is inside of getResourceAsStream is actually not NIO but rather regular IO wrapped up with a thread pool :-(
I generally use Blocker just to be safe on resource access. It doesn't really cost that much in terms of syntax
... non-blocking things must have a callback-driven API
either directly (via callbacks passed to functions) or indirectly (via Future or CompletableFuture)
If something doesn't have a callback API, then you know it's blocking
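
In plain Java terms, the same idea looks roughly like this. It is a sketch with invented names (readAllAsync, blockingPool) and assumes Java 9+ for InputStream.readAllBytes. The read itself stays blocking, but it runs on a dedicated pool and the caller only sees a callback-capable CompletableFuture, which is the rough JDK counterpart of shifting work onto a Blocker.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class BlockingPoolDemo {

    /** Runs the blocking read on the given pool and exposes the result as a future. */
    static CompletableFuture<String> readAllAsync(InputStream in, ExecutorService blockingPool) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                // InputStream has no callback API, so this call blocks a pool thread.
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, blockingPool);
    }

    /** Reads a small in-memory stream through the pool and transforms it in a callback. */
    static String demo() {
        ExecutorService blockingPool = Executors.newCachedThreadPool(); // dedicated to blocking work
        try {
            InputStream in = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
            return readAllAsync(in, blockingPool)
                    .thenApply(String::toUpperCase) // callback runs once the read has finished
                    .join();
        } finally {
            blockingPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```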


When can you cancel?

The effect of a cancellation is felt at every async boundary or every N flatMaps (where N=1 for ZIO, it seems).

I could find no code in either ZIO or Cats that calls Thread.interrupt(). However, ZIO still lets you wrap your code in a Future, and cancelling this leads to an InterruptedException (see this gist).
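
For comparison, here is how a cancelled future turns into an InterruptedException in the plain JDK. This is not the gist mentioned above and says nothing about ZIO internals. It only shows the standard behaviour of Future.cancel(true) on an ExecutorService, with names of my own choosing.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class FutureCancelDemo {

    /** Returns true if the running task observed an InterruptedException after cancel(true). */
    static boolean cancelInterruptsTask() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);
        AtomicBoolean sawInterrupt = new AtomicBoolean(false);
        try {
            Future<?> task = pool.submit(() -> {
                started.countDown();
                try {
                    Thread.sleep(60_000);          // stands in for a long blocking call
                } catch (InterruptedException e) {
                    sawInterrupt.set(true);
                } finally {
                    finished.countDown();
                }
            });
            started.await();                       // make sure the task is actually running
            task.cancel(true);                     // true = interrupt the thread running the task
            finished.await(5, TimeUnit.SECONDS);
            return sawInterrupt.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(cancelInterruptsTask());
    }
}
```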

Note that there are still undefined areas in Cats regarding cancellation:

Raas Ahsan @RaasAhsan Jul 07 20:26
calling cancel right after start results in non-deterministic behavior

Fabio Labella @SystemFw Jul 08 19:38
yeah I wanted to say
fa.guarantee(foo) doesn't guarantee that foo will always happen
it guarantees that if fa happens, then foo always happens
In particular if you have fa.guarantee(foo).start.flatMap(_.cancel), foo might happen or not, because the program can be cancelled before fa gets scheduled to run
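
The same "cancelled before it was ever scheduled" effect can be shown with a plain JDK executor. This is only an analogy for the Cats Effect behaviour, not its implementation: try/finally plays the role of guarantee, the names are mine, and the single worker thread is deliberately kept busy so that the cancel always wins.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class CancelBeforeStartDemo {

    /** Returns whether the finalizer ran for a task that was cancelled while still queued. */
    static boolean finalizerRanAfterEarlyCancel() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch release = new CountDownLatch(1);
        AtomicBoolean finalizerRan = new AtomicBoolean(false);
        try {
            // Occupy the only worker so that the next task has to wait in the queue.
            pool.submit(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            Future<?> fa = pool.submit(() -> {
                try {
                    // body of "fa" would go here
                } finally {
                    finalizerRan.set(true);        // plays the role of "foo"
                }
            });
            fa.cancel(false);                      // cancelled before it was ever scheduled
            release.countDown();                   // free the worker
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
            return finalizerRan.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return finalizerRan.get();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("finalizer ran: " + finalizerRanAfterEarlyCancel());
    }
}
```

Because the task body never starts, its finally block never runs either. The finalizer is tied to the task having started, not to the task having been submitted.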

See tip #2 in this Cats video (An Introduction to Interruption by Jakub Kozlowski, at 11'03"), where starting and joining in a for comprehension is described as an anti-pattern (what if one fails to complete?). Instead, one should use (ioa, iob).parTupled.

