Friday, January 10, 2025

The New Logging

Twenty years ago, everybody was writing their own logging framework. Today, when distributed computing is the norm, distributed logging is a necessity.

Functional programming really comes into its own in such an environment, so it's no surprise that FP tooling already supports distributed logging. In the Cats ecosystem, there is Natchez. It can feed into a distributed tracing system such as the open-source Jaeger, a Go application that can store its data in Elasticsearch or Cassandra. Zipkin is an alternative if you prefer a Java implementation.

[Regarding Natchez "unless you have a specific reason to want to use Natchez, you may want to look at otel4s instead. I think the community will be migrating to otel4s once it is binary-stable (Natchez is great and still works well but if you're starting fresh you may as well use the library that implements the industry standard for tracing)" - Discord]

MDC

First, some terminology.

"Mapped Diagnostic Context is to provide a way to enrich log messages with pieces of information that could be not available in the scope where the logging actually occurs, but that can be indeed useful to better track the execution of the program." [Baeldung]

Aside: don't use this mechanism in your business logic code.
Fabio Labella @SystemFw Jan 13 17:20
I don't know how many details I can give, but basically it got to the point that C-level executives knew the word "ThreadLocal", which is really bad. But basically someone had built an entire internal framework based essentially on MDC for a lot of business logic in a multitentant environment (falling down the slippery slope that it was "context"), and that at some point there were a crap load of race conditions due to ThreadLocal + asynchrony, risking crosstalk, which in a financial institution can well mean your whole company gets shut down
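
The failure mode described above can be reproduced in a few lines. This is only a minimal sketch using the standard library: the `tenant` slot stands in for an MDC entry, and the names `ThreadLocalLeak`, `tenant` and `demo` are made up for illustration. One request sets a value on a pooled thread and forgets to clear it, the next request scheduled on that thread inherits it, and the thread that submitted the work never sees it at all.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ThreadLocalLeak {
  // Stand-in for an MDC slot: there is one value per thread, not per request.
  private val tenant: ThreadLocal[String] =
    ThreadLocal.withInitial[String](() => "none")

  // Returns (tenant seen by request B, tenant seen by the calling thread).
  def demo(): (String, String) = {
    val pool = Executors.newSingleThreadExecutor()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      // Request A sets its tenant on the pool thread and never clears it.
      Await.result(Future(tenant.set("tenant-A")), 5.seconds)
      // Request B is scheduled on the same pool thread and sees A's tenant.
      val seenByB = Await.result(Future(tenant.get()), 5.seconds)
      // The calling thread never had the value set, so it sees the default.
      val seenByCaller = tenant.get()
      (seenByB, seenByCaller)
    } finally pool.shutdown()
  }
}
```

Calling `ThreadLocalLeak.demo()` yields `("tenant-A", "none")`: the context both leaks to an unrelated task and fails to follow the caller.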

Spans and Kernels

"The usual tracing approach involves threading spans (aka the context) throughout the application we wish to instrument. On the other hand, distributed tracing requires a so-called kernel to be able to continue the previous tracing span." - Functional Event Driven Architecture, Volpe.

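To make the distinction concrete, here is a rough sketch with hypothetical, simplified types. `Span`, `Kernel`, the header names and the explicit ID parameters are all assumptions for illustration and are not the API of Natchez or any other library. Inside one process a child span is derived directly from its parent; across processes only the kernel (a handful of headers) travels, and the receiver rebuilds the parent link from it.

```scala
// Hypothetical, simplified types; real tracing libraries differ in detail.
final case class Kernel(headers: Map[String, String])

final case class Span(traceId: String, spanId: String, parentId: Option[String], name: String) {
  // Serialise just enough to let another process continue this trace.
  def kernel: Kernel =
    Kernel(Map("X-Trace-Id" -> traceId, "X-Span-Id" -> spanId))

  // In-process child: the span is threaded explicitly to the code it instruments.
  def child(childName: String, newId: String): Span =
    Span(traceId, newId, Some(spanId), childName)
}

object Span {
  def root(name: String, traceId: String, spanId: String): Span =
    Span(traceId, spanId, None, name)

  // Cross-process continuation: rebuild the parent link from the kernel.
  // Returns None if the expected headers are missing.
  def continue(name: String, kernel: Kernel, newId: String): Option[Span] =
    for {
      traceId <- kernel.headers.get("X-Trace-Id")
      parent  <- kernel.headers.get("X-Span-Id")
    } yield Span(traceId, newId, Some(parent), name)
}

object KernelDemo {
  val upstream: Span = Span.root("place-order", traceId = "t1", spanId = "s1")
  // The kernel is what would be attached to the outgoing request or message.
  val downstream: Option[Span] =
    Span.continue("settle-order", upstream.kernel, newId = "s2")
}
```

Here `KernelDemo.downstream` shares the trace ID `t1` with the upstream span and records `s1` as its parent, even though no `Span` object crossed the boundary.
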
Cats

Within an effect system, ThreadLocal becomes the wrong tool because a single computation can hop between threads. In Monix, you can use TaskLocal to build MDC-like functionality.
Gerry Fletcher @gerryfletch Jan 13 17:17
It's purely to add the request id into every log line
Fabio Labella @SystemFw Jan 13 17:17
log4cats has a withContext that lets you do that without relying on state
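
The idea behind attaching context without state can be sketched with nothing but the standard library. This is not the log4cats API: `ContextLogger` and its `sink` parameter are hypothetical, and the real library works in an effect type `F`. The point is only that `withContext` returns a new logger value carrying extra key/value pairs, so nothing is stored in a thread-bound slot.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical minimal logger; log4cats' real types are effectful and richer.
final class ContextLogger(ctx: Map[String, String], sink: String => Unit) {
  // Returns a new logger; the original is untouched, so no context is shared.
  def withContext(extra: Map[String, String]): ContextLogger =
    new ContextLogger(ctx ++ extra, sink)

  // Prefixes the message with the context pairs, sorted by key.
  def info(msg: String): Unit = {
    val prefix = ctx.toList.sorted.map { case (k, v) => s"$k=$v" }.mkString(" ")
    sink(if (prefix.isEmpty) msg else s"$prefix $msg")
  }
}

object ContextLoggerDemo {
  // Collects log lines in a buffer purely so the demo can return them.
  def run(): List[String] = {
    val lines = ListBuffer.empty[String]
    val base = new ContextLogger(Map.empty, line => { lines += line; () })
    val perRequest = base.withContext(Map("requestId" -> "req-42"))
    perRequest.info("validating order")
    base.info("health check") // the base logger has not picked up the request id
    lines.toList
  }
}
```

`ContextLoggerDemo.run()` returns `List("requestId=req-42 validating order", "health check")`: only the logger handed to the request handler carries the request id.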

In "Practical FP in Scala", Gabriel Volpe says:
Normally, people use the Slf4j implementation, which is created as shown below. 
implicit val logger: Logger[IO] = Slf4jLogger.getLogger[IO]
In our application, we are going to use Log4cats, which is by now the standard logging library of the Cats ecosystem. Whenever we need logging capability, all we need is to add a Logger constraint to our effect type. For instance:

def program[F[_]: Apply: Logger]: F[Unit] =
  Logger[F].info("starting program") *> doSomething

The idea is that your code logs as normal but, transparently, these events are sent to some aggregator. What's more, the events are contextual: they might take place across several nodes in a cluster.

Regarding architecture:
Christopher Davenport @ChristopherDavenport Jan 13 17:49
So I can write middlewares on my logger and clients that enhance them with aspects about the user and then I can queue them into my in memory queue for when its something thats decoupled from response cycle. I keep all the logging across the entire stack, even after submission to a queue as a result.
Including things like request-id, user-information, and can continue to propogate that information to other services for tracing reasons.

Theres also a lifter that can lift a client into a Kleisli in http4s, so if you do that before the rest of your app you can still work in a fully abstract F, just pass in the client.

I extract and enhance my loggers at the point I know user identity and then pass those around with a ton of enhanced information to make debugging simple.

Gabriel Volpe's very nice test trading application uses docker-compose to spin up some microservices that demonstrate distributed logging. However, to get full use out of it, you need to set up a Honeycomb account to view the traces properly. (You also need to fudge the classpath if you want to Feed the cluster with some fake trades, as it needs access to modules/domain/jvm/target/test-classes.)