Friday, June 24, 2022

Free Monads

Recall from a previous post that the Free monad can represent a tree where each node has either a branch or a label (implying that only leaves have labels).

When we "convert method calls into references to an ADT [we get] the Free monad. When an ADT mirrors the arguments of related functions, it is called a Church encoding. Free is named because it can be generated for free for any S[_]. For example, we could set S to be the [algebras] and generate a data structure representation of our program... In FP, an algebra takes the place of an interface in Java... This is the layer where we define all side-effecting interactions of our system." [1]

My friend, Aaron Pritzlaff, has written a small project that (apart from demonstrating interoperability) uses neither Cats nor ZIO magic.

Here, we build case classes that all extend Operation[A] (where the A represents the type of the operation's result). Each one can conveniently be lifted into a Free[F, A] (where F is now our Operation) with .liftM.

As it happens, this creates a kind of Free called a Suspend, which is essentially a placeholder for the A: when we flatMap on it, the function we pass has type A => Free[F, A]. We can keep doing this until we have built up a tree of Free[Operation, A]s, as promised at the beginning of this post. A shallow path represents a for comprehension that terminates early.

The important thing to remember is that we have built a data structure that is yet to be executed.

We would execute the tree with a call to ".foldMap [which] has a marketing buzzword name: MapReduce." [1] In Aaron's implementation, we just recursively call .foldMap on the Free tree we constructed. Either we terminate with a pure value, or we reach a Suspend node, which is mapped by a natural transformation (using the ~> function) that maps not A to B, but F[_] to G[_]. This is our interpreter and, for us, G must be a monad so we can call flatMap as we pop the stack.

As we work our way back through the stack of monads, calling flatMap as we go, we're invoking those functions, A => Free[F, A], that we spoke about earlier. But note that they're acting on the Gs that our interpreter instantiated.
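To make the build-then-interpret idea concrete, here is a minimal sketch in Python rather than Scala. It is not Aaron's code: the names Pure, Suspend, FlatMapped, Id, Put, Get and fold_map are all hypothetical, the FlatMapped node is an assumption about how the continuations might be stored, and Id stands in for whatever monad G the interpreter targets.

```python
from dataclasses import dataclass
from typing import Any, Callable


class Free:
    """Base class: a program described as data, not yet executed."""

    def flat_map(self, f: Callable[[Any], "Free"]) -> "Free":
        # Nothing runs here; we only record the continuation in the tree.
        return FlatMapped(self, f)


@dataclass
class Pure(Free):
    value: Any  # a leaf holding a finished value


@dataclass
class Suspend(Free):
    operation: Any  # a placeholder for the result of one operation


@dataclass
class FlatMapped(Free):
    source: Free
    f: Callable[[Any], Free]  # the continuation, A => Free


@dataclass
class Id:
    """Hypothetical target monad G: the simplest thing with a flat_map."""
    value: Any

    def flat_map(self, f):
        return f(self.value)


# Hypothetical algebra (the F): two key-value store operations.
@dataclass
class Put:
    key: str
    value: int


@dataclass
class Get:
    key: str


def fold_map(program: Free, interpreter: Callable[[Any], Id]) -> Id:
    """Recursively interpret the tree; interpreter plays the role of F ~> G."""
    if isinstance(program, Pure):
        return Id(program.value)
    if isinstance(program, Suspend):
        return interpreter(program.operation)
    # FlatMapped: interpret the source, then run the continuation on its result.
    return fold_map(program.source, interpreter).flat_map(
        lambda a: fold_map(program.f(a), interpreter)
    )


def make_interpreter(store: dict) -> Callable[[Any], Id]:
    """Build an interpreter whose side effects land in the given dict."""
    def interpret(op):
        if isinstance(op, Put):
            store[op.key] = op.value
            return Id(None)
        if isinstance(op, Get):
            return Id(store.get(op.key))
        raise TypeError(f"unknown operation: {op!r}")
    return interpret


# Build the program: only data is constructed at this point.
program = (
    Suspend(Put("x", 20))
    .flat_map(lambda _: Suspend(Get("x")))
    .flat_map(lambda x: Pure(x + 1))
)

store = {}
result = fold_map(program, make_interpreter(store))
```

Swapping make_interpreter for one that only records the operations it sees would give a test double without touching program, which is the mocking benefit usually claimed for this style.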

Although it's good to understand, the Free monad might not be your best choice according to the Cats people:

Rob Norris @tpolecat Sep 24 17:49 2020

Free is out of style but it's still great for some applications. Tagless style is much more common now for writing DSLs and APIs in general. Free is good if you really need to limit what the user can do (the algebra is exactly your data type's (i.e., F's) constructors plus Monad. No more, no less). Sometimes you want this.

The end result is easier to use than tagless because you're just dealing with data. You don't have to thread an effect parameter (typically) everywhere. So your business logic can be very concrete, which is easier for beginners I think.

Fabio Labella @SystemFw Sep 24 19:35 2020

the idea behind Free is great for implementing advanced monads like IO or Stream (which are implemented with a similar strategy even though they don't use Free literally)

Daniel Spiewak @djspiewak Sep 25 15:49 2020

I use Free for prototyping sometimes. It can be a useful tool to see how your effect algebra teases apart at a granular level without actually committing to an implementation, but it really only works for algebraic effects, and you very quickly bump up against some annoying type metaprogramming problems if you want to push it.

I think we've pretty much all learned that there are better ways of encoding effect composition than Free, and simultaneously the mocking it enables, while powerful, isn't that useful in practice... It's still a cool toy though.

[1] Functional Programming for Mortals with Cats


Monday, June 6, 2022

Packaging Python

Java programmers don't know the meaning of classpath hell until they've played with Python. Here are some notes I took while ploughing through the excellent Practical MLOps (Gift & Deza). Following their instructions, I was attempting to get an ML model served using Flask in a Docker container. Spoiler: it didn't work out of the box.

Since the correct OnnxRuntime wheel for my Python runtime did not exist, I had to build onnxruntime with --build-wheel while making the artifact.

This is where I encountered my first dependency horror:

CMake 3.18 or higher is required.  You are running version 3.10.2

when running onnxruntime/build.sh. (You can put a newer version first in your PATH and so avoid having to install it at the OS level.)

This finally yielded onnxruntime-1.12.0-cp36-cp36m-linux_x86_64.whl, which could be installed into my environment with pip install WHEEL_FILE... except that the cp number must correspond to your Python version (3.6 in this case).
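A quick way to see whether a wheel's cp number lines up with the interpreter is to compare the python tag in the filename with sys.version_info. The helper names below are hypothetical, and this is only a rough sketch: it ignores the ABI and platform tags, which pip also checks.

```python
import sys


def cpython_tag(version_info=sys.version_info) -> str:
    """Return the CPython tag (e.g. 'cp36') for an interpreter version."""
    return f"cp{version_info[0]}{version_info[1]}"


def wheel_matches_interpreter(wheel_filename: str, version_info=sys.version_info) -> bool:
    """Compare the python tag in a wheel filename with the interpreter version."""
    # Wheel filenames end with -{python tag}-{abi tag}-{platform tag}.whl
    stem = wheel_filename[: -len(".whl")]
    python_tag = stem.split("-")[-3]
    # A python tag can be a dot-separated set, e.g. 'py2.py3'.
    return cpython_tag(version_info) in python_tag.split(".")


wheel = "onnxruntime-1.12.0-cp36-cp36m-linux_x86_64.whl"
matches_py36 = wheel_matches_interpreter(wheel, (3, 6))  # True
matches_py38 = wheel_matches_interpreter(wheel, (3, 8))  # False
```

Calling wheel_matches_interpreter(wheel) with no version argument checks against whichever Python is running the script.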

Moving virtual environments between machines is hard. You'd be best advised to use pip freeze to capture the environment. But ignoring this advice yields an interesting insight into the Python dependency system:

The first problem is that if you've created the environment with python -m venv then the scripts have your directory structure baked into them, as a simple grep will demonstrate. Copying the entire directory structure up to the virtual environment solved that.
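The grep can be approximated in Python. This sketch assumes a POSIX-style layout where the scripts live under bin/ (on Windows they would be under Scripts/); the function name and the example paths are hypothetical.

```python
from pathlib import Path


def scripts_with_baked_in_path(venv_dir: str, original_path: str) -> list:
    """List files in the venv's bin/ directory that mention original_path."""
    hits = []
    for script in sorted(Path(venv_dir, "bin").iterdir()):
        # Skip directories and broken symlinks (e.g. a python link to a
        # missing interpreter).
        if not script.is_file():
            continue
        try:
            text = script.read_text(errors="ignore")
        except OSError:
            continue
        if original_path in text:
            hits.append(script.name)
    return hits
```

For example, scripts_with_baked_in_path("/opt/copied/venv", "/home/me/venv") would be expected to name activate and the pip entry points, since those files record the absolute path the environment was created under.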

But running the code gave me "No module named ..." errors. Looking at the sys.path didn't show my site-packages [SO] despite me having run activate. Odd. OK, so I defined PYTHONPATH and then I could see my site-packages in sys.path.
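When activate appears to have no effect, it can help to ask the interpreter itself what it thinks its environment is. A small diagnostic sketch (the function names are hypothetical; sys.base_prefix and sysconfig.get_paths are standard library):

```python
import sys
import sysconfig


def in_virtualenv() -> bool:
    """True when running inside an environment created by python -m venv."""
    return sys.prefix != sys.base_prefix


def site_packages_on_path() -> bool:
    """True when this interpreter's own site-packages is in sys.path."""
    return sysconfig.get_paths()["purelib"] in sys.path


def report() -> dict:
    """Collect the values worth comparing between the two machines."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        "in_virtualenv": in_virtualenv(),
        "site_packages": sysconfig.get_paths()["purelib"],
        "site_packages_on_path": site_packages_on_path(),
    }


for key, value in report().items():
    print(f"{key}: {value}")
```

If executable points at a system Python rather than the one inside the copied environment, the wrong interpreter is being picked up, and no amount of activating will put the environment's site-packages on sys.path.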

Then, you want to use exactly the same Python version. No apt-get Python for us! We have to manually install it [SO]. When doing this on a Docker container, I had to:

RUN apt-get update && apt-get install -y wget gcc make zlib1g-dev

Note that this [SO] helped me to create a Docker container that just pauses the moment it starts. This allows you to log in and inspect it without it instantly dying on a misconfiguration.

The next problem: there are many compiled binaries in your virtual environment.

# find $PYTHONPATH/ -name \*.so | wc -l
185
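The same count can be taken from Python and broken down by top-level package, which shows where the native code is coming from. A sketch with a hypothetical function name; unlike the find command, it counts regular files only.

```python
from collections import Counter
from pathlib import Path


def shared_objects_by_package(root: str) -> Counter:
    """Count *.so files under root, keyed by the first path component."""
    root_path = Path(root)
    counts = Counter()
    for path in root_path.rglob("*.so"):
        if path.is_file():
            counts[path.relative_to(root_path).parts[0]] += 1
    return counts
```

Summing the values, sum(shared_objects_by_package(site_packages).values()), gives the overall total, where site_packages is whatever directory PYTHONPATH points at.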

Copying these between architectures is theoretically possible, but "as complexity of the code increases [so does] the likelihood of being linked against a library that is not installed" [SO].

Indeed, when I ran my Python code, I got a Segmentation Fault which can happen if "there's something wrong with your Python installation." [SO]

Python builds

A quick addendum on how Python builds projects: the standard way is no longer standard. "[A]s of the last few years all direct invocations of setup.py are effectively deprecated in favor of invocations via purpose-built and/or standards-based CLI tools like pip, build and tox" [Paul Ganssle's blog]