Friday, March 13, 2020

Why FP?


In my last position, I was surprised by the number of programmers who were writing Java in the Scala language. That is, different language, same style and idioms. This to me misses the point of Scala. It is not just a better Java. In the realm of distributed computing, it's essential to fully embrace its functional programming aspects.

I gave a talk that was well-received by my colleagues and have since moved on. Unfortunately, the new shop is like the old: the developers are writing Java in Scala. So, these are the notes for my presentation that I may soon have to dust off and present again.

Intro

  • there are FP design patterns just like GoF in OO
  • Haskell and Scala programmers can talk in FP patterns just as fluently as Java and C++ programmers can talk in OO patterns
  • the groundwork for FP dates back to the mathematics of the 1920s-1950s.
  • I’m assuming you all already know basic FP points (eg, immutability, don’t throw exceptions etc)
  • Disclaimer: I’m not the best FP programmer in the world and although I love Cats, am far from an expert.

Motivation

  • Make programming more mathematical using math proofs. This way, logic errors can be caught by the compiler rather than at runtime.
  • A common maths language allows us to more easily reason about the code (see Semigroups below).
  • Since everybody is singing from the same hymn sheet, code has a lot less boilerplate and is easier to reason about.

For-comprehensions

  • Advantage: calling and called code are decoupled regarding the ‘container’ type.
  • Advantage: which monad doesn’t matter (eg, Try, Either, Option, List etc).
  • Terminates early
  • The obvious structure for sequential operation.
  • Sequence: that is, M[T[A]] => T[M[A]]. It can take a list of results (eg, List[Option[A]]) and produce a single result holding a list (Option[List[A]]).
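
To make the points above concrete, here is a minimal sketch using only the standard library. The names (parseInt, sumOpt, sumEither, sequence) are invented for illustration and are not from the talk. The two for-comprehensions have exactly the same shape even though the 'container' differs, and both stop at the first failure. The hand-rolled sequence is only there to show the idea; Cats provides a general sequence through its Traverse type class.

```scala
import scala.util.Try

object ForComprehensionSketch {
  // Hypothetical helper: parse a string, with failure represented as None.
  def parseInt(s: String): Option[Int] = Try(s.trim.toInt).toOption

  // The same parse, but with failure represented as a Left carrying a message.
  def parseIntE(s: String): Either[String, Int] =
    parseInt(s).toRight(s"not a number: '$s'")

  // Option version: if the first step is None, the second is never run.
  def sumOpt(a: String, b: String): Option[Int] =
    for {
      x <- parseInt(a)
      y <- parseInt(b)
    } yield x + y

  // Either version: identical structure, different monad.
  def sumEither(a: String, b: String): Either[String, Int] =
    for {
      x <- parseIntE(a)
      y <- parseIntE(b)
    } yield x + y

  // Hand-rolled sequence: List[Option[A]] => Option[List[A]].
  // The result is None if any element of the list is None.
  def sequence[A](xs: List[Option[A]]): Option[List[A]] =
    xs.foldRight(Option(List.empty[A])) { (oa, acc) =>
      for { a <- oa; as <- acc } yield a :: as
    }
}
```

For example, sumOpt("1", "2") gives Some(3), sumOpt("x", "2") gives None, and sequence(List(Some(1), None)) gives None.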

Which brings me to:

Semigroups

  • Useful in aggregation.
  • They’re associative operations on a type.
  • Question: is f associative when it adds integers? Eg, is f(x1, f(x2, f(x3, x4))) the same as f(x1, f(f(x2, x3), x4))?
  • Is it associative for subtracting integers?
  • It allows partitioning
  • If it’s also commutative, you can multi-thread the operations!
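
The associativity question can be checked directly. What follows is a rough sketch with a home-made Semigroup trait standing in for the real thing (Cats has its own cats.Semigroup). The names intAddition, intSubtraction and isAssociativeFor are made up for this example. Note that checking one triple of values can disprove the law but never proves it.

```scala
object SemigroupSketch {
  // Minimal stand-in for a semigroup type class: one binary operation on A.
  trait Semigroup[A] {
    def combine(x: A, y: A): A
  }

  // Integer addition obeys the associativity law.
  val intAddition: Semigroup[Int] = new Semigroup[Int] {
    def combine(x: Int, y: Int): Int = x + y
  }

  // Subtraction fits the signature but breaks the law,
  // so it is not a lawful semigroup.
  val intSubtraction: Semigroup[Int] = new Semigroup[Int] {
    def combine(x: Int, y: Int): Int = x - y
  }

  // Checks a op (b op c) == (a op b) op c for a single triple of values.
  def isAssociativeFor[A](s: Semigroup[A])(a: A, b: A, c: A): Boolean =
    s.combine(a, s.combine(b, c)) == s.combine(s.combine(a, b), c)
}
```

With the values 1, 2, 3, addition passes (both groupings give 6) while subtraction fails: 1 - (2 - 3) is 2 but (1 - 2) - 3 is -4.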

Applications to big data

  • Associativity allows partitioning
  • Commutativity allows maximum parallelism
  • Addition of real numbers is associative and commutative
  • Concatenating strings is associative but it’s not commutative
  • Subtracting real numbers is neither associative nor commutative
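
The bullets above can be imitated on a single machine. This is only a toy sketch, not how any particular big data framework does it: aggregate and aggregateOutOfOrder are hypothetical helpers that split a list into partitions, reduce each one, and then combine the partial results, either in partition order or reversed (as if the partitions had finished out of order).

```scala
object PartitionSketch {
  // Reduce each partition, then combine the partial results in partition order.
  // Assumes data is non-empty and partitionSize > 0.
  def aggregate[A](data: List[A], partitionSize: Int)(combine: (A, A) => A): A =
    data.grouped(partitionSize).map(_.reduce(combine)).reduce(combine)

  // As above, but the partial results are combined in reverse order,
  // imitating partitions whose results arrive out of order.
  def aggregateOutOfOrder[A](data: List[A], partitionSize: Int)(combine: (A, A) => A): A =
    data.grouped(partitionSize).map(_.reduce(combine)).toList.reverse.reduce(combine)
}
```

Taking List("a", "b", "c", "d", "e") with partitions of size 2, concatenation gives "abcde" when partial results are combined in order (associativity is enough) but "ecdab" when they arrive out of order (it is not commutative). Adding List(1, 2, 3, 4, 5) gives 15 either way. Subtracting gives -5 when partitioned against -13 for a plain left-to-right reduce, so it cannot be partitioned at all.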

Disadvantages

  • If you use Cats, there are some odd imports and dependencies you need to get used to (not too hard)
  • You need to learn a new ‘pattern language’ but since it’s been about since the mid 20th Century, it’s not going to change overnight.

Conclusion

The following was not in my presentation but is just a great description of why we should prefer an FP approach:

"Consider a service which must read some data from shared state in order to proceed. The shared state is stored on other servers (e.g. Redis, Postgres, some other service, etc). State must be read from multiple of these sources in order to formulate a response. These services have varying latencies and availabilities, and you need to be able to fail over each one individually to alternate primaries.

"Each read requires an asynchronous connect/read/close resource management process, its own retry and independent exponential backoff cycle, its own failover, etc. All of them must be run in parallel such that if any of them fails, they're all aborted as quickly as possible. Meanwhile, the client connection itself might get aborted, which should clean up all associated resources. In the event of any errors, or a success, the response must be returned to the client. All asynchronously.

"Doing the above without functional effects is astonishingly complex even with absolute best-in-class tools. Your best bet outside of IO is to use something like Future with very careful and prescribed use of cooperative explicit cancelation checks and extremely well thought-out refactoring into defs. It would be possible but brittle.

"Now imagine taking the above and adding an extra data source which needs to be aggregated to form a response. With FP this is trivial: you just write it independently and then add it to the parTraverse. With any other approach, even Futures, this is not really all that straightforward.

"To me, the above unequivocally demonstrates the value of functional programming in modern scalable service architectures" - Daniel Spiewak

