Thursday, September 22, 2022

Rage against the Markup

Markup is pervasive in DevOps. But markup is also:

  • hard to refactor
  • limited in its control flow
  • not type-safe
  • hard to test

Refactoring

I see in our codebase something like this:

      - name: Set env to Develop
        if: endsWith(github.ref, '/develop')
        run: |
          echo "ENVIRONMENT=develop" >> $GITHUB_ENV
      - name: Set env to Staging
        if: endsWith(github.ref, '/main')
        run: |
          echo "ENVIRONMENT=staging" >> $GITHUB_ENV
      - name: Set env to Productions
        if: endsWith(github.ref, '/production')
        run: |
          echo "ENVIRONMENT=production" >> $GITHUB_ENV

Not only is this ugly, it's copy-and-pasted everywhere. I can't refactor it and what's more, there is a...
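For contrast, here is a minimal sketch of the same branch-to-environment mapping in a general-purpose language (Python here). The names ENVIRONMENTS and environment_for are made up for illustration and are not from our codebase; the point is only that the rule lives in one place and can be unit tested.

```python
from typing import Optional

# Branch suffix -> environment, copied from the three workflow steps above.
ENVIRONMENTS = {
    "/develop": "develop",
    "/main": "staging",
    "/production": "production",
}


def environment_for(ref: str) -> Optional[str]:
    """Return the environment for a Git ref, or None if no rule matches."""
    for suffix, environment in ENVIRONMENTS.items():
        if ref.endswith(suffix):
            return environment
    return None
```

Adding a fourth environment is then a one-line change to the dictionary rather than another pasted step.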

(Lack of) Control

Imagine I want to create an AWS Postgres instance with Terraform and then provision the DB with a schema using the Python library Alembic, all via GitHub Actions. GHA can call the Terraform file and create a DB, but how do I get the URL of that Postgres instance so I can give it to Alembic to connect and create the tables? Weaving different technologies together in a Turing-complete language is easy; with markup, less so. I had to hack together some calls to the AWS CLI and parse the JSON it returned, all in bash.
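To show what that glue might look like in a real language, here is a rough Python sketch. It is not what we actually ran. It assumes the Terraform configuration declares an output called db_endpoint (a hypothetical name) holding host:port, and that Alembic's env.py reads the connection string from a DATABASE_URL environment variable (also an assumption; Alembic does not do this unless you write env.py that way).

```python
import json
import os
import subprocess
from urllib.parse import quote_plus


def database_url(terraform_output_json: str, user: str, password: str, database: str) -> str:
    """Build a Postgres URL from the text printed by `terraform output -json`."""
    outputs = json.loads(terraform_output_json)
    # "db_endpoint" is a hypothetical output name, assumed to hold "host:port".
    endpoint = outputs["db_endpoint"]["value"]
    # Percent-encode the credentials so special characters survive in the URL.
    return f"postgresql://{quote_plus(user)}:{quote_plus(password)}@{endpoint}/{database}"


def provision(user: str, password: str, database: str) -> None:
    """Create the infrastructure, then hand the resulting URL to Alembic."""
    subprocess.run(["terraform", "apply", "-auto-approve"], check=True)
    result = subprocess.run(
        ["terraform", "output", "-json"], check=True, capture_output=True, text=True
    )
    url = database_url(result.stdout, user, password, database)
    # Assumes env.py picks up DATABASE_URL; adapt to however your env.py is configured.
    subprocess.run(
        ["alembic", "upgrade", "head"], check=True, env={**os.environ, "DATABASE_URL": url}
    )
```

The value flows from one tool to the next as an ordinary variable, and the URL-building part can be tested without touching AWS at all.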

Type-safety issue #1

An example of a lack of type safety can be found in any Terraform script. We had something like this:

resource "aws_ecs_task_definition" "compute_task" {
  family                   = var.task_name
  container_definitions    = <<DEFINITION
  [
    {
      "name": "${var.task_name}",
      "image": "${aws_ecr_repository.docker_container_registry.repository_url}:${var.tag}",
...

Notice the CloudControl JSON being injected into Terraform. Now, a reference to the aws_iam_role added here (as is suggested on numerous websites - see a previous post here) is silently ignored. This wouldn't happen using, say, an SDK in a type-safe language, as you can obviously only access the methods it offers you.
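As a toy illustration of the difference, here is a sketch in Python. ContainerDefinition is a made-up, heavily simplified stand-in, not a class from any AWS SDK, and role_arn is a hypothetical field name. Python only complains at run time, whereas a statically typed language would refuse to compile, but either way the unknown field is rejected loudly instead of being dropped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerDefinition:
    """Hypothetical, simplified stand-in for one container definition."""
    name: str
    image: str


def try_build(**fields):
    """Return a ContainerDefinition, or a message explaining why it was rejected."""
    try:
        return ContainerDefinition(**fields)
    except TypeError as err:
        # Unknown fields raise TypeError instead of being silently ignored.
        return f"rejected: {err}"
```

Calling try_build(name="compute", image="repo:tag", role_arn="...") comes back as a rejection message rather than a half-configured task.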

Type-safety issue #2

Secret names in GitHub Actions can only be uppercase alphanumeric and underscore, apparently. AWS identifiers can include alphanumerics and a dash. Mess these up and you spend time fixing tedious bugs. Tiny types would help with this.
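A minimal sketch of what such tiny types could look like, in Python. The two patterns simply encode the rules as I described them above, so treat them as assumptions rather than the official specifications; SecretName and AwsIdentifier are names invented for this example.

```python
import re
from dataclasses import dataclass

# Assumed rules, taken from the description above, not from official docs.
_SECRET_NAME = re.compile(r"[A-Z0-9_]+")
_AWS_IDENTIFIER = re.compile(r"[A-Za-z0-9-]+")


@dataclass(frozen=True)
class SecretName:
    """A GitHub Actions secret name that has passed validation."""
    value: str

    def __post_init__(self) -> None:
        if not _SECRET_NAME.fullmatch(self.value):
            raise ValueError(f"invalid secret name: {self.value!r}")


@dataclass(frozen=True)
class AwsIdentifier:
    """An AWS identifier that has passed validation."""
    value: str

    def __post_init__(self) -> None:
        if not _AWS_IDENTIFIER.fullmatch(self.value):
            raise ValueError(f"invalid AWS identifier: {self.value!r}")
```

A function that accepts a SecretName can then never be handed a raw string with a dash in it; the mistake surfaces where the value is created, not three tools downstream.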

Type-safety issue #3

Another example: for creating a DB, we used the password M3dit8at!0n$ - seems OK, right? The DB built fine and the GitHub Actions script then created the schema fine, but we could not log in to the DB. Cue hours of frantic checking that there were no network or permission issues. The problem? That bloody password includes characters that need to be escaped on the Linux CLI, and that's how Terraform and Alembic were invoked! They were at least consistent - that is, the infrastructure was Terraformed and Alembic built the schema with the same mangled password - but for the rest of us, the password didn't work.

In Java, a String is just a String and its content isn't going to break the program's flow. Not so in markup land.
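When a shell really can't be avoided, one mitigation is to quote every argument before it gets there. The sketch below uses Python's standard shlex.join to do that and sends the password through sh to check that it comes back unchanged. It assumes a POSIX sh and printf are available, and echo_back is just a helper made up for the demonstration.

```python
import shlex
import subprocess


def echo_back(value: str) -> str:
    """Pass a value through a real shell and return what the shell saw."""
    # shlex.join quotes each argument, so characters like ! and $ stay literal.
    command = shlex.join(["printf", "%s", value])
    completed = subprocess.run(
        ["sh", "-c", command], capture_output=True, text=True, check=True
    )
    return completed.stdout
```

Better still is to skip the shell and pass an argument list straight to subprocess.run, at which point a string is just a string again.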

Testing times

Which leads to testing. I made my changes in my GitHub Actions file to use the password in the secret ${{ secrets.DB_PASSWORD-staging }} and ran my mock GHA like so:

act -j populate -s DB_PASSWORD-staging=...

and the whole thing worked wonderfully. Only when I tried to create the secret in-situ was I told that DB_PASSWORD-staging was an invalid name.
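A small check run before pushing might have caught this earlier than GitHub did. Here is a rough Python sketch that scans workflow text for ${{ secrets.NAME }} references and reports any name that breaks the naming rule. The rule (uppercase alphanumerics and underscores) is the one described earlier in this post, so it is an assumption, and invalid_secret_names is a made-up helper, not part of act or GitHub's tooling.

```python
import re
from typing import List

# Matches ${{ secrets.NAME }} and captures NAME, whatever characters it contains.
SECRET_REFERENCE = re.compile(r"\$\{\{\s*secrets\.([^\s}]+)\s*\}\}")
# Assumed naming rule, as described in this post.
VALID_NAME = re.compile(r"[A-Z0-9_]+")


def invalid_secret_names(workflow_text: str) -> List[str]:
    """Return every referenced secret name that breaks the assumed rule."""
    return [
        name
        for name in SECRET_REFERENCE.findall(workflow_text)
        if not VALID_NAME.fullmatch(name)
    ]
```

Run over the workflow file in a pre-commit hook or a unit test, a non-empty result would fail the build long before anyone tries to create the secret for real.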

And how do you test this abomination?

Spot the errors
This was the result of some hasty copy-and-paste.

Solutions...?

What you can do in Scala 3 with metaprogramming is truly amazing. For example, proto-quill creates ASTs that are passed around and checked against the DB at compile time! I think this might be a bit overkill; a more practical approach is IP4S, which checks your URLs, ports etc. at compile time. I have a couple of much-neglected projects that at least give the gist of a solution (here's one) and that I'll expand on soon.
