Tuesday, March 11, 2025

Cloud Maintenance

Although it's often referred to as "infrastructure as code", there is very little code in what most people call DevOps. It's mostly markup. This can cause maintenance issues. There are, however, ways of dealing with this situation.

CDK8s is the Cloud Development Kit for Kubernetes. It has TypeScript, Java, Python and Go implementations.
"I wouldn’t head down the Helm path for that before I took a long look at CDK8s.  Helm templates are a nightmare for debugging in my experience. Instead having a real programming language backing up the templates is so much better...  It renders the YAML, it does not apply it. I use it with ArgoCD as my deployment mechanism. So I take the resulting YAML and check it into git for ArgoCD to apply to the cluster.  Execution of the CDK8s code and check into git is automatic as part of the CI jobs." [Reddit]
In short, CDK8s is a templating framework that lets you build K8s config files in a general-purpose language rather than raw YAML.
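As a sketch of what this looks like in Python (assuming `cdk8s` and `constructs` are installed and that `cdk8s import` has been run to generate the `k8s` bindings; the chart name and image are illustrative):

```python
# A minimal cdk8s chart in Python. Assumes `pip install cdk8s constructs`
# and that `cdk8s import` has generated the `k8s` module under imports/.
from constructs import Construct
from cdk8s import App, Chart

from imports import k8s  # generated by `cdk8s import`


class WebService(Chart):
    def __init__(self, scope: Construct, id: str):
        super().__init__(scope, id)
        label = {"app": "web"}
        k8s.KubeDeployment(
            self, "deployment",
            spec=k8s.DeploymentSpec(
                replicas=2,
                selector=k8s.LabelSelector(match_labels=label),
                template=k8s.PodTemplateSpec(
                    metadata=k8s.ObjectMeta(labels=label),
                    spec=k8s.PodSpec(containers=[
                        k8s.Container(name="web", image="nginx:1.27")
                    ]),
                ),
            ),
        )


app = App()
WebService(app, "web-service")
app.synth()  # renders YAML into dist/; nothing is applied to the cluster
```

As the Reddit comment describes, the rendered YAML in dist/ can then be checked into git for ArgoCD to apply.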

"KCL is an open-source, constraint-based record and functional programming language. It leverages mature programming language technology and practices to facilitate the writing of many complex configurations. KCL is designed to improve modularity, scalability, and stability around configuration, simplify logic writing, speed up automation, and create a thriving extension ecosystem." [DevOps.dev]. Its config is made up of schemas, lambdas and rules [home] that constrain not just the structure of the document but also the values.
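A sketch of what such value-level constraints look like (the schema and field names here are illustrative, not from any real project):

```kcl
# A KCL schema constrains structure *and* values via its check block.
schema App:
    name: str
    replicas: int = 1
    env: str

    check:
        replicas > 0, "replicas must be positive"
        env in ["dev", "staging", "prod"], "unknown environment"

app = App {
    name = "web"
    replicas = 2
    env = "prod"
}
```

Compiling this with an `env` of, say, "production" would fail with the given message, which is exactly the kind of mistake plain YAML lets through.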

KCL is a new language, while CDK8s leverages popular languages that already exist.

Saturday, March 8, 2025

Automated documentation

I've never seen anybody go back and fix documentation that has gone out of date - but then I've only been software engineering for 30 years. Since nobody ever fixes it, a better strategy is to automate it.

To this end, there is OpenAPI (formerly Swagger) for documenting REST API endpoints. There are tools that generate OpenAPI config files from code written in Python, Java and Scala, amongst others. You can go the other way and generate code from the OpenAPI config files (example for Python here).
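For reference, a minimal OpenAPI file from which such tools generate clients, servers and DTOs might look something like this (the endpoint and schema are illustrative, not taken from any real API):

```yaml
openapi: 3.0.3
info:
  title: Pet API        # illustrative
  version: "1.0"
paths:
  /pets/{petId}:
    get:
      operationId: getPet
      parameters:
        - name: petId
          in: path
          required: true
          schema:
            type: integer
      responses:
        "200":
          description: A single pet
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Pet"
components:
  schemas:
    Pet:                # becomes a generated DTO class
      type: object
      properties:
        id:
          type: integer
        name:
          type: string
```

Generators such as openapi-generator turn each entry under components/schemas into a DTO and each operationId into a typed client or server stub.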

An example can be found in the Apache Polaris codebase where Gradle builds the classes given a YAML file. IntelliJ quite cleverly recognises it as an OpenApi file and allows you to test the API by invoking the service through the IDE.

IntelliJ's version of Postman

It will even generate the DTOs as defined in the YAML file.

Databricks' DABs

If you're writing Databricks code in an IDE that is to be run on an ad hoc basis (rather than via some CI/CD pipeline), you might want to use the Databricks VSCode plugin. This will automatically build your Databricks Asset Bundle for you. Upon signing in, a databricks.yml file will be created at the root of your project. It contains the minimum amount of information needed to deploy your code to .bundle in your DBx home directory, under a sub-folder named after the bundle's name field.
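A minimal databricks.yml along these lines might look as follows (the bundle name and workspace host are illustrative):

```yaml
bundle:
  name: my_project      # the sub-folder under .bundle takes this name

targets:
  dev:
    default: true
    workspace:
      host: https://my-workspace.cloud.databricks.com   # illustrative
```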

You can also deploy bundles via VSCode. Configure a root_path under workspace in databricks.yml and, when you press the cloud button on the BUNDLE RESOURCE EXPLORER pane within the Databricks module:

Upload via the cloud button
the bundle will be uploaded to the workspace and directory specified. You can, of course, use the databricks CLI. But for ad hoc uploads, VS Code is very convenient. By default, deployments are to /Users/${workspace.current_user.userName}/.bundle/${bundle.target}/${bundle.name}.
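The root_path override sits under workspace in databricks.yml; a sketch (the path itself is illustrative, though the ${bundle.name} interpolation is standard bundle syntax):

```yaml
targets:
  dev:
    workspace:
      root_path: /Workspace/Shared/.bundle/${bundle.name}/dev   # illustrative
```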

Configuration can be for development or production mode. The advantage of development mode is that it "turns off any schedules and automatic triggers for jobs and turns on development mode for Delta Live Tables pipelines. This lets developers execute and test their code without impacting production data or resources." [1]
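The mode is set per target in databricks.yml; a sketch (target names are illustrative):

```yaml
targets:
  dev:
    mode: development   # pauses schedules/triggers; DLT pipelines run in development mode
    default: true
  prod:
    mode: production    # deploys as-is for scheduled, production runs
```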

[1] Data Engineering With Databricks, Cookbook.