tag:blogger.com,1999:blog-3724568676280197732024-03-18T03:27:54.563-07:00Agile Java ManMusings on Data Science, Software Architecture, Functional Programming and whatnot.Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.comBlogger484125tag:blogger.com,1999:blog-372456867628019773.post-33554011021104520002024-03-09T02:42:00.000-08:002024-03-09T02:45:35.982-08:00Big Data and CPU Caches I'd <a href="https://javaagile.blogspot.com/2024/02/spark-and-schemas.html">previously posted</a> about how Spark's data frame schema is an optimization not an enforcement. If you look at Spark's code, schemas save checking whether something is null. That is all. <div><br /></div><div>Can this really make so much of a difference? Surprisingly, omitting a null check can optimize your code by an order of magnitude. </div><div><br /></div><div>As ever, the devil is in the detail. A single null check is hardly likely to make a difference to your code. But when you are checking billions of times, you need to take it seriously. </div><div><br /></div><div>There is another dimension to this problem. If you're checking the same reference (or a small set of them) then you're probably going to be OK. But if you are null checking large numbers of references, this is where you're going to see performance degradation.</div><div><br /></div><div>The reason is that a small number of references can live happily in your CPU cache. As this number grows, they're less likely to be cached and your code will force memory to be loaded from RAM into the CPU.</div><div><br /></div><div>Modern CPUs cache data to avoid hitting RAM. My 2.40GHz Intel Xeon E-2286M has three levels of cache, each bigger (and slower) than the last:<br /><br /><span style="font-family: courier; font-size: xx-small;">$ sudo dmidecode -t cache </span></div><div><div><span style="font-family: courier; font-size: xx-small;">Cache Information </span></div><div><span style="font-family: courier; font-size: xx-small;"><span style="white-space: normal;"><span style="white-space: pre;"> </span>Socket Designation: L1 Cache </span> </span></div><div><span style="white-space: normal;"><span style="font-family: courier; font-size: xx-small;"><span style="white-space: pre;"> </span>Maximum Size: 512 kB </span></span></div><div><span style="font-family: courier; font-size: xx-small;">... </span></div><div><span style="font-family: courier; font-size: xx-small;"><span style="white-space: normal;"><span style="white-space: pre;"> </span>Socket Designation: L2 Cache </span> </span></div><div><span style="white-space: normal;"><span style="font-family: courier; font-size: xx-small;"><span style="white-space: pre;"> </span>Maximum Size: 2048 kB </span></span></div><div><span style="font-family: courier; font-size: xx-small;">... </span></div><div><span style="font-family: courier; font-size: xx-small;"><span style="white-space: normal;"><span style="white-space: pre;"> </span>Socket Designation: L3 Cache </span> </span></div><div><span style="font-family: courier; font-size: xx-small;"><span style="white-space: normal;"><span style="white-space: pre;"> </span></span>Maximum</span><span style="white-space: normal;"><span style="font-family: courier; font-size: xx-small;"> Size: 16384 kB </span> </span></div><div><br /></div></div><div>Consequently, the speed at which we can randomly access an array of 64-bit numbers depends on the size of the array. 
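As a rough illustration (a minimal, unscientific sketch; the JMH benchmark linked below measures this properly), we can time summing the same number of randomly chosen elements from arrays of increasing size:<br /><pre style="font-family: courier; font-size: xx-small;">
// A back-of-the-envelope sketch: the array contents are all zero, we only care about
// how long the random loads take as the array outgrows the CPU caches.
import java.util.Random;

public class RandomAccessTiming {
    public static void main(String[] args) {
        Random random = new Random(42);
        for (int size = 1 << 10; size <= 1 << 26; size <<= 4) {
            long[] data = new long[size];
            int[] indices = new int[10_000_000];
            for (int i = 0; i < indices.length; i++) indices[i] = random.nextInt(size);

            long start = System.nanoTime();
            long sum = 0;
            for (int index : indices) sum += data[index];   // random access defeats the cache as size grows
            long ms = (System.nanoTime() - start) / 1_000_000;
            // print sum so the JIT cannot eliminate the loop as dead code
            System.out.printf("%9d longs: %6d ms (sum=%d)%n", size, ms, sum);
        }
    }
}
</pre>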
The full JMH benchmark is <a href="https://github.com/PhillHenry/JavaPlayground/blob/main/benchmarking/src/test/java/uk/co/odinconsultants/memory/JMH_CacheHits.java">here</a> and its results look like this:</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjSIuo9geATQgyiQ_cc8cAqNM99yZvcQ_omk4h78QFEOoFjOt7TtQVGaHqwhnCw_u3CkT3ugq0VLXBW9IQ0J4z0FZUcYz0_vNErwBSrg5SLWUd7wgjuFc11ccutxB0JRDqV2iI5DUVyxruwYb0vsJoA4nj4IzxPKDwVbH8f_tdRAkegn9AJR49hJcb9XAoO/s640/cpu_cache_timings.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="480" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjSIuo9geATQgyiQ_cc8cAqNM99yZvcQ_omk4h78QFEOoFjOt7TtQVGaHqwhnCw_u3CkT3ugq0VLXBW9IQ0J4z0FZUcYz0_vNErwBSrg5SLWUd7wgjuFc11ccutxB0JRDqV2iI5DUVyxruwYb0vsJoA4nj4IzxPKDwVbH8f_tdRAkegn9AJR49hJcb9XAoO/s16000/cpu_cache_timings.png" /></a></div><br /><div>Who would have thought little optimizations on big data can make such a huge difference?</div><div><br /></div>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-69803311405932221962024-02-24T02:08:00.000-08:002024-02-24T02:08:40.120-08:00Home made Kubernetes cluster<p>When trying to run ArgoCD, I came across <a href="https://github.com/argoproj/argo-cd/issues/11783">this</a> problem that was stopping me from connecting. Using <span style="font-family: courier; font-size: xx-small;">kubectl port-forward...</span>, I was able to finally connect. But even then, if I ran:</p><p><span style="font-family: courier; font-size: xx-small;">$ kubectl get services --namespace argocd<br />NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE<br />argocd-applicationset-controller ClusterIP 10.98.20.142 <none> 7000/TCP,8080/TCP 19h<br />argocd-dex-server ClusterIP 10.109.252.231 <none> 5556/TCP,5557/TCP,5558/TCP 19h<br />argocd-metrics ClusterIP 10.106.130.22 <none> 8082/TCP 19h<br />argocd-notifications-controller-metrics ClusterIP 10.109.57.97 <none> 9001/TCP 19h<br />argocd-redis ClusterIP 10.100.158.58 <none> 6379/TCP 19h<br />argocd-repo-server ClusterIP 10.111.224.112 <none> 8081/TCP,8084/TCP 19h<br />argocd-server LoadBalancer 10.102.214.179 <b><pending></b> 80:30081/TCP,443:30838/TCP 19h<br />argocd-server-metrics ClusterIP 10.96.213.240 <none> 8083/TCP 19h</span></p><div>Why was my <span style="font-family: courier; font-size: x-small;">EXTERNAL-IP</span> still <b style="font-family: courier; font-size: x-small;">pending</b>? It appears that this is a natural consequence of running my K8s cluster in Minikube [<a href="https://stackoverflow.com/questions/44110876/kubernetes-service-external-ip-pending">SO</a>].</div><div><br /></div><div>So, I decided to build my own Kubernetes cluster. <a href="https://phoenixnap.com/kb/install-kubernetes-on-ubuntu">This</a> step-by-step guide proved really useful. I built a small cluster of 2 nodes on heterogeneous hardware. 
Note that although you can use different OSs and hardware, you really need to use the same version of K8s on all boxes (see this <a href="https://stackoverflow.com/questions/55767652/kubernetes-master-worker-node-kubeadm-join-issue">SO</a>).</div><div><br /></div><div><div><span style="font-family: courier; font-size: xx-small;">$ kubectl get nodes -o wide</span></div><div><span style="font-family: courier; font-size: xx-small;">NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME</span></div><div><span style="font-family: courier; font-size: xx-small;"><b>adele</b> Ready <none> 18h v1.28.2 192.168.1.177 <none> Ubuntu 18.04.6 LTS 5.4.0-150-generic containerd://1.6.21</span></div><div><span style="font-family: courier; font-size: xx-small;"><b>nuc</b> Ready control-plane 18h v1.28.2 192.168.1.148 <none> Ubuntu 22.04.4 LTS 6.5.0-18-generic containerd://1.7.2</span></div></div><div><br /></div><div>Great! However, Flannel did not seem to be working properly:<br /><br /><div><span style="font-family: courier; font-size: xx-small;">$ kubectl get pods --namespace kube-flannel -o wide </span></div><div><span style="font-family: courier; font-size: xx-small;">NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES</span></div><div><span style="font-family: courier; font-size: xx-small;">kube-flannel-ds-4g8gg 0/1 CrashLoopBackOff 34 (2m53s ago) 152m 192.168.1.148 <b>nuc</b> <none> <none></span></div><div><span style="font-family: courier; font-size: xx-small;">kube-flannel-ds-r4xvt 0/1 CrashLoopBackOff 26 (3m11s ago) 112m 192.168.1.177 <b>adele</b> <none> <none></span></div></div><div><br /></div><div>And <span style="font-family: courier; font-size: xx-small;">journalctl -fu kubelet</span> was puking <span style="font-family: courier; font-size: xx-small;">"Error syncing pod, skipping"</span> messages.</div><div><br /></div><div>Aside: Flannel is a container on each node that coordinates the segmentation of the virtual network. For coordination, it can use etcd, which can be thought of as being like Zookeeper in the Java ecosystem. "Flannel does not control how containers are networked to the host, only how the traffic is transported between hosts." [<a href="https://github.com/flannel-io/flannel?tab=readme-ov-file">GitHub</a>]</div><div><br /></div><div>The guide seemed to omit one detail, which led to the Flannel container puking something like this error:</div><div><br /></div><div><div><span style="font-family: courier; font-size: xx-small;">E0427 06:08:23.685930 13405 memcache.go:265] couldn’t get current server API group list: Get “https://X.X.X.X:6443/api?timeout=32s 2”: dial tcp X.X.X.X:6443: connect: connection refused</span></div></div><div><br /></div><div>Following <a href="https://stackoverflow.com/questions/50833616/kube-flannel-cant-get-cidr-although-podcidr-available-on-node">this</a> SO answer, I found that the cluster's CIDR had not been set. 
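(Apparently the usual way to avoid this in the first place is to pass the pod network CIDR when the control plane is created, e.g. <span style="font-family: courier; font-size: xx-small;">kubeadm init --pod-network-cidr=10.244.0.0/16</span>, which is the range Flannel expects by default.) 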
So, I patched it following <a href="https://stackoverflow.com/questions/52633215/kubernetes-worker-nodes-not-automatically-being-assigned-podcidr-on-kubeadm-join">this</a> [SO] advice, like so:<br /><br /><div><span style="font-family: courier; font-size: xx-small;">kubectl patch node <b>nuc</b> -p '{"spec":{"podCIDR":"10.244.0.0/16"}}'</span></div><div><span style="font-family: courier; font-size: xx-small;">kubectl patch node <b>adele</b> -p '{"spec":{"podCIDR":"10.244.0.0/16"}}'</span></div><div><br /></div>which will work until the next reboot (one of the SO answers describes how to make that permanent, as does <a href="https://stackoverflow.com/questions/61811883/flannel-is-crashing-for-slave-node">this</a> one).</div><div><br />Anyway, this was the puppy and now the cluster seems to be behaving well.</div><div><br /></div><div>Incidentally, this gives a lot of log goodies:</div><div><p><span style="font-family: courier; font-size: xx-small;">kubectl cluster-info dump</span></p></div>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-6407297228737112192024-02-15T06:02:00.000-08:002024-02-15T06:02:33.668-08:00Spark and Schemas<p>I helped somebody on Discord with a tricksy <a href="https://discord.com/channels/566333122615181327/566334526540742656/1207678606122295329">problem</a>. S/he was using a Python UDF in PySpark and seeing <span style="font-family: courier; font-size: x-small;">NullPointerException</span>s. This suggests a Java problem as the Python error message for an NPE looks more like "<span style="font-family: courier; font-size: x-small;">AttributeError: 'NoneType' object has no attribute</span> ..." But why would Python code cause Spark to throw an NPE?</p><p>The problem was that the UDF defined a <span style="font-family: courier; font-size: x-small;">returnType</span> struct stating that a <span style="font-family: courier; font-size: x-small;">StructField</span> was not nullable.<br /><br /></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3ceUQPlCqd1hBHg-r9gu4ajQAHbYxnL6nGdgN3lF2AWHd5TC252yb-M7OPHHntbCQrnL-sbk1SNLAFQ10ROP21V_9TNGqr_nAnXuDsuFmljvu0jqwdkL4oE27AYR2zc_MDjvg7WTH7lF78WXm83Be2w5wTubGuO6u6urRsXb5BoI2tTTSiHvI6ltt4m6v/s654/pyspark_udf.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="461" data-original-width="654" height="453" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3ceUQPlCqd1hBHg-r9gu4ajQAHbYxnL6nGdgN3lF2AWHd5TC252yb-M7OPHHntbCQrnL-sbk1SNLAFQ10ROP21V_9TNGqr_nAnXuDsuFmljvu0jqwdkL4oE27AYR2zc_MDjvg7WTH7lF78WXm83Be2w5wTubGuO6u6urRsXb5BoI2tTTSiHvI6ltt4m6v/w640-h453/pyspark_udf.png" width="640" /></a></div><br /><div>The line <span style="font-family: courier; font-size: x-small;">charge_type.lower</span> (highlighted) was a red herring as they had clearly changed more than one thing when experimenting (always change <b>one thing at a time!</b>).<br /><br />Note that Spark regards the <span style="font-family: courier; font-size: small;">nullable</span> field as advisory only.</div><div><div></div><blockquote><div>When you define a schema where all columns are declared to not have null values, <b>Spark will not enforce</b> that and will happily let null values into that column. 
The nullable signal is simply to help Spark SQL optimize for handling that column.</div><div>- Spark, The Definitive Guide</div></blockquote><div></div><div>And the reason is in <a href="https://github.com/apache/spark/blob/edf4ac4b518d0d69f7012ff5c0f1428fe45412ba/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala#L130">this</a> code where Spark is generating bespoke code. If <span style="font-family: courier; font-size: x-small;">nullable</span> is false, it does not check the reference unnecessarily. But if the reference is null, Spark barfs like so:<br /><br /><div><span style="font-family: courier; font-size: xx-small;">Caused by: java.lang.NullPointerException</span></div><div><span style="font-family: courier; font-size: xx-small;"> at org.apache.spark.sql.catalyst.expressions.codegen.UnsafeWriter.write(UnsafeWriter.java:110)</span></div><div><span style="font-family: courier; font-size: xx-small;"> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.writeFields_0_2$(Unknown Source)</span></div><div><span style="font-family: courier; font-size: xx-small;"> at org.apache.spark.sql.catalyst.expressions.<b>GeneratedClass</b>$SpecificUnsafeProjection.apply(Unknown Source)</span></div><div><span style="font-family: courier; font-size: xx-small;"> at org.apache.spark.sql.execution.python.EvalPythonExec.$anonfun$doExecute$11(<b>EvalPythonExec</b>.scala:148)</span></div><div><br /></div><div>So, the Python code returned without an NPE but caused the JVM code to error as the struct it returned contained nulls when it said it wouldn't.<br /><br /></div></div></div>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-26429295234284624942024-01-28T03:57:00.000-08:002024-01-28T03:57:33.767-08:00The Death of Data Locality?<div>Data locality is where the computation and the storage are on the same node. This means we don't need to move huge data sets around. But it's a pattern that has fallen out of fashion in recent years.</div><div><br /></div><div>With a lot of cloud offerings, we lose the data locality that made <a href="https://github.com/apache/hadoop/blob/b2fac14828b69c761858dd7cb9ab17313c28b161/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/ShortCircuitReplica.java#L277">Hadoop</a> such a great framework on which to run Spark some 10 years ago. The cloud providers counter this with a "just rent more nodes" argument. But if you have full control over your infra, say you're on prem, throwing away data locality is a huge waste.<div><br /></div><div>Just to recap, data locality gives you doubleplusgood efficiency. Not only does the network not take a hit (as it doesn't need to send huge amounts of data from storage to compute nodes) but we retain OS treats like caching. </div><div><br /></div><div>What? The OS has built in caching? Have you ever <span style="font-family: courier;">grep</span>ped a large directory and then noticed that executing the same command a second time is orders of magnitude faster than the first time? That's because modern operating systems leave pages in memory unless there is a reason to dispose of them. 
So, most of the time, there is no point in putting some caching layer on the same machine as the database - a strange anti-pattern I've seen in the wild.</div><div><br /></div><div>Of course, none of this is available over the network.</div><div><br /></div><div>Another advantage of having the data locally is that apps can employ a pattern called "<a href="https://javaagile.blogspot.com/2023/11/memories-are-made-of-these.html">memory mapping</a>". The idea is that as far as the app is concerned, a file is just a location in memory. You read it just like you would a sequence of bytes in RAM. Hadoop takes advantage of <a href="https://github.com/apache/hadoop/blob/40467519399c0e51beb25d6e557c55859382e8cf/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MemoryMappableBlockLoader.java#L79">this</a>.</div><div><br /></div><div>Why is memory mapping useful? Well, you don't even need to make <a href="https://javaagile.blogspot.com/2012/07/why-system-calls-are-slow.html">kernel calls</a> so there is no context switching and certainly no copying data. <a href="https://github.com/PhillHenry/JavaPlayground/blob/bed20b52d89930e83a279480ef8eb4e5b9171441/src/main/java/uk/co/odinconsultants/memory/MemoryMapMain.java">Here</a> is an example of how to do this in Java. You can prove to yourself that there are no kernel calls by running:</div><div><br /></div><div><span style="font-family: courier; font-size: xx-small;">sudo strace -p $(jstack $(jps | grep MemoryMapMain | awk '{print $1}') | grep ^\"main | perl -pe s/\].*\//g | perl -pe s/.*\\[//g)</span></div><div><br /></div><div>Note there are kernel calls in setting up the memory mapping but after that, there is nothing as we read the entire file.</div><div><br />So, why have many architects largely abandoned data locality? It's generally a matter of economics as the people at MinIO point out <a href="https://min.io/solutions/hdfs-migration">here</a>. The idea is that if your data is not homogeneous, you might be paying for, say, 16 CPUs on a node that's just being used for storage. An example might be that you have a cluster with 10 years of data but you mainly use the last two years. If the data for the first eight years is living on expensive hardware and rarely accessed, that could be a waste of money.</div><div><br /></div><div>So, should you use data locality today? The answer, as ever, is "it depends".<br /><br /></div></div>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-85056542960323001782024-01-23T04:12:00.000-08:002024-01-23T05:24:01.002-08:00Avoiding Spark OOMEs<div>Spark can process more data than it can fit into memory. So why does it sometimes fail with <span style="font-family: courier;">OutOfMemoryError</span>s when joining unskewed data sets?</div><div><br /></div><div>An interesting way to counter OOMEs in a large <span style="font-family: courier;">join</span> is <a href="https://stackoverflow.com/questions/70317018/pyspark-salting-an-inner-join-in-the-presence-of-skew">here</a> [SO] where rows are given a random integer seed that is used in addition to the usual condition. In theory, this breaks down the data into more manageable chunks.</div><div><br /></div>
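<div>A minimal sketch of that salting idea in Java (the frame names <span style="font-family: courier;">left</span> and <span style="font-family: courier;">right</span>, the <span style="font-family: courier;">id</span> key and the number of salt buckets are all hypothetical; the linked answer does the same in PySpark):</div><div><br /></div><pre style="font-family: courier; font-size: xx-small;">
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import static org.apache.spark.sql.functions.*;

// ...
int nSalts = 16;
// Each left-hand row lands in one of nSalts buckets at random.
Dataset<Row> saltedLeft  = left.withColumn("salt", floor(rand().multiply(lit(nSalts))));
// Each right-hand row is duplicated once per bucket so every (id, salt) pair can still match.
Dataset<Row> saltedRight = right.withColumn("salt", explode(sequence(lit(0), lit(nSalts - 1))));
// Joining on (id, salt) spreads each key over nSalts smaller tasks.
Dataset<Row> joined = saltedLeft.join(
        saltedRight,
        saltedLeft.col("id").equalTo(saltedRight.col("id"))
                .and(saltedLeft.col("salt").equalTo(saltedRight.col("salt"))));
</pre><div><br /></div>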
<div>Another standard exercise is to <span style="font-family: courier;"><a href="https://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/Dataset.html">repartition</a></span> the data. But this causes a shuffle and it may actually be the <b><span style="font-family: courier;">repartition</span> itself</b> that causes the OOME.</div><div><br /></div><div>In practice, I've found persisting the data frame to disk and reading it back yields better results. The number of partitions being written is rarely the number that is read back. That is, you get a more natural partition for free (or almost free. Obviously, some time is taken in writing to disk). And there is no repartition that could throw an OOME.</div><div><br /></div><div><a href="https://discord.com/channels/566333122615181327/566334526540742656/1198309575057739917">This</a> question came up on Discord where somebody is trying to <span style="font-size: medium;">crossJoin</span> a huge amount of data. I suggested a solution that uses <span style="font-family: courier;">mapPartitions</span>. The nice thing about this method is that your code is passed a lazy data structure. As long as you don't try to call something like <span style="font-family: courier;">toList</span> on it, it will pull data into memory as needed and garbage collect it after it's written out.</div><br />By using a lazy <span style="font-family: courier; font-size: small;">Iterator</span>, Spark can write to disk far more data than it can hold in memory. As Spark consumes from the <span style="font-family: courier; font-size: x-small;">Iterator</span>, it measures its memory. When it starts looking a bit full, it flushes to disk. Here is the memory usage of <a href="https://github.com/PhillHenry/SparkPlayground/blob/main/src/main/scala/uk/co/odinconsultants/spark/katas/BigDataSmallMemoryMain.scala">this</a> code that uses <span style="font-family: courier;">mapPartitions</span> to write to <span style="font-family: courier;">/tmp/results_parquet</span> a data set that is much larger than the JVM's heap:<br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiIQl9Sw9iV0mvs3p90WA65PNL6BR9ek2GRltJvAD6lUPZa-pNoX9yJ5vp8N04sCm-GzyCHfbtq1pxhh7ay3gK2AFxw2yYakzpVKnDEV5S8KuBmh5SeSzvhrLE1Cy8P4UYQr7okPu55KQvdYZps1cnWx-_NiyYeOAOvu1hjeF7RFjfp18775V7WDK6vM5_N/s1079/Spark512mbWriting1.4gb.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="932" data-original-width="1079" height="552" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiIQl9Sw9iV0mvs3p90WA65PNL6BR9ek2GRltJvAD6lUPZa-pNoX9yJ5vp8N04sCm-GzyCHfbtq1pxhh7ay3gK2AFxw2yYakzpVKnDEV5S8KuBmh5SeSzvhrLE1Cy8P4UYQr7okPu55KQvdYZps1cnWx-_NiyYeOAOvu1hjeF7RFjfp18775V7WDK6vM5_N/w640-h552/Spark512mbWriting1.4gb.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Spark with 0.5gb heap writing 1.3gb files</td></tr></tbody></table>If we run:<br /><br /><span style="font-family: courier;">watch "du -sh /tmp/results_parquet"</span><div><br /></div><div>we can see that upon each GC, more is written to disk.</div><div><br /></div><div>The result is that a huge dataframe that could not fit into memory can now be <span style="font-family: courier;">join</span>ed with another.</div><div><br /></div><div>As an aside: Uber has been doing some work on dealing with OOMEs in Spark. See their article <a href="https://www.uber.com/en-US/blog/dynamic-executor-core-resizing-in-spark/">here</a>. 
TL;DR; they're proposing that in the event of an OOME, Spark adapts and increases the memory to CPU ratio by asking some cores to step down before it re-attempts the failed stage. Ergo, each compute unit has more memory than before. </div><div><br /></div>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-85691200919942234672024-01-11T03:46:00.000-08:002024-01-11T03:46:45.682-08:00Hilbert Curves<p>When you want to cluster data together over multiple dimensions, you can use <a href="https://javaagile.blogspot.com/2023/11/z-order.html">Z-Order</a>. But a better algorithm is the Hilbert Curve, a fractal that makes a best attempt to keep adjacent points together in a 1-dimensional space.</p><p>From Databricks' Liquid Clustering design <a href="https://docs.google.com/document/d/1FWR3odjOw4v4-hjFy_hVaNdxHVs4WuK1asfB6M6XEMw/edit">doc</a> we get this graphical representation of what it looks like:<br /></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgP8l2LB7rvZDnnRwAsEq1QYwtdqr_nSooEPejvLft2m-lY9LlcXYbjZc5yHBoOtQVNdg67dy_bu9k_er8UjyMogSATbrmGK-eeVtt9yS8CyRnPjAACh7kRzJ_DFlLMrVbZ-wlKHKw7Vh8c1CL03LL2cfzTBuvuxLsoduyE8u_i7Q-Zz27oyYLhjRkUsZfJ/s516/hilbert_curve.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="516" data-original-width="499" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgP8l2LB7rvZDnnRwAsEq1QYwtdqr_nSooEPejvLft2m-lY9LlcXYbjZc5yHBoOtQVNdg67dy_bu9k_er8UjyMogSATbrmGK-eeVtt9yS8CyRnPjAACh7kRzJ_DFlLMrVbZ-wlKHKw7Vh8c1CL03LL2cfzTBuvuxLsoduyE8u_i7Q-Zz27oyYLhjRkUsZfJ/s320/hilbert_curve.png" width="309" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Dotted line squares represent files</td></tr></tbody></table><br /><div>A Hilbert curve has the property that adjacent nodes (on the red line, above) have a distance of 1. Note that a property of the Hilbert curve is that adjacent points on the curve are nearest neighbours in the original n-dimensional space but the opposite is not necessarily true. Not all nearest neighbours in the n-dimensional space are adjacent on the curve. How could they be if points have more than 2 neighbours in the original space?<div><br /></div><div>An algorithm in C for navigating this square can be found <a href="https://www.compuphase.com/hilbert.htm">here</a>. A Python toolkit for handling Hilbert curves can be found <a href="https://github.com/PrincetonLIPS/numpy-hilbert-curve">here</a> [GitHub]. And a Java implementation can be found <a href="https://stackoverflow.com/questions/499166/mapping-n-dimensional-value-to-a-point-on-hilbert-curve">here</a> [SO].</div><div><br /></div><div>The application of this in Big Data is that the data is now sorted. If we were to read through the files following the red line, then each node we encounter is one away from the last. <a href="https://javaagile.blogspot.com/2023/11/z-order.html">Z-Ordering</a> does not have this property. 
</div><div><br /></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4t0uQ95rLklLc1CKu51-hCsV5eRnvVyKAO-Ys6w3yKh6AW0Xy5oqW3rGxIezbJ4HMPfPRvo7YCszQmLV_nqWmVg7GaLsMG8xVQAP_a0JKU4N1oLJhyRmeZawEshlchMhXDogQNsWsO_fo_zKIvz8TSwmTVFhC_39-1bKxraTJ49EgsDn3REvUl2PCdh0Z/s455/zorder_dense_2d.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="455" data-original-width="436" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4t0uQ95rLklLc1CKu51-hCsV5eRnvVyKAO-Ys6w3yKh6AW0Xy5oqW3rGxIezbJ4HMPfPRvo7YCszQmLV_nqWmVg7GaLsMG8xVQAP_a0JKU4N1oLJhyRmeZawEshlchMhXDogQNsWsO_fo_zKIvz8TSwmTVFhC_39-1bKxraTJ49EgsDn3REvUl2PCdh0Z/w384-h400/zorder_dense_2d.png" width="384" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Z-ordering. Lines indicate contiguous data. Colours indicate different files.</td></tr></tbody></table><br /><div>Unlike the Hilbert curve at the top of this page, there are some large jumps. In fact, the average step is not 1.0 as for the Hilbert curve but 1.557 in this example - over 50% more!</div><div><br /></div><div>This greater efficiency holds even when we drop the unlikely assumption that the data is tightly packed. Below are examples where the data is more realistic and not every possible point (a red +) is actually associated with data (a blue circle).</div><div><br /></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjlMqAqOMxR36UdBn6bWBbTro-7McUj2T4AnicNX-ObQxSJSnXNCFFhvK5Cp7pkflag2svTrhZdfi-vlOa_fdgygWTqzjWhrgQ_uQTdecBPB14SbsKtB9EcS-LWgmJUprNWXuxTRtAy7yOE2snuZKJLAANxvVCXNXFVShg3aO_555d4idJvMCT4iWKNJMBs/s455/example_2d_color.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="455" data-original-width="436" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjlMqAqOMxR36UdBn6bWBbTro-7McUj2T4AnicNX-ObQxSJSnXNCFFhvK5Cp7pkflag2svTrhZdfi-vlOa_fdgygWTqzjWhrgQ_uQTdecBPB14SbsKtB9EcS-LWgmJUprNWXuxTRtAy7yOE2snuZKJLAANxvVCXNXFVShg3aO_555d4idJvMCT4iWKNJMBs/w384-h400/example_2d_color.png" width="384" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">A Hilbert curve over sparse data</td></tr></tbody></table><br /><div>To understand what is going on, we need to appreciate <a href="https://en.wikipedia.org/wiki/Gray_code">Gray Codes</a> [Wikipedia], which are an alternative numbering system in binary where adjacent numbers only differ by one bit changing (see that parallel with Hilbert curves?). For each bit, for each dimension, we create a mask from the Gray code and do some bit manipulation found <a href="https://github.com/PrincetonLIPS/numpy-hilbert-curve/blob/main/hilbert/encode.py">here</a> and we'll eventually have a bijective map ℤ<sup>d</sup> → ℤ.</div><div><br /></div></div>
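<div>The Gray code itself is a one-liner and the "bit manipulation" is essentially interleaving. A minimal sketch in Java of those two building blocks (the real encoder linked above also applies per-dimension masks and rotations on top of this, so this is only the flavour of it):</div><div><br /></div><pre style="font-family: courier; font-size: xx-small;">
// Gray code: adjacent integers differ by exactly one bit - the parallel with the Hilbert curve.
static int grayCode(int i) {
    return i ^ (i >>> 1);
}

// Interleave the bits of two 16-bit coordinates into a single value (the z-order style step).
static int interleave(int x, int y) {
    int result = 0;
    for (int bit = 0; bit < 16; bit++) {
        result |= ((x >> bit) & 1) << (2 * bit);       // even bit positions take x's bits
        result |= ((y >> bit) & 1) << (2 * bit + 1);   // odd bit positions take y's bits
    }
    return result;
}
</pre><div><br /></div>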
<div>The jumps between adjacent data points are less extreme in Hilbert curves. You can see this by-eye if you look at a slightly larger space (code <a href="https://github.com/PhillHenry/MathematicalPlayground/blob/master/graphics/hilbert_2d.py">here</a>):</div><div><br /></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpdTxYBXYjhIAbRCZPp4HWcybBJQiqLdJ4Sl5erShyf5bP-EjGNzNKAWSfbrhYSQzcoFXgOrsuiaVy_4oqRDajTR8S9n3-VQPC6mcJO_RJY1W8Ac-49IXtnCqO0PPX0GLK7THrcIfB2bo-JRRRdk2Vhd2yuRRlWZbAF8EUfvPJf3ZPASIbmwlaEXYyK4Ke/s455/hilbert_2d.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="455" data-original-width="436" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpdTxYBXYjhIAbRCZPp4HWcybBJQiqLdJ4Sl5erShyf5bP-EjGNzNKAWSfbrhYSQzcoFXgOrsuiaVy_4oqRDajTR8S9n3-VQPC6mcJO_RJY1W8Ac-49IXtnCqO0PPX0GLK7THrcIfB2bo-JRRRdk2Vhd2yuRRlWZbAF8EUfvPJf3ZPASIbmwlaEXYyK4Ke/w384-h400/hilbert_2d.png" width="384" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">A Hilbert curve over sparse data</td></tr></tbody></table><br /><div>Typically, the jumps between data points are rarely more than a couple of positions (average of 1.433). Now, compare this to a similar space using Z-Ordering:</div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIJDR7pK5y0bkolup4xMOXMJEen9TrTmDB9-XuzUS47Qv7u4Cl2EtQf_KG6sW3Und2P7XHjspruhyszL_I1ZPwAzIb3QdqfmtHCj6TS9pghktKR7lGXJmvHCHv1fEDsC5K-ZnZ6GILBj8l5bdS9krMW5yuIrDEys8NLK3aKNE7j0vaSJhFFmMz6LhgBKRn/s455/zorder_sparse_2d.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="455" data-original-width="436" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIJDR7pK5y0bkolup4xMOXMJEen9TrTmDB9-XuzUS47Qv7u4Cl2EtQf_KG6sW3Und2P7XHjspruhyszL_I1ZPwAzIb3QdqfmtHCj6TS9pghktKR7lGXJmvHCHv1fEDsC5K-ZnZ6GILBj8l5bdS9krMW5yuIrDEys8NLK3aKNE7j0vaSJhFFmMz6LhgBKRn/w384-h400/zorder_sparse_2d.png" width="384" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Z-Order over a similar sparse space</td></tr></tbody></table><br /><div>and you can see larger jumps between some data points. The average is 2.083 in this run. That's 45% higher than in the Hilbert curve.</div><div><br /></div><div>Hilbert curves are not currently implemented in Apache Iceberg but are in Databricks' Delta Lake.</div>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-42962169144783359902024-01-03T09:29:00.000-08:002024-01-12T02:24:36.998-08:00GPU vs CPU vs AVX<div class="separator" style="clear: both; text-align: left;"><br />Vector databases are all the rage. So, I looked at three different ways of multiplying vectors: CPU, GPU and Advanced Vector Extensions that leverage <a href="https://en.wikipedia.org/wiki/Single_instruction,_multiple_data">SIMD</a> instructions if your hardware supports them. To access the GPU, I'm using the <a href="https://javaagile.blogspot.com/2023/10/java-and-gpu.html">Tornado Java VM</a>. 
For AVX, I'm using the JVM's <span style="font-family: courier; font-size: xx-small;">jdk.incubator.vector</span> module, available since JDK16.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">(Code in my GitHub repo <a href="https://github.com/PhillHenry/Victor">here</a>).</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">The reason we're looking at vector multiplication is that searching for vectors (what the vector DB is all about) usually uses something like the approximate nearest neighbour algorithm. One way to implement it is something like Ethan Lui's implementation mentioned in a past blogpost <a href="https://javaagile.blogspot.com/2023/08/can-we-apply-ml-to-logging.html">here</a>. Briefly: it multiplies your vector by random vectors resulting in a vector whose bits are on or off depending on the sign of each element in the product.</div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">The results are as follows (note, the GPU is a Quadro T2000 that apparently has 4gb of memory, 1024 cores and a bandwidth of 128 gigabits per second).</div><div class="separator" style="clear: both; text-align: left;"><br /></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhthzqgk8WI2bxT7dQ96BxVGn6dnG1_KNQpuJdfubtD7DG-p2RJru6j-fCN3nJ7b3yMMpr7AdYB2rM9-mQzdYSyBhQ9m3P8Szfwl6BW2uXbyKDvhmfTIntxS-qSuON-GJy607q-1SazSKzNV4_-AxDmZ9nwV_A9QG3toDHv0AVa57sYSX1ughqfOMvxbYi_/s640/gpu_vs_cpu_avx.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="480" data-original-width="640" height="480" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhthzqgk8WI2bxT7dQ96BxVGn6dnG1_KNQpuJdfubtD7DG-p2RJru6j-fCN3nJ7b3yMMpr7AdYB2rM9-mQzdYSyBhQ9m3P8Szfwl6BW2uXbyKDvhmfTIntxS-qSuON-GJy607q-1SazSKzNV4_-AxDmZ9nwV_A9QG3toDHv0AVa57sYSX1ughqfOMvxbYi_/w640-h480/gpu_vs_cpu_avx.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"></td></tr></tbody></table><div class="separator" style="clear: both; text-align: left;">You can see that there is a huge fixed cost to using the GPU but once you get sufficiently large vectors, it's worth it. 
But what causes this fixed cost?</div><div><br /></div><div>On my Intel Xeon E-2286M CPU @ 2.40GHz, kernel calls typically take 17.8ns.<br /><br /><div><span style="font-family: courier; font-size: xx-small;"> 17.776 ±(99.9%) 0.229 ns/op [Average]</span></div><div><span style="font-family: courier; font-size: xx-small;"> (min, avg, max) = (17.462, 17.776, 19.040), stdev = 0.306</span></div><div><span style="font-family: courier; font-size: xx-small;"> CI (99.9%): [17.547, 18.005] (assumes normal distribution)</span></div><div><br /></div>JNI calls take a little longer at about 21.9ns:<br /><br /><div><span style="font-family: courier; font-size: xx-small;"> 21.853 ±(99.9%) 0.488 ns/op [Average]</span></div><div><span style="font-family: courier; font-size: xx-small;"> (min, avg, max) = (21.345, 21.853, 23.254), stdev = 0.651</span></div><div><span style="font-family: courier; font-size: xx-small;"> CI (99.9%): [21.365, 22.340] (assumes normal distribution)</span></div><br /><div>So, it doesn't seem that the fixed costs incurred in the GPU vector multiplication are due to context switching when calling the kernel or calls via JNI.</div></div><div><br /></div><div>Note the maximum vector size for this test was 8 388 608 <span style="font-family: courier;">float</span>s. </div><div><br /></div><div>That's 268 435 456 bits or 0.25 gigabits.</div><div><br /></div>Based on just bandwidth alone and ignoring everything else, each call should be about 1.95ms. This matches the average observed time (1.94971ms). <div><br /></div><div>This suggests the actual calculation is incredibly fast and only the low bandwidth is slowing it down. Tornado VM appears to have minimal room for improvement - you really are getting the best you can out of the hardware.</div><div><br /></div>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-56298778171900854262023-12-23T01:01:00.000-08:002023-12-23T01:09:44.119-08:00Cloud native<p>A cloud native approach to writing code assumes that the instance in which it lives can die at any time.</p><p></p><blockquote>"Users sometimes explicitly send the SIGKILL signal to a process using kill -KILL or kill -9. However, this is generally a mistake. A well-designed application will have a handler for SIGTERM that causes the application to exit gracefully, cleaning up temporary files and releasing other resources beforehand. Killing a process with SIGKILL bypasses the SIGTERM handler." - The Linux Programming Interface (Michael Kerrisk)</blockquote>Using <span style="font-family: courier;">docker stop</span> sends <span style="font-family: courier;">SIGTERM</span>. <br />Using <span style="font-family: courier;">docker kill</span> sends <span style="font-family: courier;">SIGKILL</span>.<br /><br />The latter does not give the JVM a chance to clean up. In fact, no process in any language has the chance to clean up with <span style="font-family: courier;">SIGKILL</span>. (SIGTERM on <i>any</i> thread - not just <span style="font-family: courier;">main</span> - causes the whole JVM process to end and shutdown hooks to execute.) 
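<br /><br />For example, a minimal sketch in Java of a shutdown hook that cleans up a temporary file when the JVM receives SIGTERM (the file here is just for illustration):<br /><pre style="font-family: courier; font-size: xx-small;">
import java.nio.file.Files;
import java.nio.file.Path;

public class GracefulMain {
    public static void main(String[] args) throws Exception {
        Path scratch = Files.createTempFile("scratch", ".tmp");
        // Runs on SIGTERM (docker stop) or normal exit, but never on SIGKILL (docker kill).
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                Files.deleteIfExists(scratch);      // release temporary resources
                System.out.println("Cleaned up " + scratch);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }));
        Thread.sleep(Long.MAX_VALUE);               // pretend to do work until we are terminated
    }
}
</pre>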
<br /><br /><u>A Tini problem...</u><br /><br />If the JVM process creates another process and is then killed with SIGKILL, that child process carries on living but its parent becomes (on Ubuntu 20.04.6 LTS) <span style="font-family: courier;">systemd</span> which in turn is owned by <span style="font-family: courier;">init</span> (PID 1).<p></p><p>Running your JVM directly in a Docker container has some issues. This revolves around Linux treating PID 1 as special. And the <span style="font-family: courier;">ENTRYPOINT</span> for any Docker container is PID 1.<br /><br />In Linux, PID 1 should be <span style="font-family: courier;">init</span>. On my Linux machine, I see:<br /></p><p><span style="font-family: courier; font-size: xx-small;">$ ps -ef | head -2<br />UID PID PPID C STIME TTY TIME CMD<br />root <b>1</b> 0 0 Oct21 ? 00:18:23 /sbin/<b>init</b> splash</span></p><p>This process serves a special purpose. It handles SIGnals and zombie processes. Java is not built with that in mind so it's best to bootstrap it with a small process called <span style="font-family: courier;"><a href="https://github.com/krallin/tini">tini</a></span>. There's a good discussion of why this is important <a href="https://github.com/krallin/tini/issues/8">here</a> on GitHub. Basically, Tini will forward the signal that killed the JVM onto any zombies that are left behind. This gives them the chance to clean up too. </p><p>It also passes the JVM's exit code on so we can know how it failed. Exit codes 0-127 are reserved [<a href="https://stackoverflow.com/questions/17671234/is-there-a-complete-list-of-jvm-exit-codes">SO</a>] and the value of the kill (<span style="font-family: courier;">kill -l</span> lists them) is added to 128. If you want to set the exit code in the shutdown hook, note you need to call <span style="font-family: courier;">Runtime.halt</span> rather than <span style="font-family: courier;">Runtime.exit</span> (to which <span style="font-family: courier;">System.exit</span> delegates). The <span style="font-family: courier;">exit</span> method will cause the JVM to hang in this situation [<a href="https://stackoverflow.com/questions/19798452/what-happens-if-system-exit-is-called-again-while-a-jvm-shutdown-is-already-in">SO</a>].</p>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-74004887338569213832023-12-12T05:58:00.000-08:002023-12-12T05:58:58.475-08:00ML and Logs (pt2)<p>Further to my <a href="https://javaagile.blogspot.com/2023/08/can-we-apply-ml-to-logging.html">attempt</a> to use machine learning to make sense of huge amounts of logs, I've been looking at the results. My <a href="https://github.com/PhillHenry/KafkaPlayground/tree/main/src/main/python">PoC</a> can:</p><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px; text-align: left;"><p><u>Find log entries with the highest information</u></p><p>When debugging my Kafka cluster, these lines had the highest average entropy:</p><p><span style="font-family: courier; font-size: xx-small;">kafka1: 2023-07-04 14:14:18,861 [RaftManager id=1] Connection to node 3 (kafka3/172.31.0.4:9098) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)</span></p><p><span style="font-family: courier; font-size: xx-small;">kafka1: 2023-07-04 14:17:32,605 [RaftManager id=1] Connection to node 2 (kafka2/172.31.0.3:9098) could not be established. Broker may not be available. 
(org.apache.kafka.clients.NetworkClient)</span></p><p><span style="font-family: courier; font-size: xx-small;">kafka2: 2023-07-04 14:17:31,957 [TransactionCoordinator id=2] Connection to node 3 (localhost/127.0.0.1:9093) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)</span></p><p><span style="font-family: courier; font-size: xx-small;">kafka1: 2023-07-04 14:17:32,605 [RaftManager id=1] Node 2 disconnected. (org.apache.kafka.clients.NetworkClient)</span></p><p><span style="font-family: courier; font-size: xx-small;">kafka2: 2023-07-04 14:17:31,957 [TransactionCoordinator id=2] Node 3 disconnected. (org.apache.kafka.clients.NetworkClient)</span></p><p>As it happened, this correctly highlighted my problem (Docker Compose networking was misconfigured). But I don't know if I got lucky.</p><p><u>Bucket similar-but-different lines</u></p><p>Using the same algorithm as Twitter, we can bucket similar but lexically different lines, for example:</p><p><span style="font-family: courier; font-size: xx-small;">2023-07-04 14:14:21,480 [QuorumController id=3] ConfigResource(type=TOPIC, name='__consumer_offsets'): set configuration <b>cleanup.policy</b> to <b>compact</b> (org.apache.kafka.controller.ConfigurationControlManager)</span></p><p><span style="font-family: courier; font-size: xx-small;">2023-07-04 14:14:21,489 [QuorumController id=3] ConfigResource(type=TOPIC, name='__consumer_offsets'): set configuration <b>compression.type</b> to <b>producer</b> (org.apache.kafka.controller.ConfigurationControlManager)</span></p><p>This means that we can:</p></blockquote><p style="text-align: left;"></p><ul style="text-align: left;"><ul><li>discard boilerplate lines of little value like those above</li><li>check the distribution of all nodes in a given bucket (for example, if one node is under-represented within a bucket - that is, not logging the same as its peers - this might be an issue).</li></ul></ul><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px; text-align: left;"><div>There's one slight gotcha here: in the Kafka example above, we're using the Raft protocol so it's not too surprising that the number of nodes is N-1 for some configurations as one has been elected leader and the others are followers.</div><p><u>Trace high information tokens through the system</u></p><p>Words with high entropy can be traced across my cluster. For instance, my PoC classified <span style="font-family: courier;">wUi1RthMRPabI8rHS_Snig</span> as possessing high information. This happens to be an internal Kafka UUID for a topic and tracing its occurrence through the logs show that despite Docker <a href="https://javaagile.blogspot.com/2023/09/spark-kafka-and-docker.html">network issues</a>, all nodes agreed on the topic ID as did the client. So, clearly some communication was happening despite the misconfiguration.</p></blockquote><p></p><p><u>Investigation</u></p><p>I finally solved my Kafka problem. The Kafka client was running on the host OS and could see the individual Kafka containers but these brokers could not talk to each other. The reason was they needed to advertise themselves both as localhost (for the sake of the Kafka client that lives outside Docker) and also using their internal names (so they could talk within the Docker network).</p><p>My PoC could not tell me exactly what the problem was but it successfully highlighted the suspects.</p><p><u>The PoC</u></p><p>So, how does the PoC work? 
For the entropy, we train the model on a dictionary of English words so it can learn what is a "normal" word, rather than say <span style="font-family: courier;">wUi1RthMRPabI8rHS_Snig</span>. We disregard lines that are fewer than 6 words (including the FQN of the classes - each package being one word); take the average entropy and present the lines that look the most informative.</p><p>For the LSH, we use one-hot encoding of word shingles to create our vectors.</p><p><u>Future plans</u></p><p>I'd like to show the graph of paths the high-entropy words take through the system (node and log line).</p><p>I'd also like to try other systems. Maybe I got lucky with Kafka as there are lovely, high-entropy UUIDs scattered throughout the logs (for example, consumer group IDs).</p><p>Thirdly, this PoC has been great for small amounts of data, but what about big data? It really needs to be rewritten in a JVM language and made to run in Spark.</p>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-55124431248079336102023-11-30T22:25:00.000-08:002023-11-30T22:25:14.640-08:00Memories are made of these<p>Some notes on new memory models I've been looking at recently.<br /><br /><u>Zero copy</u><br /><br />"Device controllers cannot do DMA directly into user space, but the same effect is achievable by exploiting ... [the fact] more than one virtual address can refer to the same physical memory location. [Thus] the DMA hardware (which can access only physical memory addresses) can fill a buffer that is simultaneously visible to both the kernel and a user space process." - Java NIO, Ron Hitchens</p><p>Virtual memory paging is "often referred to as swapping, though true swapping is done at the process level, not the page level" [ibid].</p><p>An excellent visual representation of what's going on during a zero-copy is <a href="https://2minutestreaming.beehiiv.com/p/apache-kafka-zero-copy-operating-system-optimization">here</a> from Stanislav Kozlovski (who ends with the knock-out punch that it makes very little difference to Kafka since generally costs of network IO and encryption cancel any savings). Anyway, the take-away points are: </p><p></p><ul style="text-align: left;"><li>Zero-copy "<b>doesn’t</b> actually mean you make literally zero copies" it's just that it "does not make unnecessary copies of the data."</li><li>Fewer context switches happen.</li><li>A further optimization to DMA is where the disk "read buffer directly copies data to the NIC buffer - not to the socket buffer. This is the so-called <b>scatter-gather</b> operation (a.k.a Vectorized I/O). [It is] the act of only storing read buffer <i>pointers</i> in the socket buffer, and having the DMA engine read those addresses directly from memory."</li></ul><p></p><p><u>Java's new vector API</u><br /><br />A new way of dealing with vectors is outlined at <a href="https://openjdk.org/jeps/426">JEP426</a> (Vector API). It leverages new CPU features like <a href="https://en.wikipedia.org/wiki/Advanced_Vector_Extensions">Advanced Vector Extensions</a> [Wikipedia] that provide new machine instructions to execute Single Instruction, Multiple Data (SIMD) operations. 
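<br /><br />For a flavour of what the API looks like, here is a minimal sketch (element-wise multiplication of two <span style="font-family: courier;">float</span> arrays; it needs <span style="font-family: courier;">--add-modules jdk.incubator.vector</span>):<br /><pre style="font-family: courier; font-size: xx-small;">
import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.VectorSpecies;

public class VectorMultiply {
    private static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

    static void multiply(float[] a, float[] b, float[] result) {
        int i = 0;
        int upperBound = SPECIES.loopBound(a.length);
        for (; i < upperBound; i += SPECIES.length()) {   // vectorized main loop, one lane-width at a time
            FloatVector va = FloatVector.fromArray(SPECIES, a, i);
            FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
            va.mul(vb).intoArray(result, i);
        }
        for (; i < a.length; i++) {                       // scalar tail for any leftover elements
            result[i] = a[i] * b[i];
        }
    }
}
</pre>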
<br /><br />Martin Stypinski has an interesting <a href="https://medium.com/@Styp/java-18-vector-api-do-we-get-free-speed-up-c4510eda50d2">article</a> that shows adding two floating point vectors together gains very little from the new API but a linear equation like <span style="font-family: courier;">y = mx + c</span> (which has obvious applications to machine learning) can improve performance by an order of magnitude.<br /><br /><u>Project Panama</u><br /><br /><a href="https://openjdk.org/projects/panama/">Project Panama</a> deals with interconnecting the JVM with native code. Oracle's Gary Frost talks about this in his <a href="https://www.youtube.com/watch?v=lbKBu3lTftc&t=1988s">presentation</a> on accessing the GPU from Java. The difficulty he encountered was allocating heap memory and passing it to the GPU. Unfortunately, the garbage collector might reorganise the heap making the pointer to that memory obsolete. With Project Panama, this would not happen as the allocation would be through the JVM but off the heap. <br /><br /><u>Apache Arrow</u><br /><br />Arrow provides an agreed memory format for data so you can "share data across languages and processes." [<a href="https://arrow.apache.org/use_cases/#:~:text=The%20Arrow%20format%20allows%20serializing,performance%20gains%20when%20transferring%20data.">docs</a>] <br /><br />This differs from Google's Protobuf in that "Protobuf is designed to create a common <i>on the wire</i> or <i>disk</i> format for data." [<a href="https://stackoverflow.com/questions/66521194/comparison-of-protobuf-and-arrow">SO</a>] Any data deserialized from Protobuf is handled in the same way that language always handles it.<br /><br />This inter-process ability allows Spark (which runs in the JVM) to use Pandas (which runs in a Python process).</p><p>"Perhaps the single biggest memory management problem with pandas is the requirement that data must be loaded completely into RAM to be processed... Arrow serialization design provides a “data header” which describes the exact locations and sizes of all the memory buffers for all the columns in a table. This means you can memory map huge, <b>bigger-than-RAM</b> datasets and evaluate pandas-style algorithms on them in-place without loading them into memory like you have to with pandas now. <b>You could read 1 megabyte from the middle of a 1 terabyte table</b>, and you only pay the cost of performing those random reads totalling 1 megabyte... Arrow’s memory-mapping capability also allows multiple processes to work with the same large dataset without moving it or copying it in any way. "[10 Things I hate about Pandas, by Pandas author, <a href="https://wesmckinney.com/blog/apache-arrow-pandas-internals/">Wes McKinney</a>]</p><p>"The ability to memory map files allows you to treat file data on disk as if it was in memory. This exploits the virtual memory capabilities of the operating system to dynamically cache file content without committing memory resources to hold a copy of the file." [NIO - Hitchens].</p><p><u>MySQL vs Postgres</u></p><p>There's a great comparison between the two major open source DBs <a href="https://www.uber.com/en-GB/blog/postgres-to-mysql-migration/">here</a> at Uber. Amongst the many insights, there is a mention that MySQL uses a cache "logically similar to the Linux page cache but implemented in userspace... It results in fewer context switches. Data accessed via the InnoDB buffer pool doesn’t require any user/kernel context switches. 
The worst case behavior is the occurrence of a TLB [Translation Lookaside Buffer] miss, which is relatively cheap and can be minimized by using huge pages."</p><p>"On systems that have large amounts of memory and where applications require large blocks of memory, using huge pages reduces the number of entries required in the hardware memory management unit's translation look-aside buffer (TLB). This is beneficial because <b>entries in the TLB are usually a scarce resource</b>... (For example, x86-32 allows 4mb pages as an alternative to 4kb pages)" [The Linux Programming Interface]</p>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-26883154475407479722023-11-09T07:10:00.000-08:002024-01-09T07:59:50.678-08:00Z-order<p>Z-ordering is an optimization technique in big data that allows faster access since similar data lives together. We discuss the algorithm that defines what is similar here. </p><p>Imagine a logical grid where all the values of one column run across the top and all the values from another run down the side. If we were to sort this data, every datum can be placed somewhere in that grid. <br /><br />Now, if the squares of the grid were mapped to files and all the data in each cell were to live in those files, we have made searching much easier as we now know the subset of files in which it may live. We've essentially sorted in not just one dimension but two (although we can do higher).</p><p>This can be especially useful when we want to sort the data but don't know exactly what to sort on - a common conundrum when dealing with <a href="https://apache-iceberg.slack.com/archives/C025PH0G1D4/p1703788925597849?thread_ts=1703580174.116649&cid=C025PH0G1D4">events</a>. Say we have <span style="font-family: courier;">event_time</span>, <span style="font-family: courier;">insert_time</span> and <span style="font-family: courier;">update_time</span>. Which do we choose? We could sort the data three times, each time on one column but this is impractical with huge data sets. Enter z-order.</p><p>Note that the Z-Order really needs 2 or more columns on which to act. Only one column is the degenerate case. "Zorder/Hilbert etc on a single dimension are just a hierarchal sort" [Russell Spitzer on <a href="https://apache-iceberg.slack.com/archives/C025PH0G1D4/p1672931369853579?thread_ts=1672924030.653269&cid=C025PH0G1D4">Slack</a>].</p><p>(<a href="https://www.dremio.com/blog/how-z-ordering-in-apache-iceberg-helps-improve-performance/">This</a> is a good article about z-ordering from the perspective of Apache Iceberg.)</p><p>For an example in Delta Lake, we can see <a href="https://github.com/delta-io/delta/blob/616af05e487a9a4ccffe90a9469cb03674607690/spark/src/test/scala/org/apache/spark/sql/delta/optimize/OptimizeZOrderSuite.scala#L195">this</a> code that creates a data set with columns <span style="font-family: courier;">c1</span>, <span style="font-family: courier;">c2</span> and <span style="font-family: courier;">c3</span> whose values are <span style="font-family: courier;">[x, 99-x, x+50 mod 100]</span> for x in <span style="font-family: courier;">[0, 99]</span>. After z-ordering it, these numbers are split into 4 different files. 
<a href="https://github.com/PhillHenry/MathematicalPlayground/blob/master/graphics/plot_3d_points.py">Generating</a> a graphic illustrates how the data points are distributed over those files:</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0HVJmld3R6VJfNzOrenJaf5t6UezJ2t1G60JwYs0Pwncgn4ZxhghMczAicri09oASwrvzqDLDC6npoRz3m1VfuFC6X708Swxh2AZDlyCPutBHjUkxXewHzbTEe6Hmpj6G9_jIyTVGGSPmuYkETD9y7J-vNeVf9j3nZPP9FwNZgGp6fqedQZF2_wi_Kbcs/s1053/z_ordering.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="875" data-original-width="1053" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0HVJmld3R6VJfNzOrenJaf5t6UezJ2t1G60JwYs0Pwncgn4ZxhghMczAicri09oASwrvzqDLDC6npoRz3m1VfuFC6X708Swxh2AZDlyCPutBHjUkxXewHzbTEe6Hmpj6G9_jIyTVGGSPmuYkETD9y7J-vNeVf9j3nZPP9FwNZgGp6fqedQZF2_wi_Kbcs/s16000/z_ordering.png" /></a></div>The idea behind how we calculate which cell a datum falls into is best described <a href="https://en.wikipedia.org/wiki/Z-order_curve">here</a> on Wikipedia. But, in brief, the binary representation of the data points is interleaved to give a z-value per tuple. In our example, I see a <span style="font-family: courier;">[0, 99, 50]</span> mapped to the byte array <span style="font-family: courier;">[0, 0, 0, 0, 0, 0, 0, 0, 0, 9, -112, 26]</span>.<br /><p>I took a look at the Delta Lake code <a href="https://github.com/delta-io/delta/blob/616af05e487a9a4ccffe90a9469cb03674607690/spark/src/main/scala/org/apache/spark/sql/delta/skipping/MultiDimClustering.scala#L90">here</a> where a Spark Column object is created that wraps a DL <span style="font-family: courier;">InterleaveBits</span> type which in turn is a subclass of Spark's <a href="https://javaagile.blogspot.com/2023/05/spark-catalysts.html">Catalyst</a> <span style="font-family: courier;">Expression</span> type. This executes on Spark's <span style="font-family: courier;">InternalRow</span>, that is, the raw data on the executors.</p><p>The reason the code is doing this is to add a column with which we can repartition the data with the SQL <span style="font-family: courier; font-size: xx-small;">CAST(interleavebits(rangepartitionid(c1), rangepartitionid(c2), rangepartitionid(c3)) AS STRING)</span>. The <span style="font-family: courier; font-size: x-small;">rangepartitionid</span> keyword is part of the Delta Lake machinery.</p><p>Using this z-order value (plus a random key), the DataFrame the Delta code now calls <span style="font-family: courier;">repartitionByRange</span> which samples the data [<a href="https://stackoverflow.com/questions/65809909/spark-what-is-the-difference-between-repartition-and-repartitionbyrange">SO</a>] and breaks it into discrete ranges.</p><p>Given the interleaving of the columns <span style="font-family: courier;">c1</span>, <span style="font-family: courier;">c2</span> and <span style="font-family: courier;">c3</span> their order has minimal impact on the z-value so it's no surprise to see nearby data clustering into the same files, as we can see in the graphic. 
In fact, if you look at the DataFrame during the repartition process:<br /><br /></p><p><span style="font-family: courier; font-size: xx-small;">+---+---+---+-------------------------------------------+</span><br /><span style="font-family: courier; font-size: xx-small;">| c1| c2| c3|c7b6b480-c678-4686-aa99-283988606159-rpKey1|<br />+---+---+---+-------------------------------------------+<br />| 0| 99| 50| \t�|<br />| 1| 98| 51| \t�|<br />| 2| 97| 52| \t�b|<br />| 3| 96| 53| \t�e|<br />| 4| 95| 54| \b��|<br />| 5| 94| 55| \b��|<br />| 6| 93| 56| \b��|<br />| 7| 92| 57| \b��|<br />| 8| 91| 58| \b�|<br />| 9| 90| 59| \b�|<br />| 10| 89| 60| \b�b|<br />| 11| 88| 61| \b�e|<br />| 12| 87| 62| \b��|<br />| 13| 86| 63| \b��|<br />| 14| 85| 64| \f)�|<br />| 15| 84| 65| \f)�|<br />| 16| 83| 66| \f`|<br />| 17| 82| 67| \f`|<br />| 18| 81| 68| \f`b|<br />| 19| 80| 69| \f`e|<br />+---+---+---+-------------------------------------------+</span><br /><br />you can see the slowly changing values by which things are partitioned (column <span style="font-family: courier; font-size: x-small;">c7b6b480-c678-4686-aa99-283988606159-rpKey1</span> - a random name so it doesn't clash with other column names. It's dropped immediately after the call to <span style="font-family: courier;">repartitionByRange</span>)</p>
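<p>To make the bit interleaving concrete, here is a minimal sketch in Scala. It is my own illustration rather than Delta Lake's <span style="font-family: courier;">InterleaveBits</span> expression, and it assumes MSB-first interleaving of 32-bit ints, but it reproduces the byte array quoted above for <span style="font-family: courier;">[0, 99, 50]</span>:</p><pre style="font-family: courier; font-size: xx-small;">
object InterleaveSketch {
  // Interleave the bits of each 32-bit column value, most significant bit first:
  // bit 31 of c1, bit 31 of c2, bit 31 of c3, then bit 30 of c1, and so on.
  def interleave(cols: Array[Int]): Array[Byte] = {
    val out    = new Array[Byte](cols.length * 4)   // 3 ints -> 96 bits -> 12 bytes
    var outBit = 0
    for (bit <- 31 to 0 by -1; col <- cols) {
      val b = (col >>> bit) & 1                      // take one bit from this column
      out(outBit / 8) = (out(outBit / 8) | (b << (7 - outBit % 8))).toByte
      outBit += 1
    }
    out
  }

  def main(args: Array[String]): Unit =
    // prints [0, 0, 0, 0, 0, 0, 0, 0, 0, 9, -112, 26] for the row (0, 99, 50)
    println(interleave(Array(0, 99, 50)).mkString("[", ", ", "]"))
}
</pre><p>Because every output byte mixes bits from all the columns, rows that are close in all columns end up with nearby z-values and so tend to land in the same file.</p>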
Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-79656918795890528652023-10-25T08:44:00.002-07:002023-10-26T04:15:23.330-07:00Java and the GPU<p>Java was always built to abstract away the hardware on which it runs but its approach to GPUs has been somewhat <a href="https://www.youtube.com/watch?v=lbKBu3lTftc&t=1988s">late to the game</a> [Gary Frost, YouTube].<br /><br />There are projects out there that promise to give Java access to the GPU. I looked at <a href="https://github.com/aparapi/aparapi">Aparapi</a> but it appears to be moribund. So, I gravitated to TornadoVM which Frost describes as "state of the art".<br /><br />The trouble is that TornadoVM runs everything in <a href="https://github.com/beehive-lab/docker-tornadovm">a Docker image</a> that has all the shared objects built in. This is fine for a quick demo - this is the result of running on my Quadro T2000:<br /></p><p><span style="font-family: courier; font-size: xx-small;">docker-tornadovm$ ./run_nvidia_openjdk.sh tornado -cp example/target/example-1.0-SNAPSHOT.jar example.MatrixMultiplication<br />Computing MxM of 512x512<br /><span style="white-space: pre;"> </span>CPU Execution: 1.17 GFlops, Total time = 230 ms<br /><span style="white-space: pre;"> </span>GPU Execution: 268.44 GFlops, Total Time = 1 ms<br /><span style="white-space: pre;"> </span>Speedup: 230x</span></p><div>This demonstrates how the GPU runs a nested for-loop doing matrix multiplication much faster than the same code on the CPU. But it runs it all in a Docker container and I need to package a JAR every time I make a change. How do I run it outside the container?<br /><br />To work this out, I opened a shell in the Docker image and saw that the TornadoVM build it uses was built from Git branch <span style="font-family: courier;">d3062accc</span>. So, the first thing was to check out that branch of <a href="https://github.com/beehive-lab/TornadoVM">TornadoVM</a> and build it.</div><p>I built with:<br /><br /><span style="font-family: courier; font-size: xx-small;">mvn clean install -Pgraal-jdk-11-plus<br /></span><br />using the <span style="font-family: courier;">graalvm-ee-java11-22.3.4</span> JDK.</p><p>Note that you'll need <a href="https://github.com/oracle/graal">Graal</a> as the TornadoVM code has dependencies on it. I built my own Graal JDK by following the instructions <a href="https://github.com/oracle/graal/issues/588">here</a> but using a different branch as I couldn't find the download for the <span style="font-family: courier;">graal.version</span> defined in the TornadoVM <span style="font-family: courier;">pom.xml</span>. Note, you'll also need <a href="https://github.com/graalvm/mx"><span style="font-family: courier;">mx</span></a> and a <a href="https://github.com/oracle/graal/issues/4751">bootstrapping JDK</a> that has the right compiler interface (JVMCI), in my case <span style="font-family: courier;"><a href="https://github.com/graalvm/labs-openjdk-21/releases">labsjdk-ce-21.0.1-jvmci-23.1-b19</a></span>.<br /><br />So far, so good. I ran the <span style="font-family: courier;">tornado</span> script which is just a wrapper around a call to the <span style="font-family: courier;">java</span> executable (don't forget to set your <span style="font-family: courier;">JAVA_HOME</span> environment variable to point at the Graal JDK) but it complained it could not see a <span style="font-family: courier;">tornado.backend</span> file. 
<br /><br />Again, a sneaky look at the Docker container indicated that we have to tell it which driver to use. So, I created the file and told it <span style="font-family: courier;">tornado.backends=opencl-backend</span> but then <span style="font-family: courier;">tornado</span> complained it didn't have the OpenCL drivers. Oops. </p><p>You have to build the drivers you want separately, it seems. But if you try to build Tornado drivers without the native OpenCL dev library, you'll see:<br /><br /><span style="font-family: courier; font-size: xx-small;">TornadoVM/tornado-drivers/opencl-jni$ mvn clean install # <b>yes, Maven cmake via cmake-maven-plugin</b><br />....<br />/usr/bin/ld: cannot find -lOpenCL<br />...</span><br /><br />The Docker image saves you from having to install the OpenCL libraries on your machine. To get it working on bare metal, I played it safe and got an old Ubuntu box and <a href="https://askubuntu.com/questions/796770/how-to-install-libopencl-so-on-ubuntu">installed</a> them there. You'll need to install them with:</p><p><span style="font-family: courier; font-size: xx-small;">sudo apt install ocl-icd-opencl-dev</span></p><div>and then run Maven in the <span style="font-family: courier;">opencl*</span> subdirectories. This time, the Maven build completed successfully. <br /><br />However, running <span style="font-family: courier;">tornado</span> in the subsequent <span style="font-family: courier;">dist</span> folder still pukes but with something like:</div><div><br /><div><span style="font-family: courier; font-size: xx-small;">Caused by: uk.ac.manchester.tornado.api.exceptions.TornadoRuntimeException: OpenCL JNI Library not found</span></div><div><span style="white-space: normal;"><span style="font-family: courier; font-size: xx-small;"><span style="white-space: pre;"> </span>at tornado.drivers.opencl@0.15.1/uk.ac.manchester.tornado.drivers.opencl.OpenCL.<clinit>(OpenCL.java:68)</span></span></div><div><span style="white-space: normal;"><span style="font-family: courier; font-size: xx-small;"><span style="white-space: pre;"> </span>... 11 more</span></span></div><div><br /></div><div>Not what I was expecting. I found I needed to:</div><br /></div><div><span style="font-family: courier; font-size: xx-small;">cp ./tornado-drivers/opencl-jni/target/linux-amd64-release/cmake/libtornado-opencl.so $TORNADO_SDK/lib</span></div><div><br /></div><div>Where <span style="font-family: courier; font-size: x-small;">TORNADO_SDK</span> is pointing at the relevant <span style="font-family: courier;">dist</span> folder.<br /><br />Now, finally, you can run on the bare metal:</div><div><br /><div><span style="font-family: courier; font-size: xx-small;">$ tornado -cp target/classes/ example.MatrixMultiplication</span></div><div><span style="font-family: courier; font-size: xx-small;">Computing MxM of 512x512</span></div><div><span style="font-family: courier; font-size: xx-small;"><span style="white-space: pre;"> </span>CPU Execution: 1.21 GFlops, Total time = 222 ms</span></div><div><span style="font-family: courier; font-size: xx-small;"><span style="white-space: pre;"> </span>GPU Execution: 17.90 GFlops, Total Time = 15 ms</span></div><div><span style="font-family: courier; font-size: xx-small;"><span style="white-space: pre;"> </span>Speedup: 14x</span></div><div><br /></div><div>(Results from an old NVIDIA GeForce GTX 650)</div></div><p>Note, you'll also need to run it with the Graal JVM. 
Set both the <span style="font-family: courier;">PATH</span> and <span style="font-family: courier;">JAVA_HOME</span> environment variables to point to it.<br /><br /><u>Where now?</u><br /><br />This is a nice introduction to running Java on the GPU but it's just the start. There are many caveats. Example: what if your Java code throws an Exception? GPUs have no equivalent of exceptions so what happens then? More to come.</p>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-14735987371395401352023-10-12T02:12:00.000-07:002023-10-12T02:12:36.639-07:00Dependency hell<p>In these days of ChatGPT, it's easy to forget that most of the time, a developer isn't actually cutting code at all, but debugging it. This is my own personal hell in getting Spark and Kafka in Docker containers talking to a driver on the host.</p><p>Firstly, I was seeing <span style="font-family: courier; font-size: xx-small;">No TypeTag available</span> when my code was trying to use the Spark Encoders. <a href="https://stackoverflow.com/questions/73836319/scala-spark-encoders-productx-where-x-is-a-case-class-keeps-giving-me-no-ty">This SO answer</a> helped. Basically, my code is Scala 3 and "<span style="font-family: courier; font-size: xx-small;">Encoders.product[classa]</span> is a Scala 2 thing. This method accepts an implicit <span style="font-family: courier; font-size: x-small;">TypeTag</span>. There are no <span style="font-family: courier; font-size: x-small;">TypeTag</span>s in Scala 3". Yikes. This is probably one reason the upgrade path in Spark to Scala 3 is proving difficult. The solution I used was to create an SBT sub-<span style="font-family: courier; font-size: xx-small;">Project</span> that was entirely Scala 2 and from there I called Spark.</p><p>The next problem was seeing my Spark jobs fail with:<br /></p><p><span style="font-family: courier; font-size: xx-small;">Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6) (172.30.0.7 executor 0): java.lang.ClassCastException: <b>cannot assign instance of scala.collection.generic.DefaultSerializationProxy to field org.apache.spark.sql.execution.datasources.v2.DataSourceRDDPartition.inputPartitions</b> of type scala.collection.immutable.Seq in instance of org.apache.spark.sql.execution.datasources.v2.DataSourceRDDPartition</span></p><p>This is a contender for the error message with the greatest misdirection. You think it's a serialization problem but it isn't directly so.</p><p>Although other Spark users have <a href="https://issues.apache.org/jira/browse/SPARK-19938">reported</a> it, Ryan Blue mentions that it isn't really a Spark issue but a Scala <a href="https://github.com/scala/bug/issues/9237">issue</a>.</p><p>Anyway, I tried all sorts of things like changing my JDK (note the <span style="font-family: courier; font-size: xx-small;">sun.*</span> packages have been <a href="https://advancedweb.hu/a-categorized-list-of-all-java-and-jvm-features-since-jdk-8-to-21/">removed in later JDKs</a> so you need to follow the advice in <a href="https://stackoverflow.com/questions/73465937/apache-spark-3-3-0-breaks-on-java-17-with-cannot-access-class-sun-nio-ch-direct">this</a> SO answer). 
I tried creating an uber jar but was thwarted by duplicated dependencies [<a href="https://stackoverflow.com/questions/25144484/sbt-assembly-deduplication-found-error">SO</a>] and <span style="font-family: courier; font-size: xx-small;">Invalid signature file digest</span> errors as some JARs were signed [<a href="https://stackoverflow.com/questions/34855649/invalid-signature-file-digest-for-manifest-main-attributes-exception-while-tryin">SO</a>], which forced me to strip the signatures out [<a href="https://stackoverflow.com/questions/46040071/how-to-remove-specifics-files-from-maven-shaded-plugin">SO</a>], only to fall foul of Kafka's <span style="font-family: courier; font-size: xx-small;">DataSourceRegister</span> file being stripped out too [<a href="https://stackoverflow.com/questions/48011941/why-does-formatkafka-fail-with-failed-to-find-data-source-kafka-even-wi">SO</a>].</p><p>The first step in the right direction came from <a href="https://stackoverflow.com/questions/73473309/cannot-assign-instance-of-scala-collection-immutable-listserializationproxy-to">here</a> and another <a href="https://stackoverflow.com/questions/39953245/how-to-fix-java-lang-classcastexception-cannot-assign-instance-of-scala-collect">SO question</a> where the SparkSession is recommended to be built by adding <span style="font-family: courier; font-size: xx-small;">.config("spark.jars", PATHS)</span> where <span style="font-family: courier; font-size: x-small;">PATHS</span> is a comma-delimited string of the full paths of all the JARs you want to use. Surprisingly, this turned out to include Spark JARs themselves, including in my case <span style="font-family: courier; font-size: xx-small;">spark-sql-kafka-0-10_2.13</span> which oddly does not come as part of the Spark installation. By adding them as <span style="font-family: courier; font-size: x-small;">spark.jars</span>, they are uploaded into the <span style="font-family: courier; font-size: xx-small;">work</span> subdirectory of a Spark node.</p><p>After this, there were just some minor domain name mapping issues to clear up in both the host and container before the whole stack worked without any further errors being puked.</p>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-30096040263075661082023-10-09T01:59:00.003-07:002023-10-09T07:02:10.020-07:00My "What data science can learn from software engineering" presentation<p>Dr Chris Monit and I gave <a href="https://www.youtube.com/watch?v=0gkITkmMx3Y#t=5m13s">this</a> presentation at the London MLOps meetup last week. 
TL;DR: maximize your chances of a successful delivery in data science by adopting best practices that the software industry has established.<br /></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEi0N-HsLWFM3ppjgBnIuAM31WcNh8VfsZJRzJ6ljYcJQqnRTPvEnd9hXjq_0bW371ybXQHkI4BAUQIudN7N_ZufmNibjRxs9fdOA9d8NfhbwEVXjAzoZDSaQRDiBL6FqupipbBca0LlIXMMcSIjl2J3S4CIqyZ7SlXpQtsc6Jq3HLs9QTBYKJok8bHTJx9U" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="592" data-original-width="797" height="297" src="https://blogger.googleusercontent.com/img/a/AVvXsEi0N-HsLWFM3ppjgBnIuAM31WcNh8VfsZJRzJ6ljYcJQqnRTPvEnd9hXjq_0bW371ybXQHkI4BAUQIudN7N_ZufmNibjRxs9fdOA9d8NfhbwEVXjAzoZDSaQRDiBL6FqupipbBca0LlIXMMcSIjl2J3S4CIqyZ7SlXpQtsc6Jq3HLs9QTBYKJok8bHTJx9U=w400-h297" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">"Think of MLOps as the process of automating machine learning using DevOps methodologies" - Practical MLOps (O'Reilly)</td></tr></tbody></table><br /><p></p>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-77166302528097897932023-10-02T08:05:00.003-07:002023-10-02T08:05:52.857-07:00Packaging Python<p>Python build tools are unifying behind a common interface of <a href="https://drivendata.co/blog/python-packaging-2023">pyproject.toml</a>.<br /><a href="https://packaging.python.org/en/latest/tutorials/packaging-projects/">This</a> and <a href="https://dev.to/astrojuanlu/best-resources-on-python-packaging-fea">this</a> are great guides. The gist of the former is that you create a TOML file that conforms to a specification then you can use any build tool to run it. The gist of the latter is the whole Python packaging ecosystem.<br /><br />The salient commands for building and deploying with your TOML file are:<br /><br /><span style="font-family: courier; font-size: xx-small;">python3 -m build<br />python3 -m twine upload --repository pypi dist/*</span><br /><br />Note, you want to clean your dist directory first.</p><p><u>The Snag</u></p><p>The idea of using any Python build tool is not quite there yet. Poetry only implements a subset of the specification. Also, the specification has a leaky abstraction. On <a href="https://discord.com/channels/1024276579444072499/1024284272170905641/1143177045858328626">Discord</a>, Prof. Nick Radcliffe explains that the promise of using "any" lead him to naively use <span style="font-family: courier;">setuptools</span>.</p><p></p><blockquote><p>Nick Radcliffe — 08/21/2023 2:37 PM</p><p>Also, in case anyone is interested (related to packaging, above) I'm currently in the process of packaging a fairly large Python codebase using new-style packaging (<span style="font-family: courier;">pyproject.toml</span> rather than <span style="font-family: courier;">setup.py</span>). It wasn't quite my first use of it, but this project is much more complex. Initially, I chose <span style="font-family: courier;">setuptools</span> as the build backend, since (a) it didn't seem like it should matter much and (b) I didn't think I needed anything special. That was a big mistake for me: <b>it turns out the <span style="font-family: courier;">setuptools</span> back-end ignores almost everything except Python code in building your package</b>. 
Whereas my package (which has over 10k files) also have about 1,000 non-python files (everything from .txt and .json to shape files, CSV files, and HTML and markdown and all sorts). Some of these are needed for testing (which for some reason some people think don't need to be distributed...as if people shouldn't care about whether the installed software works in situ, rather than just on the developer's machine in the CI system), but others are needed just in the ordinary course of using the software. <span style="font-family: courier;">setuptools</span> has a way to let you include extra stuff, but it's very manual and would be very error-prone for me. Anyway, the TL;DR is that <b>I switched to Flit as the backend and everything "just worked"</b>. Not saying Flit will work better for you; but it sure as hell worked better for me!</p><p>Also, the reason I chose flit was that the third bullet in "<a href="https://flit.pypa.io/en/stable/rationale.html">Why use Flit?</a>" is "Data files within a package directory are automatically included. Missing data files has been a common packaging mistake with other tools."</p><p>It also says: "The version number is taken from your package’s version attribute, so that always matches the version that tools like pip see." Which also seems extremely sane (and probably I don't need to do the automatic updating of my pyproject.toml to do that.</p></blockquote><p></p><div><u>Success has many parents...</u></div><div><br /></div><div>... but it appears that PyPI packages have only one. Although the <span style="font-family: courier;">authors</span> tag can take a list, adding multiple entries is ignored. The reason is that it's best practise to use a mailing list (see <a href="https://bugs.python.org/msg93284">here</a>).</div><div><br /></div><div>And so my package to facilitate the creation of synthetic data now <a href="https://pypi.org/project/pysynic/">lives in PyPI</a> much like my Java code is deployed to <a href="https://mvnrepository.com/search?q=uk.co.odinconsultants&d=uk.co.odinconsultants">mvnrepository</a>.</div><div><br /></div>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-30807409316491120392023-09-25T02:06:00.001-07:002023-09-25T02:08:50.059-07:00Spark, Kafka and Docker<p>I want to run a Spark Structured Streaming application that consumes from a Kafka cluster all within Docker. I've finally got it working [messy code <a href="https://github.com/PhillHenry/StreamingPlayground/blob/main/modules/core/src/main/scala/uk/co/odinconsultants/sss/SparkStructuredStreamingMain.scala">here</a> in my GitHub], but it was not without its own pain.</p><p>The biggest problem is getting all the components talking to each other. First, you need a <a href="https://docs.docker.com/network/drivers/bridge/">bridge network</a>. "In terms of Docker, a bridge network uses a software bridge which allows containers connected to the same bridge network to communicate, while providing isolation from containers which are not connected to that bridge network." [docs]. Think of it as giving your containers their own namespace.</p><p>Secondly, the Spark worker needs to connect to Kafka, the Spark master and the Spark driver. The first two are just a matter of mapping the Spark master and Kafka containers in the worker. 
What's harder is getting the worker to talk to the driver that may be running on the computer that hosts Docker.</p><p>One sign you've got it wrong is if you see "<span style="font-family: courier; font-size: xx-small;">Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources</span>" [<a href="https://stackoverflow.com/questions/38118572/initial-job-has-not-accepted-any-resources-check-your-cluster-ui-to-ensure-that">SO</a>] in your driver logs. This message is a little ambiguous as it may have nothing to do with resources but rather with connectivity. </p><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEgx6hM95xx3SelgVoFfPrvnVpWX1Gpl-wlysTrFyxbfssh24bYoi1kHBY4A-dFLHc4ia_s-oU9tlQUmqTN_zE5Vz_bWX6ukGMMUMWc0-Fe0SuKjtSypjnUEW8e8NVQTekpHXJNHV4Teney2km0MS8PVllGwKnxjxgzWBv0kyOpP_otaRgAcw_aADamO54E5" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="299" data-original-width="1896" height="101" src="https://blogger.googleusercontent.com/img/a/AVvXsEgx6hM95xx3SelgVoFfPrvnVpWX1Gpl-wlysTrFyxbfssh24bYoi1kHBY4A-dFLHc4ia_s-oU9tlQUmqTN_zE5Vz_bWX6ukGMMUMWc0-Fe0SuKjtSypjnUEW8e8NVQTekpHXJNHV4Teney2km0MS8PVllGwKnxjxgzWBv0kyOpP_otaRgAcw_aADamO54E5=w640-h101" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Resources seem to be fine</td></tr></tbody></table><p></p><p>To solve it, you need the driver to define <span style="font-family: courier; font-size: xx-small;">spark.driver.host</span> and <span style="font-family: courier; font-size: xx-small;">spark.driver.port</span>. For the host, we need it to be the magic address of <span style="font-family: courier; font-size: xx-small;">172.17.0.1</span>. This is the default "IP address of the gateway between the Docker host and the bridge network" [<a href="https://docs.docker.com/network/network-tutorial-standalone/">docs</a>]. The port is arbitrary.</p><p>[Aside: it's also worth ensuring that all the components are running the exact same version of Spark. I saw a rare error <span style="font-family: courier; font-size: xx-small;">ERROR Inbox: Ignoring error java.lang.AssertionError: assertion failed: CPUs per task should be > 0</span> and the only thing Google produced was <a href="https://github.com/bitnami/charts/issues/14391">this</a> Bitnami ticket. Ensuring all versions were the same made it go away.]</p><p>What's more, the worker needs these in its config. You can pass it the host and port with something like <span style="font-family: courier; font-size: xx-small;">SPARK_WORKER_OPTS="-Dspark.driver.host=172.17.0.1 -Dspark.driver.port=SPARK_DRIVER_PORT"</span> in its startup script.</p><p>But there is one last gotcha. If you still can't get things to work, you might want to log in to your worker container and run <span style="font-family: courier; font-size: xx-small;">netstat</span>. If you see the connection to the driver in a state of <span style="font-family: courier; font-size: xx-small;"><a href="https://javaagile.blogspot.com/2012/08/jconsole-and-firewalls.html">SYN_SENT</a></span>, your firewall on the host is probably blocking the connection from the container.</p><p>Annoyingly, you probably won't see any error messages being puked from the Driver. 
It will just hang somewhere near <span style="font-family: courier; font-size: xx-small;">org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:929)</span>. I only started seeing error messages when I aligned all versions of Spark (see above) and it read: <span style="font-family: courier; font-size: xx-small;">java.io.IOException: Connecting to /172.17.0.1:36909 timed out (120000 ms) </span></p><p>Looking in that Docker container showed:<br /></p><p><span style="font-family: courier; font-size: xx-small;">bash-5.0# netstat -nap<br />Active Internet connections (servers and established)<br />Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name <br />...<br />tcp 0 1 192.168.192.7:53040 172.17.0.1:<b>36909</b> <b>SYN_SENT</b> 1109/java</span></p><div>and on the host machine where my process is 29006:<br /><br /><div><span style="font-family: courier; font-size: xx-small;">(base) henryp@adele:~$ netstat -nap | grep 36909</span></div><div><span style="font-family: courier; font-size: xx-small;">tcp6 0 0 172.17.0.1:<b>36909</b> :::* LISTEN 29006/java </span> </div></div><p>Aha, that looks like the problem. It turns out that I have to open the firewall for the block manager too and set a static port for it on the Driver with <span style="font-family: courier; font-size: xx-small;"><a href="https://stackoverflow.com/questions/74206612/how-to-handle-dynamic-port-in-apache-spark-hadoop-yarn">spark.driver.blockManager.port</a></span>.</p><p>Finally, you should be able to have a Spark master and worker plus Kafka instances all running within Docker along with the driver running on the host using your favourite IDE.</p>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-30944503944875988932023-08-30T04:10:00.000-07:002023-08-30T04:10:10.207-07:00Can we apply ML to logging?<p>Kibana is a Typescript/Javascript product to create visuals of logs. <a href="https://github.com/opensearch-project/OpenSearch">OpenSearch</a>'s <a href="https://github.com/opensearch-project/opensearch-dashboards">Dashboards</a> is the Apache-licensed fork of this. Kibana is great when you know what you are looking for. But what if you don't?</p><p><u>Example</u></p><p>I have a small Kafka cluster of three nodes using the <a href="http://javaagile.blogspot.com/2023/07/kafka-quorum-in-docker.html">Raft</a> protocol. I send messages then check a consumer has read all the messages. This integration test passes every time. There are no ERRORs. However, every so often, this test takes over 2 minutes when it should take about 20 seconds.</p><p>The number of lines on a good run is 4.5k and on the bad run about 20k. Twenty thousand lines is a lot to go through when you don't know what you're looking for.</p><p>I slightly adapted the code <a href="http://ethen8181.github.io/machine-learning/recsys/content_based/lsh_text.html">here</a> to turn my logs into <a href="https://javaagile.blogspot.com/2017/02/tweaking-tf-idf.html">TF-IDF</a> vectors and used <a href="https://javaagile.blogspot.com/2016/02/locality-sensitive-hashing-in-tweets.html">Locality Sensitive Hashing</a> to map them to a lower-dimensional space. Now, we can visualise what's going on. 
</p><p>The good run looks like this:</p><p><br /></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEgIEsu-09rV09MdN_R_M4AsEdj4n3FHeRzvoQdr37fdSi5Ck9Wm6dm_FhrUJm_BG_GDOlTu_odGjltPsdxIz255UNQMagYv63gx4b_krpESRjgqlaRveSmhLm93K0EBtKW3JWWytz-cViNdVjVqpe40qaNMNgSXIgQXoB93tuZJYMm3cokAQAr31MJZC1kk" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="544" data-original-width="1375" height="254" src="https://blogger.googleusercontent.com/img/a/AVvXsEgIEsu-09rV09MdN_R_M4AsEdj4n3FHeRzvoQdr37fdSi5Ck9Wm6dm_FhrUJm_BG_GDOlTu_odGjltPsdxIz255UNQMagYv63gx4b_krpESRjgqlaRveSmhLm93K0EBtKW3JWWytz-cViNdVjVqpe40qaNMNgSXIgQXoB93tuZJYMm3cokAQAr31MJZC1kk=w640-h254" width="640" /></a></div><br /><br />Note that there are two dominant lines that map to:<br /><p></p><p><span style="font-family: courier; font-size: xx-small;">[2023-07-04 14:13:10,089] INFO [TransactionCoordinator id=2] Node 1 disconnected. (org.apache.kafka.clients.NetworkClient)<br />[2023-07-04 14:13:10,089] WARN [TransactionCoordinator id=2] Connection to node 1 (localhost/127.0.0.1:9091) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)</span></p><div>repeated over and over for about 10 seconds.</div><p>The bad run looks like this:<br /><br /></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEiJzJ3JG3ifu0qZVDi7HINpkk0jqdsyqjXOVEg2rIyKI8zXlRqIEuX11V51ghseqKu5A236w3AKBfWpHTkPJ8NIUKScUgKs-MrZ0IUjzhG6W2FAG-dv3tKuO7fluvUrEOAVoWOL1uqovCo-9UJM2ZFxh4sihYfsSEBe_SCymqe4gs9-j1AXjKPpK36YmVZ0" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="549" data-original-width="1371" height="256" src="https://blogger.googleusercontent.com/img/a/AVvXsEiJzJ3JG3ifu0qZVDi7HINpkk0jqdsyqjXOVEg2rIyKI8zXlRqIEuX11V51ghseqKu5A236w3AKBfWpHTkPJ8NIUKScUgKs-MrZ0IUjzhG6W2FAG-dv3tKuO7fluvUrEOAVoWOL1uqovCo-9UJM2ZFxh4sihYfsSEBe_SCymqe4gs9-j1AXjKPpK36YmVZ0=w640-h256" width="640" /></a></div><br /><br />Here, the dominant lines in the diagram that are from:<p></p><p><span style="font-family: courier; font-size: xx-small;">[2023-07-04 14:16:21,755] WARN [TransactionCoordinator id=2] Connection to node 1 (localhost/127.0.0.1:9091) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)<br />[2023-07-04 14:16:21,805] INFO [TransactionCoordinator id=2] Node 1 disconnected. (org.apache.kafka.clients.NetworkClient)<br />[2023-07-04 14:16:21,805] WARN [TransactionCoordinator id=2] Connection to node 3 (localhost/127.0.0.1:9093) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)<br />[2023-07-04 14:16:21,805] INFO [TransactionCoordinator id=2] Node 3 disconnected. (org.apache.kafka.clients.NetworkClient)</span></p><div>again being repeated but this time, it lasts for about 2 minutes.</div><div><br /></div><div>[The code in Ethen Lui's GitHub is really quite clever. Rather than using MinHashing, he's projecting the feature vectors against some randomly generated vectors and making a bit map from it. This can be turned into a single integer which represents the feature's bucket. 
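<br /><br />As a minimal sketch of that idea (my own illustration in Scala rather than Ethen Lui's Python, and assuming dense TF-IDF vectors), the sign of each random projection contributes one bit to the bucket id:<br /><pre style="font-family: courier; font-size: xx-small;">
import scala.util.Random

object LshSketch {
  // k random Gaussian vectors to project the TF-IDF vectors onto
  def randomVectors(k: Int, dim: Int, seed: Long = 42L): Array[Array[Double]] = {
    val rng = new Random(seed)
    Array.fill(k, dim)(rng.nextGaussian())
  }

  // Pack the sign of each projection into one integer: the bucket id
  def bucket(vector: Array[Double], projections: Array[Array[Double]]): Int =
    projections.zipWithIndex.foldLeft(0) { case (acc, (r, i)) =>
      val dot = vector.zip(r).map { case (a, b) => a * b }.sum  // project onto the random vector
      if (dot > 0) acc | (1 << i) else acc                      // keep only the sign as a bit
    }

  def main(args: Array[String]): Unit = {
    val docs = Array(
      Array(0.1, 0.0, 0.9),  // toy TF-IDF vectors; real ones are much wider
      Array(0.2, 0.0, 0.8),
      Array(0.9, 0.1, 0.0)
    )
    val projs = randomVectors(k = 8, dim = 3)
    docs.foreach(d => println(bucket(d, projs)))  // similar vectors tend to share a bucket
  }
}
</pre>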
Note that the number of vectors does not really change the dimensionality of the space but it does change how consistent different runs are - more vectors leads to greater repeatability]</div><div><br /></div><div>Still at something of a loss, I checked out Bitnami's Kafka instances (<a href="https://github.com/bitnami/containers.git">here</a>), changed the logging in <span style="font-family: courier; font-size: xx-small;">bitnami/kafka/3.5/debian-11/rootfs/opt/bitnami/scripts/kafka/postunpack.sh</span> by adding the line:</div><div><br /></div><div><div><span style="font-family: courier; font-size: x-small;">replace_in_file "${KAFKA_CONF_DIR}/log4j.properties" "INFO" "DEBUG"</span></div></div><div><br /></div><div>and built the Docker image again. Now it gives me DEBUG statements.</div><div><br /></div><div><u>Fix</u></div><div><br /></div><div>The problem of non-determinism is still foxing me but the solution became clear with all these mentions of <span style="font-family: courier; font-size: x-small;">localhost</span>. We need the client to communicate with the cluster on <span style="font-family: courier; font-size: x-small;">localhost</span> because the client is unaware that the Kafka instances are hosted in Docker. However, each broker does need to know it's talking to another Docker container as the ports of its peers are not available within its own sandbox. </div><div><br /></div><div>The solution was to use slightly different values for the listeners as the advertised listeners (<span style="font-family: courier; font-size: xx-small;">KAFKA_CFG_LISTENERS</span> vs <span style="font-family: courier; font-size: xx-small;">KAFKA_CFG_ADVERTISED_LISTENERS</span>. Note that Bitnami expects environment variables prepended with <span style="font-family: courier; font-size: x-small;">KAFKA_CFG_</span> and periods as underscores before it converts them into a Kafka-friendly <span style="font-family: courier; font-size: xx-small;">server.properties</span> file). </div><div><br /></div><div>The listeners were of the form <span style="font-family: courier;">OUTSIDE://:9111</span> while the <i>advertised</i> listeners were of the form <span style="font-family: courier;">OUTSIDE://localhost:9111</span>. The label <span style="font-family: courier;">OUTSIDE</span> apparently is arbitrary. It's just used as a reference, say in <span style="font-family: courier; font-size: xx-small;">listener.security.protocol</span> (in Kafka-speak; munge with the necessary Bitnami mappings to make it appear in <span style="font-family: courier; font-size: x-small;">server.properties</span>) where you'll see something like <span style="font-family: courier; font-size: xx-small;"><b>OUTSIDE</b>:PLAINTEXT</span>. </div><div><br /></div><div><u>Conclusion</u></div><div><br />Although I've fixed the Kafka issue I was facing, applying ML to the logs was only a partial help. I still need to understand the Kafka Raft code better before it can truly be of use.</div><div><br /></div>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-82959551389541565372023-07-06T03:07:00.005-07:002023-07-07T02:07:48.662-07:00Kafka Raft in Docker<p>These days, you don't need Zookeeper to run a Kafka cluster. 
Instead, when correctly configured, Kafka uses the <a href="https://softwaremill.com/implementing-raft-using-project-loom/">Raft</a> algorithm (where "the nodes trust the elected leader"[<a href="https://en.m.wikipedia.org/wiki/Raft_(algorithm)">Wikipedia</a>]) to coordinate itself.</p><p>I started to follow Gunnar Morling's <a href="https://www.morling.dev/blog/exploring-zookeeper-less-kafka/">blog</a> but it seems his version of the Kafka containers has not been updated so I used <a href="https://hub.docker.com/r/bitnami/kafka/">Bitnami's</a>. However, configuring them to run a Raft cluster proved difficult.</p><p>I want to programmatically create the cluster rather than use docker-compose as I want greater control over it. So, I wrote <a href="https://github.com/PhillHenry/KafkaPlayground/blob/main/modules/core/src/main/scala/uk/co/odinconsultants/kafka/KafkaDemoMain.scala">this</a> code that talks to Docker via its API using a Java library. </p><p>Firstly, the Kafka instances couldn't see each other. <br /><br />Diagnosing the containers proved difficult as I could not install <a href="https://javaagile.blogspot.com/2022/05/the-cli-for-busy-data-scientists-and.html">my favourite Linux tools</a>. When I tried, I was told directory <span style="font-family: courier; font-size: x-small;">/var/lib/apt/lists/partial</span> is missing. This seems to be deliberate as the <a href="https://github.com/bitnami/containers/blob/main/bitnami/kafka/3.5/debian-11/Dockerfile">Dockerfile</a> explicitly deletes it, <a href="https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#run">to keep images slim</a>. So, I took out that line and added:</p><p><span style="font-family: courier; font-size: xx-small;">RUN apt-get update && apt-get upgrade -y && \<br /> apt-get clean && apt-get update && \<br /> apt-get install -y net-tools && \<br /> apt-get install -y iputils-ping && \<br /> apt-get install -y procps && \<br /> apt-get install -y lsof</span></p><div>then rebuilt the containers. [Aside: use <span style="font-family: courier; font-size: x-small;">ps <b>-ax</b></span> to see all the processes in these containers. I was stumped for a while not seeing the Java process that I knew was running].</div><div><br /></div><div>Using these Linux tools, I could see the containers could not even <span style="font-family: courier; font-size: x-small;">ping</span> each other. Oops, I needed to create a <a href="https://stackoverflow.com/questions/59036198/how-to-connect-to-custom-network-when-building-image-container-in-docker-java-ap">Docker network</a> [SO] and add it to the containers. Now, their logs show that the Kafka containers are at least starting and talking to each other. </div><div><br /></div><div>However, the client running on the host machine was puking lots of messages like "<span style="font-family: courier; font-size: xx-small;">Cancelled in-flight API_VERSIONS request with correlation id 1 due to node -1 being disconnected</span>". First, I <a href="https://stackoverflow.com/questions/43217248/exception-on-apache-kafka-error-getting-request-for-apikey-3-and-apiversion">checked</a> [SO] that the Kafka client library and container were both version 3. But the consensus on the internet appears to be that this error is due to a connection failure. </div><div><br /></div><div>Using <span style="font-family: courier; font-size: x-small;">netstat</span> on the host showed that the host port was indeed open. 
But this seemed to be due to Docker opening the port to map it to its container but the container not <span style="font-family: courier; font-size: x-small;">LISTEN</span>ing on that port. It appears you can tell Kafka on which port to listen with an environment variable that looks like: <br /><br /></div><div><span style="font-family: courier; font-size: xx-small;">KAFKA_CFG_LISTENERS=PLAINTEXT://:$hostPort,CONTROLLER://:$controllerPort</span></div><div><br /></div><div>where <span style="font-family: courier; font-size: x-small;">hostPort</span> is what you want Docker to map and <span style="font-family: courier; font-size: x-small;">controllerPort</span> corresponds to what is in the <span style="font-family: courier; font-size: x-small;">KAFKA_CFG_CONTROLLER_QUORUM_VOTERS</span> environment variable.</div><div><br /></div><div>The next problem was that, when my client connected, it could not see the machine called <span style="font-family: courier;">kafka2</span>. What's happening here is that having connected to the bootstrap, the client is asked to contact another machine, in this case something called <span style="font-family: courier;">kafka2</span>. </div><div><br /></div><div>Now, the JVM running on the host knows nothing about a network that is internal to Docker. To solve this, you could have Docker use the <span style="font-family: courier;"><a href="https://docs.docker.com/network/drivers/host/">host</a></span> network (which means that everything running on the machine can see everything else - fine for testing but a security nightmare). You could subvert the JVM's DNS mappings (rather than faffing around with a <a href="https://stackoverflow.com/questions/37242217/access-docker-container-from-host-using-containers-name">DNS proxy</a>) using <a href="https://burningwave.github.io/tools/#configuring-host-resolution">BurningWave</a> or Java 18's <a href="https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/net/spi/InetAddressResolverProvider.html"><span style="font-family: courier; font-size: x-small;">InetAddressResolverProvider</span></a>. But perhaps the simplest way is configuring Kafka itself to advertise itself as <span style="font-family: courier;"><a href="https://www.confluent.io/blog/kafka-listeners-explained/">localhost</a></span> [Confluent] using the <span style="font-family: courier;">KAFKA_CFG_ADVERTISED_LISTENERS</span> environment variable.</div><div><br /></div><div>And that was it: a Kafka cluster running on my laptop that was reading and writing messages using the Raft algorithm. There are still a few loose ends: why, on some runs, a node drops out of the cluster non-deterministically even though the functionality was correct as far as the client was concerned. I'll solve that another day.</div><div><br /></div>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-3110598642201156322023-06-25T01:27:00.004-07:002023-09-06T05:18:05.442-07:00Diagnosing K8s Installations<p>Well, this was much harder than I thought. I tried to update Minikube and my local Kubernetes installation.</p><p>First, I installed everything I thought I needed:<br /><br /><span style="font-family: courier; font-size: xx-small;">sudo apt-get install -y kubelet kubeadm kubectl kubernetes-cni</span><br /><br />and then downloaded and installed Minikube per the instructions. 
Unfortunately, when I did, I'd see it complain that possibly "<span style="font-family: courier; font-size: xx-small;">the kubelet is not running</span>".<br /><br />Running:</p><p><span style="font-family: courier; font-size: xx-small;">systemctl status kubelet</span><br /><br />showed that kubelet was exiting with error code 1.</p><p>Running:<br /><br /><span style="font-family: courier; font-size: xx-small;">journalctl -fu kubelet</span></p><p>showed it puking stack traces but this line is the salient one:<br /><br /><span style="font-family: courier; font-size: xx-small;">... failed to run Kubelet: running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false...</span></p><p>But where do I put it? As it happened, I had to add <span style="font-family: courier; font-size: xx-small;">--fail-swap-on=false</span> to the <span style="font-family: courier; font-size: xx-small;"><a href="https://askubuntu.com/questions/1077778/how-do-i-pass-flags-when-starting-a-systemd-service">ExecStart</a></span> line in <span style="font-family: courier; font-size: xx-small;">/etc/systemd/system/kubelet.service.d/10-kubeadm.conf</span> (I found this by <span style="font-family: courier; font-size: xx-small;">grep</span>ping <span style="font-family: courier; font-size: xx-small;">ExecStart</span> in <span style="font-family: courier; font-size: xx-small;">/etc/</span>). You then run:<br /><br /><span style="font-family: courier; font-size: xx-small;">sudo systemctl daemon-reload</span><br /><br />to have changes recognised. Then, it's a matter of configuring Kubernetes system-wide:</p><p><span style="font-family: courier; font-size: xx-small;">sudo kubeadm init --ignore-preflight-errors=Swap</span><br /><br />(I needed to ignore the fact that my system has <span style="font-family: courier; font-size: x-small;">Swap</span> space as I'm only working on proof of concepts, not running a K8s cluster in production. Using or not using swap space is nuanced - see <a href="https://chrisdown.name/2018/01/02/in-defence-of-swap.html?utm_source=pocket_saves">here</a> for more information. Basically, what is best depends on the situation. TL;DR: it can speed things up but also delay the inevitable death of a pathological system). </p><p>You can run <span style="font-family: courier; font-size: xx-small;">kubeadm reset</span> if you foul things up.</p><p>Also, I had to <span style="font-family: courier; font-size: xx-small;">rm -rf <a href="https://stackoverflow.com/questions/72334044/the-connection-to-the-server-localhost8080-was-refused-did-you-specify-the-r">$HOME/.kube</a></span> and <span style="font-family: courier; font-size: xx-small;">$HOME/.minikube</span> since <span style="font-family: courier; font-size: xx-small;">kubectl config view</span> showed me a profile that was literally years out of date. The <span style="font-family: courier; font-size: x-small;">.kube</span> config can be regenerated with:<br /></p><p><span style="font-family: courier; font-size: xx-small;">mkdir -p $HOME/.kube<br />sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config<br />sudo chown $(id -u):$(id -g) $HOME/.kube/config</span><br /><br />(lines that <span style="font-family: courier; font-size: x-small;">kubeadm</span> kindly told me to run). Now, running <span style="font-family: courier; font-size: xx-small;">kubectl get all</span> gives me a sensible output.<br /><br />The story doesn't quite end there. 
After a few hibernations of my laptop, minikube was failing to start again with (apparently) certificate errors. This <a href="https://github.com/kubernetes/minikube/issues/13638">issue</a> helped me: I executed <span style="font-family: courier; font-size: xx-small;">minikube delete && minikube start</span> and all was well with the World once more.</p>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-87309978016736124952023-06-25T01:04:00.000-07:002023-06-25T01:04:12.615-07:00I've got a fever, and the only prescription is MOR COW (bell)<p>Apache Iceberg deals with updates in two different ways:<br /></p><ul style="text-align: left;"><li>Merge on read</li><li>Copy on write</li></ul><p></p><p>What are these strategies? To illustrate, I've some BDD code that writes and updates 20 rows of data using both <a href="https://github.com/PhillHenry/IcebergPlayground/blob/main/src/test/scala/uk/co/odinconsultants/iceberg/MergeOnReadSpec.scala">MOR</a> and <a href="https://github.com/PhillHenry/IcebergPlayground/blob/main/src/test/scala/uk/co/odinconsultants/iceberg/CopyOnWriteSpec.scala">COW</a>. (There's an interesting <a href="https://www.dremio.com/blog/row-level-changes-on-the-lakehouse-copy-on-write-vs-merge-on-read-in-apache-iceberg/">article</a> at Dremio about this topic but some of the code is Dremio-specific).</p><p>Simply put, <i>copy-on-write</i> replaces the entire parquet file if a single row is updated. </p><p>In <i>merge-on-read</i>, a file containing only the updated data (and a file saying which row was affected) are written. It is the reader that needs to reconcile the data.</p><p>The two strategies are for different audiences. COW makes writes slow and reads fast. MOR makes reads slow and writes fast.</p><p>Focussing on the more complex of the two strategies, MOR, we see that it creates <i>four</i> new files compared to COW's two. This maps to <i>two</i> different parquet files (as each parquet file in both strategies has a <span style="font-family: courier;">.crc</span> file that holds its metadata). One file contains the updated data, the other a <i>textual</i> <i>reference</i> to the original file containing the original data. No files are deleted and the original parquet file remains the same. No data is redundant.</p><p>Iceberg has means of storing the delete data other than the position in the file. It can also check equality. However, this appears to be <a href="https://github.com/apache/iceberg/issues/6196">not supported by Spark</a> at the moment.</p><p>(BTW, the title of this post refers to <a href="https://www.youtube.com/watch?v=tPh7OZew5oo">this</a> SNL sketch).</p>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-54628907400469951062023-06-19T02:13:00.000-07:002023-06-19T02:13:00.880-07:00Data Contracts<p>What are they? "Data Contracts are first and foremost a cultural change toward data-centric collaboration" [Chad Sanderson's <a href="https://dataproducts.substack.com/p/the-rise-of-data-contracts">blog</a> and <a href="https://dataproducts.substack.com/p/an-engineers-guide-to-data-contracts?utm_source=substack&utm_campaign=post_embed&utm_medium=web">here</a>]</p><p><u>Examples of why they're necessary</u></p><p>A company wants to find the average age of a consumer for a particular product. They find the age is 42. 
They are somewhat surprised by this as it's older than they expected so they check their workings - add all the ages and divide by the number of customers who bought it. After confirming the maths, the value is indeed 42 and they report it to their boss. Unfortunately, the mean age was artificially inflated because a subset of customers had an age of '999', as the system that captured that data used it as a placeholder for 'unknown'.</p><p>The next example actually happened. We were measuring average length of stay (LoS) in hospitals. When sampling the data, everything looked fine. But out of the millions of patients, a very small number (~30) had a discharge date of 1/1/1900. Clearly, the system that captured that data used this value as a token for 'unknown'. This erroneously reduced the overall LoS. The data bug was only caught when, drilling down into individual hospitals, some average LoS figures were negative. Until then, we were merrily reporting the wrong national figure.</p><p><u>Is it a purely cultural problem?</u></p><p>The trouble with cultural solutions is that they depend on unreliable units called "humans". For instance, a friend of mine was the victim of an upstream, breaking change originating in the Zurich office. When he contacted the team, they were unaware of him, of his London team and of the fact they were consumers of this data.</p><p>I asked on the <a href="https://lists.apache.org/thread/z1oyoqdl40wcxwbkmxpkp3hlzfrp661g">Apache dev mailing lists</a> if we could implement a more robust, technical solution for Spark. Watch this space for developments.</p><p><u>Possible solutions</u></p><p>Andrew Jones (who is writing a book on data contracts) <a href="https://medium.com/gocardless-tech/implementing-data-contracts-at-gocardless-3b5c49074d13">uses</a> JSON Schema to validate his data. "JSON Schema is a powerful tool for validating the structure of JSON data" [JSON Schema <a href="https://json-schema.org/understanding-json-schema/">docs</a>]. </p><p><a href="https://medium.com/@teabot">Elliot West</a> of Dremio (see the mailing list traffic) also favours JSON Schema. However, because JSON has only a few data types (strings, arrays, numerics etc) it's not rich enough to enforce constraints like "date X must be before date Y".<br /><u><br />Implementations</u></p><p>This is a new area of development but Databricks' Delta Live Tables (<a href="https://www.databricks.com/product/delta-live-tables">DLT</a>) claims it can “prevent bad data from flowing into tables through validation and integrity checks and avoid data quality errors with predefined error policies (fail, drop, alert or quarantine data).”
<br />Unfortunately, it seems to be Python-only: “Can I use Scala or Java libraries in a Delta Live Tables pipeline? No, Delta Live Tables supports only SQL and Python. You cannot use JVM libraries in a pipeline. Installing JVM libraries will cause unpredictable behavior, and may break with future Delta Live Tables releases.” [<a href="https://docs.databricks.com/delta-live-tables/external-dependencies.html">docs</a>]</p>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-38227768068973577892023-06-09T02:17:00.002-07:002024-01-03T03:56:34.685-08:00Modern DevOps Tools<p>Some tools that I've had too little time to investigate thoroughly.<br /><br /><u>TestContainers</u><br />The free and open source <a href="https://www.testcontainers.org/features/configuration/">TestContainers</a> offers huge convenience to developers. For instance, you can fire up a <i>very</i> lightweight Postgres container in just a second or two. This ZIO SQL test (<a href="https://github.com/zio/zio-sql/blob/1dc612e2efbdad85526933875847242406df64e8/postgres/src/test/scala/zio/sql/postgresql/AgregationSpec.scala">AgregationSpec</a>) ran in just 3.85s on my laptop. In that time, it started a Docker container, populated the Postgres database in it with test data, ran some Scala code against it then tore down the container. The container can last as long as the JVM so all your tests can use it before it detects the JVM is exiting, whereupon it will kill the container.<br /><br /><u>MinIO</u><br />If you need to run S3 API-compatible storage locally, you can try <a href="https://github.com/minio/minio">MinIO</a>. It's written in Go and open source and allows you to have a local Docker container emulating Amazon storage. <br /><br /><u>DuckDB</u><br />This <a href="https://github.com/duckdb/duckdb">open source</a>, C++ application allows you to run SQL against Parquet files without having to fire up a whole platform. 
You can even run <a href="https://www.reddit.com/r/dataengineering/comments/10wd2e0/comment/j7n52p2/?utm_source=share&utm_medium=web2x&context=3">DBeaver against it</a>.</p><p><u>Crossplane</u><br /><a href="https://github.com/crossplane/crossplane">Crossplane</a> is an open source Go project that "connects your Kubernetes cluster to external, non-Kubernetes resources, and allows platform teams to build custom Kubernetes APIs to consume those resources." [<a href="https://docs.crossplane.io/v1.11/getting-started/introduction/">docs</a>]</p><p><u>Scala Native</u><br />You can now convert Scala code to standalone executable binaries using <a href="https://www.baeldung.com/scala/native-apps-scala-native">Scala Native</a> [baeldung]. It currently only works with single-threaded applications. The output can be converted to WebAssembly...</p><p><u>WebAssembly</u><br /><a href="https://en.wikipedia.org/wiki/WebAssembly">Wikipedia</a> describes WebAssembly as "a portable binary-code format and a corresponding text format for executable programs ... for facilitating interactions between such programs and their host environment." It is an "open standard and aims to support any language on any operating system".</p><p><u>Tapir</u><br />Is a type-safe, Scala library that documents HTTP endpoints.</p><p><u>GraphQL</u><br /><a href="https://github.com/graphql/graphql-spec">GraphQL</a> is a type system, query language, etc, accessible through a single endpoint that only returns what is asked of it and no surplus information. It's a spec and there are implementations in a number of languages. The graph bit comes in insofar as a "query is a path in the graph, going from the root type to its subtypes until we reach scalar types with no subfields." [<a href="https://dev.to/bogdanned/the-graph-in-graphql-1l99">Bogdan Nedelcu</a>]</p><p><u>LLVM</u><br />LLVM is an open source tool chain written in C++. The 'VM' in LLVM originally stood for Virtual Machine but these days this is no longer the case. Instead of being a virtual machine, it turns any major language into a common intermediate code that can then be turned into machine code. </p><p><u>GraalVM</u><br /><a href="https://github.com/oracle/graal">GraalVM</a> is an open source JDK and JRE written in Java itself and has its roots in project Maxine. But it's more than that. It offers compilation to native code as well as supporting polyglot code via its Truffle framework, a language-agnostic AST.</p><p><u>Quarkus</u><br />Based on GraalVM (above), <a href="https://github.com/quarkusio/quarkus">Quarkus</a> is an open source Java framework tailored for Kubernetes. Since the JVM code is natively compiled, startup and memory sizes are small.</p><p><u>Spring Boot</u><br />Is an "opinionated" Java framework that favours convention-over-configuration and runs Spring apps with the minimum of fuss.</p><p><u>Python/Java Interop</u><br />Together, Python and Java dominate the data engineering landscape. These languages can interoperate via <a href="https://github.com/py4j/py4j">Py4J</a>, which uses sockets to allow Python to invoke Java code, and <a href="https://github.com/jython/jython">Jython</a>, which runs Python code wholly inside the JVM. Py4J is used extensively in Spark to allow PySpark devs to talk to Spark JVMs. <br />Jython, unfortunately, does not support Python 3.</p><p><u>Project Nessie</u><br />Nessie is an open source, JVM project that promises to do to big data what Git did to code: versioning, branching etc. 
<p><u>Project Nessie</u><br />Nessie is an open source, JVM project that promises to do to big data what Git did to code: versioning, branching etc. It apparently sits nicely on top of <a href="https://projectnessie.org/features/intro/">Iceberg and Databricks</a>.</p><p>The <a href="https://github.com/treeverse/lakeFS">lakeFS</a> project is an open source, Go project that offers similar functionality.</p><p><u>Cloud native CI/CD</u><br /><a href="https://tekton.dev/">Tekton</a> is a CI/CD framework written in <a href="https://github.com/tektoncd">GoLang</a>.<br /><a href="https://github.com/argoproj/argoproj">Argo</a> is a Go-based, Kubernetes-native tool. For instance, it handles rolling deployments by building on K8s' <span style="font-family: courier; font-size: x-small;">RollingUpdate</span> strategy, which does not natively control traffic flow during an update. <br />CircleCI seems to be mostly closed source.<br /><br /><u>Pipelines</u><br />Interestingly, CI/CD and data pipelines both use directed acyclic graphs (DAGs) but with very different intent. User Han on Discord eloquently spelled out the architectural distinction:</p><p></p><blockquote>Specifically the reason is in batch data [Pipeline] processing, we tend to scale things out horizontally by a whole lot, sometimes using GPUs. This is not a common feature supported by CI/CD workflow tools. In summary: <p>Jenkins, CodePipeline, Github Actions, TeamCity, Argo, etc ==> used to build DAGs for CI/CD, tends to have shorter run time, less compute requirement, and fairly linear in dependencies.</p><p>Airflow, Dagster, Prefect, Flyte, etc ==> used to build data and/or machine learning pipelines. It tend to have longer run time, larger horizontal scaling needs, and sometimes complex dependencies. Data pipelines also sometimes have certain needs, e.g., backfilling, resume, rerun, parameterization, etc that's not common in CI/CD pipelines</p></blockquote><p></p><div>Interestingly, Luigi looks like it's dead, as even its creator says on <a href="https://twitter.com/bernhardsson/status/1508428757874393094">Twitter</a>.</div><div><br /></div><div><u>Istio</u><br /><a href="https://github.com/istio/istio">Istio</a> is an open source GoLang project that transparently provides "a uniform way to integrate microservices, manage traffic flow across microservices, enforce policies and aggregate telemetry data."<br /><br /></div>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-66145212273278764562023-06-07T02:37:00.001-07:002023-12-05T01:26:04.010-08:00SBT cheat sheet<p>To see the <b>dependency tree</b>, run:<br /></p><pre style="font-family: courier; font-size: x-small;">sbt compile:dependencyTree</pre><div>(see the answer of 13/12/21 on <a href="https://stackoverflow.com/questions/25519926/how-to-see-dependency-tree-in-sbt">this</a> SO question)</div>
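<p>Recent sbt versions bundle this task, so <span style="font-family: courier; font-size: x-small;">Compile / dependencyTree</span> works out of the box. For older sbt versions the same tree comes from the sbt-dependency-graph plugin; a sketch of <span style="font-family: courier; font-size: x-small;">project/plugins.sbt</span> (check the current plugin version before copying this):</p><pre style="font-family: courier; font-size: x-small;">// project/plugins.sbt
addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.9.2")</pre>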
<p>You might want to tell SBT to use wider margins if it truncates the tree. Do that with [<a href="https://stackoverflow.com/questions/37412934/how-to-make-sbt-not-truncate-its-output">SO</a>]:</p><pre style="font-family: courier; font-size: x-small;">val defaultWidth = 40
val maxColumn = math.max(JLine.usingTerminal(_.getWidth), defaultWidth) - 8</pre><div>To coax a dependency from Scala 2 to 3, use something like this:</div><div><br /></div><div><span style="font-family: courier; font-size: xx-small;">
libraryDependencies ++= List(
("io.laserdisc" %% "fs2-aws-s3" % "5.0.2").<b>withCrossVersion(CrossVersion.for3Use2_13)</b>
)</span><span>
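</span><div><br /></div><div>An equivalent form uses sbt's <span style="font-family: courier; font-size: x-small;">cross</span> method on the module (a sketch, same library as above):</div><div><br /></div><pre style="font-family: courier; font-size: x-small;">libraryDependencies += ("io.laserdisc" %% "fs2-aws-s3" % "5.0.2")
  .cross(CrossVersion.for3Use2_13)</pre><span>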
</span><div><br /></div><div>To use the classpath of a particular module, use something like this [see <a href="https://stackoverflow.com/questions/62684718/can-not-import-3rd-party-libs-into-sbt-console">SO</a>]:</div></div><div><br /></div><div><span style="font-family: courier; font-size: x-small;">sbt <b>"project core"</b> console</span></div><div><br /></div><div>In this particular case, we're opening a <b>REPL</b> with the classpath of a given module.</div><div><br /></div><div>Use versionScheme (SBT <a href="https://www.scala-sbt.org/1.x/docs/Publishing.html#Version+scheme">docs</a>) if you're writing libraries. This hints at how to handle clashing dependencies. Otherwise, you might see errors when depending on other libraries that "can be overridden using libraryDependencySchemes or evictionErrorLevel" [<a href="https://www.scala-lang.org/blog/2021/02/16/preventing-version-conflicts-with-versionscheme.html">Scala-Lang</a>]</div><div><br /></div><div>One last thing: SBT tests run in parallel <i>by default</i>. This can ruin integration tests so you might want to try <a href="https://stackoverflow.com/questions/54203972/how-to-run-test-suites-sequentially-in-scalatest-sbt">this</a> workaround [SO] to ensure your tests run serialized.</div><div><br /></div>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-53176989094470468032023-05-31T02:46:00.000-07:002023-05-31T02:46:22.338-07:00Spark Catalysts<p>I've been trying to see how Spark works under the covers. The TL;DR is that it services your queries by dynamically writing Java code on the driver that it then compiles with <a href="http://janino-compiler.github.io/janino/">Janino</a> before sending it over the wire to the executors.<br /><br />Let's take this Scala code on the Spark CLI:<br /></p><p><span style="font-family: courier; font-size: xx-small;">val ds = spark.createDataFrame(List(("a", 1), ("b", 2), ("c", 4)))<br />ds.writeTo("spark_file_test_writeTo").create()<br />spark.sqlContext.sql("update spark_file_test_writeTo set _2=42")</span></p><div>Pretty simple but what goes on deep down is complex. First, Spark uses an <a href="https://www.antlr.org/">Antlr</a> lexer and a parser (a lexer tokenizes; a parser builds an <a href="https://en.wikipedia.org/wiki/Abstract_syntax_tree">AST</a>) to turn that ugly SQL statement into a tree of Scala <span style="font-family: courier; font-size: x-small;">case class</span>es. Then it creates the Java code in <span style="font-family: courier; font-size: x-small;">WholeStageCodegenExec</span> (<a href="https://github.com/apache/spark/blob/400655957812c34b45b923a27eac05455c1c0828/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala#L663">source</a>). 
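<br /><br />You can dump this generated code from the same Spark CLI session; a sketch, using the implicits in <span style="font-family: courier; font-size: x-small;">org.apache.spark.sql.execution.debug</span> (the exact output varies between Spark versions):<br /><br /><span style="font-family: courier; font-size: xx-small;">import org.apache.spark.sql.execution.debug._<br /><br />// print the whole-stage generated Java for a query on the table created above<br />spark.sqlContext.sql("select _1, _2 from spark_file_test_writeTo where _2 > 1").debugCodegen()</span><br /><br />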
In this Java, you'll see a subclass of <span style="font-family: courier; font-size: x-small;"><a href="https://github.com/apache/spark/blob/master/sql/core/src/main/java/org/apache/spark/sql/execution/BufferedRowIterator.java">BufferedRowIterator</a></span> that looks something like:</div><div><br /><div><span style="font-family: courier; font-size: xx-small;">columnartorow_mutableStateArray_3[1] = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(4, 64);</span></div></div><div><span style="font-family: courier; font-size: xx-small;">...<br /></span><div><span style="font-family: courier; font-size: xx-small;">columnartorow_mutableStateArray_3[1].<b>write(1, 42)</b>;</span></div></div><div><br /></div><div>in a method called <span style="font-family: courier; font-size: x-small;">processNext</span>. That <span style="font-family: courier; font-size: x-small;">42</span> is Spark setting our value for a row. If we added a <span style="font-family: courier; font-size: x-small;">where</span> clause in our SQL, you'd see the generated code branching. That is, the generated code can access all the other fields in a row. </div><div><br /></div><div>If you <a href="https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-SparkPlan-WholeStageCodegenExec.html">import an implicit</a>, you can run <span style="font-family: courier; font-size: x-small;">debugCodeGen()</span> on the CLI to see the code more easily. </div><div><br /></div><div><br /></div>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0tag:blogger.com,1999:blog-372456867628019773.post-53913996491831868922023-05-26T03:30:00.004-07:002023-06-09T01:44:26.893-07:00Spark and Iceberg<p>Here are some notes I took when playing with Apache <a href="https://iceberg.apache.org/">Iceberg</a> plugged into Spark. (Update: Iceberg was already supported by <a href="https://cloud.google.com/blog/products/data-analytics/announcing-apache-iceberg-support-for-biglake">Google</a> but is now supported by AWS's Athena for Apache Spark - see <a href="https://aws.amazon.com/about-aws/whats-new/2023/06/amazon-athena-apache-spark-hudi-iceberg-delta-lake/">here</a>).</p><p>I'm running Spark 3.3 and Iceberg 1.2.1 with:<br /></p><p><span style="font-family: courier; font-size: xx-small;">./spark-shell --packages org.apache.iceberg:<b>iceberg-spark-runtime-3.3</b>_2.13:1.2.1\<br /> --conf spark.driver.extraJavaOptions=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005\<br /> --conf <b>spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions</b> <b>\</b><br /> --conf spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog \<br /> --conf spark.sql.catalog.spark_catalog.type=hive \<br /> --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \<br /> --conf spark.sql.catalog.local.type=hadoop \<br /> --conf <b>spark.sql.catalog.local.warehouse=/tmp/warehouse</b> \<br /> --conf spark.sql.defaultCatalog=local</span></p><p>This does a few things.</p><p>First, notice that the version of <span style="font-family: courier; font-size: x-small;">iceberg-spark-runtime</span> must be the same major/minor version of the Spark you're running against. The library will automatically be downloaded by Spark if you don't already have it.</p><p>Data is saved in the directory defined by <span style="font-family: courier; font-size: x-small;">spark.sql.catalog.local.warehouse</span>. 
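<br /><br />For example, with the shell configured as above, a table can be created and populated like this (a sketch; the <span style="font-family: courier; font-size: x-small;">db.events</span> name and its columns are made up for illustration):<br /><br /><span style="font-family: courier; font-size: xx-small;">spark.sql("CREATE TABLE local.db.events (id BIGINT, payload STRING) USING iceberg")<br />spark.sql("INSERT INTO local.db.events VALUES (1, 'a'), (2, 'b')")<br />spark.sql("SELECT * FROM local.db.events").show()</span><br /><br />The table's files then appear under <span style="font-family: courier; font-size: x-small;">/tmp/warehouse/db/events</span>. 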
Inside will be 2 directories, <span style="font-family: courier; font-size: x-small;">data</span> and <span style="font-family: courier; font-size: x-small;">metadata</span>.</p><p><u>Spark code</u></p><p>This took me a few hours to get a good feel for since there are lots of <a href="https://en.wikipedia.org/wiki/Thunk">thunks</a> passed around and functionality is only executed when <span style="font-family: courier; font-size: x-small;">lazy</span> values are touched. This is just the gist of what's going on:</p><p>Code-wise, the interesting functionality appears to be in <span style="font-family: courier; font-size: x-small;">RuleExecutor.execute</span> as each <span style="font-family: courier; font-size: x-small;">Rule[_]</span> is executed via <span style="font-family: courier; font-size: x-small;">apply</span>. The interesting code in the Iceberg Spark extension is <span style="font-family: courier; font-size: x-small;">apply</span>d here as they all extend <span style="font-family: courier; font-size: x-small;">Rule[LogicalPlan]</span>.</p><p><span style="font-family: courier; font-size: x-small;">LogicalPlan</span> is a type of <span style="font-family: courier; font-size: x-small;">QueryPlan</span>. <span style="font-family: courier; font-size: x-small;">QueryPlan</span> can be either logical or physical, the latter being an extension of <span style="font-family: courier; font-size: x-small;">SparkPlan</span>. You can spot these leaves in an AST quite easily as the naming convention says they end with <span style="font-family: courier; font-size: x-small;">Exec</span>. And it appears to be <span style="font-family: courier; font-size: x-small;">V2TableWriteExec</span> where the driver hands control over to the executors.</p><p>The tree of <span style="font-family: courier; font-size: x-small;">LogicalPlans</span> is traversed in <span style="font-family: courier; font-size: x-small;">CheckAnalysis.checkAnalysis</span>. But it's the instantiation of a <span style="font-family: courier; font-size: x-small;">Dataset</span> where its <span style="font-family: courier; font-size: x-small;">logicalPlan</span> references the lazy <span style="font-family: courier; font-size: x-small;">QueryExecution.commandExecuted</span>, causing it to invoke <span style="font-family: courier; font-size: x-small;">eagerlyExecuteCommands(analyzed)</span>. </p><p>A notable sub-type of <span style="font-family: courier; font-size: x-small;">LogicalPlan</span> is <span style="font-family: courier; font-size: x-small;">Command</span>. This represents "a non-query command to be executed by the system".</p><p>Since Spark 3, an interface is available that is "responsible for creating and initializing the actual data writer at executor side. Note that, the writer factory will be serialized and sent to executors, then the data writer will be created on executors and do the actual writing." 
[<span style="font-family: courier; font-size: small;">DataWriterFactory</span> <a href="https://spark.apache.org/docs/3.3.1/api/java/org/apache/spark/sql/connector/write/DataWriterFactory.html">docs</a>]</p><p>Finally, making sense of the AST was problematic when the flow disappeared into <span style="font-family: courier; font-size: x-small;">WholeStageCodeGen</span> as Spark then rolls the tree up and <a href="https://javaagile.blogspot.com/2018/12/more-spark-query-plans.html">converts</a> it to JVM bytecode.</p><p><u>Iceberg code</u></p><p>All the Iceberg initialization happens in <span style="font-family: courier; font-size: x-small;"><a href="https://github.com/apache/iceberg/blob/master/spark/v3.2/spark-extensions/src/main/scala/org/apache/iceberg/spark/extensions/IcebergSparkSessionExtensions.scala">IcebergSparkSessionExtensions</a></span>. This subverts Spark's usual functionality and injects Iceberg specific functionality. One of the things it can inject is an <span style="font-family: courier; font-size: x-small;">IcebergSparkSqlExtensionsParser</span> that visits the AST tree as it parses a SQL String to create a <span style="font-family: courier; font-size: small;">LogicalPlan</span>. </p><p>Iceberg also provides its own implementation of <span style="font-family: courier; font-size: x-small;">DataWriterFactory</span> so it can use its own <span style="font-family: courier; font-size: x-small;">Table</span> implementation under the Spark covers that allows (for instance) its own configuration for <span style="font-family: courier; font-size: x-small;">TableScan</span>s.</p><p>It's Iceberg's <span style="font-family: courier; font-size: x-small;">SnapshotProducer.commit()</span> that's been injected into the Spark machinery that creates the manifest files.</p><p><br /></p>Phillip Henryhttp://www.blogger.com/profile/08460042407514133056noreply@blogger.com0