Monday, November 3, 2025

AWS, Kubernetes and more

Setting up a 3-node Kubernetes cluster in AWS is as simple as:

eksctl create cluster --name $CLUSTERNAME --nodes 3

but this really hides a huge amount of what is going on. Apart from IAM, eksctl automatically creates: 
  • a new Virtual Private Cloud (VPC) in which the worker nodes sit (the EKS control plane itself lives in an AWS-managed VPC and reaches into yours via ENIs). A VPC is "a logically isolated and secure network environment that is separate from the rest of the AWS cloud" [1]
  • public and private subnets spread across the availability zones (best practice if you want high availability); the cluster below ended up with six in total. By putting the worker nodes in the private subnets, you ensure they cannot be maliciously scanned from the internet.
  • all necessary NAT Gateways to allow the private subnets to access the internet
  • an Internet Gateway allowing the internet to talk to your public subnets.
  • Route Tables, which are just rules for network traffic: "Routers use a route table to determine the best path for data packets to take between networks" [2]
You can see some details with:

$ eksctl get cluster --name=$CLUSTERNAME --region=$REGION
NAME VERSION STATUS CREATED VPC SUBNETS SECURITYGROUPS PROVIDER
spark-cluster 1.32 ACTIVE 2025-10-27T10:36:02Z vpc-REDACTED subnet-REDACTED,subnet-REDACTED,subnet-REDACTED,subnet-REDACTED,subnet-REDACTED,subnet-REDACTED sg-REDACTED EKS
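
To dig into the networking pieces listed above, you can query them directly with the AWS CLI. A minimal sketch, assuming $REGION is set and using a placeholder for the (redacted) VPC id reported by eksctl:

VPC_ID=vpc-REDACTED   # substitute the real id from the eksctl output above

# Subnets: MapPublicIpOnLaunch is true for the public ones
aws ec2 describe-subnets --region $REGION \
    --filters "Name=vpc-id,Values=$VPC_ID" \
    --query "Subnets[].[SubnetId,AvailabilityZone,MapPublicIpOnLaunch]" \
    --output table

# Route tables: public routes send 0.0.0.0/0 to an igw-*, private ones to a nat-*
aws ec2 describe-route-tables --region $REGION \
    --filters "Name=vpc-id,Values=$VPC_ID" \
    --query "RouteTables[].Routes[].[DestinationCidrBlock,GatewayId,NatGatewayId]" \
    --output table

# NAT gateways (this subcommand uses --filter rather than --filters)
aws ec2 describe-nat-gateways --region $REGION \
    --filter "Name=vpc-id,Values=$VPC_ID" \
    --query "NatGateways[].[NatGatewayId,SubnetId,State]" \
    --output table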

Terraform

If you use Terraform, you might need to configure your local kubectl to talk to the EKS cluster by hand.

First, back up your old config with:

mv ~/.kube ~/.kube_bk

then run:

aws eks update-kubeconfig --name $CLUSTERNAME --region $REGION

But if you are running the aws CLI via Docker, this will have updated ~/.kube/config inside the container, not on the host. So, run:

docker run --rm -it  -v ~/.aws:/root/.aws -v ~/.kube:/root/.kube  amazon/aws-cli eks update-kubeconfig --name $CLUSTERNAME --region $REGION

Now it will write to your host's config, but even then you'll have to change the command at the end of the file to point to a non-Docker version (yes, you'll have to install the AWS CLI binary locally, preferably in a bespoke directory so you can continue using the Docker version).
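
For reference, this is roughly the stanza to look for; a minimal sketch (the exact apiVersion and argument list vary with the aws CLI version, and /opt/aws-cli/aws is just a hypothetical install path):

grep -B 2 -A 12 'exec:' ~/.kube/config
#   user:
#     exec:
#       apiVersion: client.authentication.k8s.io/v1beta1
#       args:
#       - --region
#       - ...
#       - eks
#       - get-token
#       - --cluster-name
#       - spark-cluster
#       command: aws        <- change this to e.g. /opt/aws-cli/aws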

Another issue I had was that the connection details for the new EKS cluster were different to what was in my ~/.kube/config. This in itself was not a problem, as you can add a local-exec provisioner to refresh it (using Java and CDKTF):

LocalExecProvisioner.builder()
    .when("create") // Run only when the resource is created
    .command(String.format(
        "aws eks update-kubeconfig --name %s --region %s",
        CLUSTER_NAME,
        AWS_REGION)
    )
    .type("local-exec")
    .build()

This provisioner depends on the EksCluster and the DataAwsEksClusterAuth, and the resources that were failing are in turn made to depend on it.

However, this introduced other problems. 

First, I tried to get the reading of ~/.kube/config to depends_on the EKS cluster. That way, I'd only read it once the cluster was up and running, right? Well, no. This introduces a circular dependency, and in any case the config is read before the cluster has even been created.

Any fiddling with the dependency tree leads to reading ~/.kube/config while it's stale. So, you need to initialize the Kubernetes provider (whose configuration is otherwise implicit and global) directly with:

String base64CertData = cluster.getCertificateAuthority().get(0).getData();
String encodedCert    = com.hashicorp.cdktf.Fn.base64decode(base64CertData);
KubernetesProvider kubernetesProvider = KubernetesProvider.Builder.create(this, "kubernetes")
    .host(cluster.getEndpoint())
    .clusterCaCertificate(encodedCert)
    .token(eksAuthData.getToken()) // Dynamically generated token
    .build();

Strangely, you still need to define the environment variable KUBE_CONFIG_PATH, as some resources need it, albeit only after the file it points to has been correctly amended with the current cluster's details.
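
A minimal sketch of how that can be wired up on the command line (the paths are the defaults; adjust as needed):

aws eks update-kubeconfig --name $CLUSTERNAME --region $REGION   # refresh the file first
export KUBE_CONFIG_PATH=~/.kube/config                           # what those resources read
tofu apply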

Zombie Clusters

Running:

tofu destroy -auto-approve

just kept hanging. So, I ran:

tofu state list | grep -E "(nat_gateway|eip|eks_cluster)"

and found some EKS components running that I had to kill with:

tofu destroy -auto-approve -target=...
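
If there are several stragglers, they can be picked off in one go; a minimal sketch, assuming the state addresses contain no spaces:

tofu state list | grep -E "(nat_gateway|eip|eks_cluster)" | while read -r addr; do
    tofu destroy -auto-approve -target="$addr"
done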

Finally, kubectl get pods barfed with no such host, since the kubeconfig still pointed at the deleted control plane's endpoint.
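
The stale entries can then be pruned from the kubeconfig by hand; a minimal sketch (the context, cluster and user names below are placeholders, and delete-user needs a reasonably recent kubectl):

kubectl config get-contexts                    # find the dead cluster's context
kubectl config delete-context CONTEXT_NAME
kubectl config delete-cluster CLUSTER_NAME
kubectl config delete-user USER_NAME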

Load balancers

The next problem was that the tofu destroy action kept saying:

aws_subnet.publicSubnet2: Still destroying... [id=subnet-XXX, 11m50s elapsed]

So, I ran:

aws ec2 describe-network-interfaces \
    --filters "Name=subnet-id,Values=subnet-XXX" \
    --query "NetworkInterfaces[].[NetworkInterfaceId, Description, InterfaceType, Status]" \
    --output table

and got an ENI that I tried to delete with:

aws ec2 delete-network-interface --network-interface-id eni-XXX

only to be told that it was still in use. Ho, hum:

$ aws ec2 describe-network-interfaces \
    --network-interface-ids eni-XXX \
    --query "NetworkInterfaces[0].{ID:NetworkInterfaceId, Description:Description, Status:Status, Attachment:Attachment}" \
    --output json
...
        "InstanceOwnerId": "amazon-elb",
...

So, let's see what that load balancer is:

$ aws elb describe-load-balancers \
    --query "LoadBalancerDescriptions[?contains(Subnets, 'subnet-XXX')].[LoadBalancerName]" \
    --output text

which gives me its name and now I can kill it with:

aws elb delete-load-balancer --load-balancer-name NAME
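
Putting those two steps together, a minimal sketch that clears out any classic ELBs still parked in the stuck subnet (subnet-XXX being the redacted id from the error above; ALBs/NLBs would need the elbv2 subcommands instead):

for lb in $(aws elb describe-load-balancers \
        --query "LoadBalancerDescriptions[?contains(Subnets, 'subnet-XXX')].LoadBalancerName" \
        --output text); do
    aws elb delete-load-balancer --load-balancer-name "$lb"
done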

Finally, the destroy just wasn't working, failing ultimately with:

│ Error: deleting EC2 VPC (vpc-XXX): operation error EC2: DeleteVpc, https response error StatusCode: 400, RequestID: 8412a305-..., api error DependencyViolation: The vpc 'vpc-XXX' has dependencies and cannot be deleted.
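
You can at least see what those dependencies are before giving up; a minimal sketch, with vpc-XXX being the redacted id from the error:

# Anything still attached to the VPC shows up here
aws ec2 describe-network-interfaces \
    --filters "Name=vpc-id,Values=vpc-XXX" \
    --query "NetworkInterfaces[].[NetworkInterfaceId,Description,Status]" \
    --output table

# Non-default security groups also block deletion
aws ec2 describe-security-groups \
    --filters "Name=vpc-id,Values=vpc-XXX" \
    --query "SecurityGroups[].[GroupId,GroupName]" \
    --output table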

Just going into the web console and deleting it there was the simple but curious solution.

[1] Architecting AWS with Terraform
[2] The Self-Taught Cloud Computing Engineer
