Wednesday, March 19, 2025

Storage on Kubernetes Part 1

... is not easy. And it appears I'm not the only one to think so.
"Storage is hard. Fast, reliable storage is really hard" - Hubt on Discord
These are some notes I made while trying to set up a Kubernetes home lab on spare hardware.
 
K8s approach to Storage

Once more, K8s delegates: just as networking is handled by pluggable drivers, storage is left to external storage drivers rather than managed by Kubernetes itself.

"Applications that require persistent storage would fail to start because the cluster doesn’t yet have a storage driver. Like network drivers, several storage drivers are available for Kubernetes. The Container Storage Interface (CSI) provides the standard that storage drivers need to meet to work with Kubernetes. We’ll use Longhorn, a storage driver from Rancher; it’s easy to install and doesn’t require any underlying hard­ ware like extra block devices or access to cloud­based storage." [1] 

A prerequisite for Longhorn is that I need to run this on my boxes:

sudo apt install -y apt-transport-https open-iscsi nfs-common
sudo systemctl enable --now iscsid
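
As a quick sanity check that the iSCSI prerequisite is actually in place (my own verification step, not something the Longhorn docs insist on):

sudo systemctl is-active iscsid
iscsiadm --version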

Again, I need to allow all connections between my nodes, so time to fiddle with the firewall.
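
With ufw, that can be as blunt as allowing everything from the node subnet (the CIDR below is a placeholder for whatever your LAN actually uses):

sudo ufw allow from 192.168.1.0/24
sudo ufw status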

After installing Longhorn, running this showed that my replica count could not be satisfied:

kubectl -n longhorn-system logs -l app=longhorn-manager

with the error "No available disk candidates to create a new replica of size 10737418240" (that is, 10 GiB). Patching the setting did not seem to help:

kubectl patch setting default-replica-count -n longhorn-system --type='merge' -p '{"value": "1"}'

Neither did:

kubectl edit storageclass longhorn

to edit the numberOfReplicas
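
For diagnosing the "no available disk candidates" error itself, Longhorn records its view of each node's disks in custom resources, which can be inspected to see whether any disk is actually schedulable (the resource names here assume a standard Longhorn install):

kubectl -n longhorn-system get nodes.longhorn.io
kubectl -n longhorn-system describe nodes.longhorn.io <node-name>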

(Note that Longhorn comes with a GUI that you can see if you port forward with:

kubectl port-forward -n longhorn-system svc/longhorn-frontend 8080:80

but this didn't help either).

So, instead, I downloaded the YAML, edited the numberOfReplicas by hand and deployed to the cluster.
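
The relevant part of the manifest ends up looking roughly like this (trimmed down, and the exact layout varies between Longhorn versions, so treat it as a sketch):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "1"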

Unfortunately, when I deleted my kafka and longhorn-system namespaces, the command would not terminate. It seemed that the kafka PVCs depended on PVs backed by Longhorn. 

Cyclic dependencies
I finally managed to kill the longhorn-system namespace, which was stuck in Terminating, by manually deleting the PVs with kubectl edit, then running

kubectl get namespace longhorn-system -o json > ns.json

deleting the finalizers in ns.json by hand and running:

kubectl replace --raw "/api/v1/namespaces/longhorn-system/finalize" -f ns.json
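
For the PVs themselves, the same finalizer-removal trick can be done non-interactively with a patch instead of kubectl edit (the PV name is a placeholder):

kubectl patch pv <pv-name> -p '{"metadata":{"finalizers":null}}'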

For the PVCs, I had to do the same thing, but since their deletion was blocked by Longhorn's admission webhooks, I needed to delete the webhook configurations first with:

kubectl get mutatingwebhookconfigurations
kubectl get validatingwebhookconfigurations
kubectl delete mutatingwebhookconfiguration <webhook-name>
kubectl delete validatingwebhookconfiguration <webhook-name>
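
With the webhooks out of the way, the PVC equivalent of removing the finalizers is a patch along the same lines (names are placeholders):

kubectl patch pvc <pvc-name> -n <namespace> -p '{"metadata":{"finalizers":null}}'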

Finally, 

kubectl get all -n <namespace>
kubectl get pvc -n <namespace>

indicated that everything had been cleaned up.
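
Those commands don't show cluster-scoped leftovers, though; Longhorn also installs a pile of CRDs, which can be listed before reinstalling with:

kubectl get crd | grep longhorn.io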

Phew! But now I'm back where I started and this was a lot of work (albeit a great way to understand Kubernetes). 

I then deployed Longhorn again, only to have it complain "failed to get backup target default: backuptarget.longhorn.io". Oh, boy.
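
If I dig into that, the place to start is presumably the BackupTarget custom resource the error names, e.g.:

kubectl -n longhorn-system get backuptargets.longhorn.io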
"I like microk8s for having everything out of the box mostly running and turnable on with some little effort. Except metallb 😛" Darkwind on Discord
MicroK8s is a lightweight Kubernetes implementation that is great for CI/CD and (apparently) just works out-of-the-box. I might just install that...
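
If I do go that route, the install is a snap command away (the addon names here are from the MicroK8s docs and may differ between versions):

sudo snap install microk8s --classic
microk8s status --wait-ready
microk8s enable dns hostpath-storage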

[1] Alan Hohn, The Book of Kubernetes (No Starch Press, 2022)
