K3s: High Availability & Scaling
Understanding Horizontal Pod Autoscaling (HPA)
Your application is running fine—until traffic spikes, and suddenly everything catches fire. Enter Horizontal Pod Autoscaling (HPA), which automatically adjusts the number of running Pods based on demand.
What is HPA, and why is it important?
HPA dynamically scales your application up or down based on resource usage, ensuring you don’t have too many idle Pods eating resources—or too few, leading to performance meltdowns.
Configuring HPA to automatically scale Pods based on CPU usage
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
Applying and verifying HPA
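Before applying the manifest, one prerequisite is easy to miss: averageUtilization is measured against the CPU requests of the target Pods, so myapp-deployment's containers must declare them (K3s ships with metrics-server, so the metrics pipeline itself is already in place). Here is a minimal sketch of the relevant part of the Deployment's Pod template, with an illustrative container name and image:

    spec:
      containers:
      - name: myapp          # illustrative container name
        image: myapp:1.0     # illustrative image
        resources:
          requests:
            cpu: 200m        # HPA computes utilization as a percentage of this
          limits:
            cpu: 500m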
kubectl apply -f hpa.yaml
kubectl get hpa
Stress testing autoscaling with kubectl run stress
Want to see HPA in action? Stress test your cluster:
kubectl run stress --image=busybox -- /bin/sh -c "while true; do :; done"
This spins up a busy-loop Pod that burns CPU on whichever node schedules it. Keep in mind, though, that the HPA only reacts to CPU consumed by the Pods of myapp-deployment itself, so the surest way to trigger scaling is to send load at the application. Just don't forget to clean up after!
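To generate load the HPA will actually notice, hammer the application itself. A rough sketch, assuming myapp-deployment is exposed through a Service named myapp-service (adjust the name and port to your setup):

kubectl run load-generator --image=busybox --restart=Never -- /bin/sh -c "while true; do wget -q -O- http://myapp-service; done"

When you're done, clean up both test Pods:

kubectl delete pod stress load-generator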
Implementing K3s Multi-Master Setup
If your only master node goes down, you lose the control plane: no kubectl, no scheduling, and no self-healing until it comes back. A multi-master setup ensures redundancy and keeps the control plane running even if a node bites the dust.
Why use a multi-master setup for high availability?
- Redundancy: No single point of failure.
- Load balancing: Evenly distributes control plane tasks.
- Better sleep quality: Because waking up to a crashed cluster is a nightmare.
Setting up multiple K3s master nodes
curl -sfL https://get.k3s.io | sh -s - server --cluster-init
Joining additional control plane nodes
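The token for joining comes from the first server; K3s writes it to /var/lib/rancher/k3s/server/node-token. Grab it there, and keep the server count odd (three or more) so the embedded etcd datastore can maintain quorum:

sudo cat /var/lib/rancher/k3s/server/node-token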
curl -sfL https://get.k3s.io | sh -s - server --server https://<master-ip>:6443 --token <token>
Verifying the multi-master cluster
kubectl get nodes
If you see multiple master nodes, congratulations! You now have a highly available control plane.
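If you want to double-check which nodes are acting as servers, you can filter on the control-plane role label K3s applies to them (assuming the default labels haven't been changed):

kubectl get nodes -l node-role.kubernetes.io/control-plane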
Managing Node Scaling in K3s Clusters
Scaling your application is one thing; scaling the actual cluster is another.
Manually adding and removing worker nodes
curl -sfL https://get.k3s.io | K3S_URL=https://<master-ip>:6443 K3S_TOKEN=<token> sh -
Run this on any machine you want to join as a worker node.
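Removing a worker is the reverse: drain it, delete its Node object, and optionally uninstall K3s on the machine. A sketch, assuming the agent was installed with the standard install script:

# Move workloads off the node and remove it from the cluster
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
kubectl delete node <node-name>
# Then, on the machine itself:
/usr/local/bin/k3s-agent-uninstall.sh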
Using Cluster Autoscaler for automatic node scaling
The Cluster Autoscaler automatically adds or removes worker nodes based on demand. If your Pods are unschedulable due to resource constraints, the autoscaler provisions more nodes. Keep in mind that it isn't built into K3s itself and needs an infrastructure provider it can call (a cloud or virtualization API) to actually create and destroy machines.
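The condition it reacts to is easy to spot yourself: Pods stuck in Pending because the scheduler can't find a node with enough room. For example:

# Pods the Cluster Autoscaler would try to make room for
kubectl get pods --all-namespaces --field-selector=status.phase=Pending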
Monitoring node scaling
kubectl get nodes -w
Watch as new nodes join (or disappear) dynamically.
Disaster Recovery Strategies
No matter how many precautions you take, something will break. When it does, you’ll wish you had backups.
Backing up and restoring etcd in K3s
Back up your etcd database to ensure you can restore your cluster if things go south.
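When the embedded etcd datastore is in use (as in the multi-master setup above), K3s also snapshots automatically, every 12 hours by default, keeping the last five. You can tune that with server flags; the schedule and retention values here are just illustrative:

k3s server --cluster-init \
  --etcd-snapshot-schedule-cron "0 */6 * * *" \
  --etcd-snapshot-retention 10

For an on-demand snapshot before risky changes, save one manually: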
k3s etcd-snapshot save --name backup.db
Restoring from a snapshot
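Note that the saved file lands in /var/lib/rancher/k3s/server/db/snapshots/ with the node name and a timestamp appended to the name you chose, so list the snapshots to get the exact path, and stop K3s on that node before resetting:

k3s etcd-snapshot ls
sudo systemctl stop k3s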
k3s server --cluster-reset --cluster-reset-restore-path=/var/lib/rancher/k3s/server/db/snapshots/backup.db
High availability best practices
- Distribute workloads across multiple nodes
- Regularly test backup and restore processes
- Implement monitoring and alerts for failures
Hands-On Exercise
Time to put your skills to the test:
- Set up Horizontal Pod Autoscaling (HPA) for a workload and watch it scale.
- Deploy a K3s Multi-Master setup to ensure high availability.
- Configure node scaling and test Cluster Autoscaler.
- Simulate and recover from a cluster failure using backups.
Master this, and your K3s cluster will be ready for anything—even the dreaded 3 AM outage. 🚀