3 min read | by Jordi Prats
On Kubernetes, scaling an application is just a matter of defining how many replicas we want:
$ kubectl scale deployment/demo --replicas=5
deployment.apps/demo scaled
Having to manually adjust the number of replicas is not really practical. This is where the HorizontalPodAutoscaler (HPA) comes into play.
An HPA can be configured to use resource metrics (metrics.k8s.io), custom metrics (custom.metrics.k8s.io) and external metrics (external.metrics.k8s.io). The most basic usage relies on the resource metrics provided by the metrics-server, which needs to be installed on the cluster. We can check its availability using kubectl get apiservice:
$ kubectl get apiservice | grep metrics
v1beta1.metrics.k8s.io   default/metrics-server   True   15d
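If the API service is not there, metrics-server has to be deployed first. As a sketch, assuming the upstream kubernetes-sigs release manifest (the exact URL and version can change over time), the installation and a quick sanity check could look like this:
$ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
$ kubectl top pod
If kubectl top pod returns CPU and memory figures for the Pods, the resource metrics pipeline is working.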
Once we have checked that it is available, we need to make sure the Pods have a resource request configured (or at least that the namespace has a LimitRange in place, as sketched below). We can verify it by taking a look at the Pod definition:
$ kubectl get pod ampa-voting-5bd8449967-sstrw -o yaml
apiVersion: v1
kind: Pod
metadata:
  name: ampa-voting-5bd8449967-sstrw
spec:
  affinity: {}
  containers:
  - image: jordiprats/pet2cattle
    name: pet2cattle
    ports:
    - containerPort: 8008
      protocol: TCP
    resources:
      limits:
        cpu: "2"
        memory: 8000Mi
      requests:
        cpu: 200m
        memory: 1000Mi
(...)
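If the containers don't define any requests themselves, a LimitRange can inject namespace-wide defaults so the HPA still has a baseline to compute the usage percentage against. A minimal sketch (the name, namespace and values are just illustrative):
apiVersion: v1
kind: LimitRange
metadata:
  name: default-requests
  namespace: default
spec:
  limits:
  - type: Container
    # requests applied to containers that don't define their own
    defaultRequest:
      cpu: 100m
      memory: 256Mi
    # limits applied to containers that don't define their own
    default:
      cpu: 500m
      memory: 512Mi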
Once we have resource requests in place, we can create a new HPA imperatively using kubectl autoscale, specifying which deployment we want to control. Its most relevant options are --min, --max and --cpu-percent.
In the following example we are going to create an HPA that will keep the number of replicas between 2 and 10, scaling the application when the actual CPU usage goes beyond 80% of the requested resources:
$ kubectl autoscale deployment ampa-voting --min=2 --max=10 --cpu-percent=80
horizontalpodautoscaler.autoscaling/ampa-voting autoscaled
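The same HPA can also be defined declaratively, which makes it easier to keep under version control. A sketch of the equivalent manifest using the autoscaling/v2 API (older clusters might only expose autoscaling/v2beta2) would be:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ampa-voting
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ampa-voting
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        # scale out when the average CPU usage goes beyond 80% of the requested CPU
        type: Utilization
        averageUtilization: 80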
Once we have it in place, it will take a little while for the HPA to collect metrics; until then the target shows as <unknown>:
$ kubectl get hpa
NAME          REFERENCE                TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
ampa-voting   Deployment/ampa-voting   <unknown>/80%   2         10        0          7s
After that it will start scaling the deployment based on the CPU usage of the existing Pods:
$ kubectl get hpa
NAME          REFERENCE                TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
ampa-voting   Deployment/ampa-voting   29%/80%   2         10        4          10m
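To follow how the HPA reacts over time, or to understand why it took a given scaling decision, we can watch it or check its events with kubectl describe:
$ kubectl get hpa ampa-voting -w
$ kubectl describe hpa ampa-voting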
Posted on 01/07/2021