Kubernetes: Autoscaling using Prometheus as an external metrics provider

kubernetes hpa prometheus external metrics

5 min read | by Jordi Prats

Using an external metrics provider (Kubernetes 1.10+) we can use a HorizontalPodAutoscaler to automatically scale applications based on any metric collected by Prometheus. Let's take a look at how to configure it.

To test this out we'll need to have prometheus and prometheus-adapter installed. To do so we can use their community-supported helm charts. First we'll need to add the repository:

$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update

Then we'll have to install both charts:

$ helm install prometheus prometheus-community/prometheus --namespace prometheus --create-namespace
$ helm install prometheus-adapter prometheus-community/prometheus-adapter --namespace prometheus

We are going to use the prometheus-adapter to present the Prometheus metrics to the cluster through the Kubernetes metrics API, so that a HorizontalPodAutoscaler can use them for autoscaling.

Let's suppose we have an application that exposes metrics on /metrics. First we'll need to let Prometheus know that this endpoint can be scraped, using the following annotations on the Pod template:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-prometheus-enabled-app
spec:
  selector:
    matchLabels:
      app: metricsdemo
  template:
    metadata:
      annotations:
        prometheus.io/path: "/metrics"
        prometheus.io/scrape: "true"
        prometheus.io/port: "8000"
      labels:
        app: metricsdemo
    spec:
      containers:
        - name: metricsdemo
          image: "jordiprats/demoappprometrics:1.0"
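
To make the role of the three annotations more concrete, here is a minimal Python sketch of how they could be turned into a scrape target. It is a simplified model, not the chart's actual relabeling configuration: the function name scrape_url and the Pod IP 10.0.0.12 are hypothetical, and it assumes plain HTTP and that Pods without a port annotation are simply skipped.

```python
from typing import Optional


def scrape_url(pod_ip: str, annotations: dict) -> Optional[str]:
    """Return the URL that would be scraped for a Pod, or None if it is skipped."""
    # Only Pods that opt in with prometheus.io/scrape: "true" are considered.
    if annotations.get("prometheus.io/scrape") != "true":
        return None
    # Assumption: /metrics is used when no path annotation is present.
    path = annotations.get("prometheus.io/path", "/metrics")
    port = annotations.get("prometheus.io/port")
    if port is None:
        # Simplification: this sketch only handles an explicitly annotated port.
        return None
    return f"http://{pod_ip}:{port}{path}"


# Annotations copied from the Pod template of the Deployment.
annotations = {
    "prometheus.io/path": "/metrics",
    "prometheus.io/scrape": "true",
    "prometheus.io/port": "8000",
}

print(scrape_url("10.0.0.12", annotations))  # http://10.0.0.12:8000/metrics
print(scrape_url("10.0.0.12", {}))           # None
```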

Once Prometheus sees these annotations, it will start scraping the Pod's metrics. The metrics available in Prometheus might not be directly suitable for a HorizontalPodAutoscaler, but we can transform them using rules. To do so we'll have to update the prometheus-adapter values to include the rules we need, using the rules.custom option.

For example, let's assume the application publishes a metricsdemo_requests metric and that we want to expose its rate, calculated over a 2-minute window. To do so we would need to create the following rule:

rules:
  custom:
  - seriesQuery: "metricsdemo_requests"
    resources:
      overrides:
        kubernetes_namespace:
          resource: namespace
        kubernetes_pod_name:
          resource: pod
    name:
      matches: "^(.*)"
      as: "${1}_avg"
    metricsQuery: "sum(rate(<<.Series>>[2m])) by (<<.GroupBy>>)"
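
To see what this rule produces, the following Python sketch imitates the two transformations it describes: renaming the series with the name.matches / name.as pair and filling in the metricsQuery template. It is only an approximation of what the adapter does internally. The helper names are hypothetical, and it assumes that <<.GroupBy>> expands to the kubernetes_pod_name label when the metric is requested for Pods. The adapter also offers a <<.LabelMatchers>> placeholder, which this rule does not use.

```python
import re


def metric_name(series: str, matches: str, as_template: str) -> str:
    """Apply the rule's name.matches / name.as pair to a series name."""
    # The adapter is written in Go, where "${1}" refers to the first capture
    # group; Python's re module spells the same thing "\g<1>".
    python_template = re.sub(r"\$\{(\d+)\}", r"\\g<\1>", as_template)
    return re.sub(matches, python_template, series, count=1)


def render_query(template: str, series: str, group_by: str, label_matchers: str = "") -> str:
    """Fill the placeholders of a metricsQuery template."""
    return (
        template.replace("<<.Series>>", series)
        .replace("<<.LabelMatchers>>", label_matchers)
        .replace("<<.GroupBy>>", group_by)
    )


print(metric_name("metricsdemo_requests", "^(.*)", "${1}_avg"))
# metricsdemo_requests_avg

print(render_query(
    "sum(rate(<<.Series>>[2m])) by (<<.GroupBy>>)",
    series="metricsdemo_requests",
    group_by="kubernetes_pod_name",  # assumption: label mapped to the pod resource
))
# sum(rate(metricsdemo_requests[2m])) by (kubernetes_pod_name)
```

The first result is the metric name that the HorizontalPodAutoscaler will refer to, and the second is the kind of PromQL query that ends up being sent to Prometheus.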

Once we apply this configuration (by updating the prometheus-adapter release with the new values), it will take a while for the metric to become available. We can check whether it's available using the following command:

$ kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1  | python -m json.tool
{
    "kind": "APIResourceList",
    "apiVersion": "v1",
    "groupVersion": "custom.metrics.k8s.io/v1beta1",
    "resources": [
        {
            "name": "pods/metricsdemo_requests_avg",
            "singularName": "",
            "namespaced": true,
            "kind": "MetricValueList",
            "verbs": [
                "get"
            ]
        },
        {
            "name": "namespaces/metricsdemo_requests_avg",
            "singularName": "",
            "namespaced": false,
            "kind": "MetricValueList",
            "verbs": [
                "get"
            ]
        }
    ]
}
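
If we want to script this check instead of reading the JSON by eye, a small Python helper along these lines could do it. This is only a sketch: the function name metric_is_registered is hypothetical, and the sample document is a trimmed copy of the response shown above.

```python
import json


def metric_is_registered(api_resource_list: str, metric: str, resource: str = "pods") -> bool:
    """Return True if <resource>/<metric> is listed in an APIResourceList JSON document."""
    resources = json.loads(api_resource_list).get("resources", [])
    return any(entry.get("name") == f"{resource}/{metric}" for entry in resources)


# Trimmed copy of the response returned by the custom metrics API.
sample = """
{
    "kind": "APIResourceList",
    "groupVersion": "custom.metrics.k8s.io/v1beta1",
    "resources": [
        {"name": "pods/metricsdemo_requests_avg", "namespaced": true},
        {"name": "namespaces/metricsdemo_requests_avg", "namespaced": false}
    ]
}
"""

print(metric_is_registered(sample, "metricsdemo_requests_avg"))  # True
print(metric_is_registered(sample, "some_other_metric"))         # False
```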

Once it's available, we can start creating HorizontalPodAutoscaler objects using it, for example:

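# autoscaling/v1 only supports CPU targets; Pods metrics need autoscaling/v2
# (Kubernetes 1.23+; older clusters can use autoscaling/v2beta2 with the same fields)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: demo-prometheus-enabled-app
  labels:
    {{- include "metricsdemo.labels" . | nindent 4 }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: demo-prometheus-enabled-app
  minReplicas: 3
  maxReplicas: 30
  metrics:
    # The metric is averaged across all the Pods of the target Deployment
    - type: Pods
      pods:
        metric:
          name: metricsdemo_requests_avg
        target:
          type: AverageValue
          averageValue: "10"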

Once we apply it, the HPA will control the number of replicas of the Deployment, so any replica count we set manually with kubectl scale may be overridden by it:

$ kubectl describe hpa metricsdemo -n metricsdemo
Name:                                     metricsdemo
Namespace:                                metricsdemo
(...)
Reference:                                Deployment/metricsdemo
Metrics:                                  ( current / target )
  "metricsdemo_requests_avg" on pods:  0 / 10
Min replicas:                             3
Max replicas:                             30
Deployment pods:                          3 current / 3 desired
Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    ReadyForNewScale  recommended size matches current size
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from pods metric metricsdemo_requests_avg
  ScalingLimited  True    TooFewReplicas    the desired replica count is less than the minimum replica count
Events:           <none>

As soon as the metric exceeds the configured target, the HPA will kick in and change the number of replicas of the Deployment (please notice the Deployment pods line):

$ kubectl describe hpa -n metricsdemo
Name:                                     metricsdemo
Namespace:                                metricsdemo
(...)
Reference:                                Deployment/metricsdemo
Metrics:                                  ( current / target )
  "metricsdemo_requests_avg" on pods:  794791m / 10
Min replicas:                             3
Max replicas:                             30
Deployment pods:                          3 current / 6 desired
Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    SucceededRescale  the HPA controller was able to update the target scale to 6
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from pods metric metricsdemo_requests_avg
  ScalingLimited  True    ScaleUpLimit      the desired replica count is increasing faster than the maximum scale rate
(...)
$ kubectl describe hpa -n metricsdemo
Name:                                     metricsdemo
Namespace:                                metricsdemo
(...)
Reference:                                Deployment/metricsdemo
Metrics:                                  ( current / target )
  "metricsdemo_requests_avg" on pods:  794791m / 10
Min replicas:                             3
Max replicas:                             30
Deployment pods:                          6 current / 6 desired
Conditions:
  Type            Status  Reason             Message
  ----            ------  ------             -------
  AbleToScale     False   FailedUpdateScale  the HPA controller was unable to update the target scale: Operation cannot be fulfilled on deployments.apps "metricsdemo": the object has been modified; please apply your changes to the latest version and try again
  ScalingActive   True    ValidMetricFound   the HPA was able to successfully calculate a replica count from pods metric metricsdemo_requests_avg
  ScalingLimited  True    ScaleUpLimit       the desired replica count is increasing faster than the maximum scale rate
(...)
$ kubectl describe hpa -n metricsdemo
Name:                                     metricsdemo
Namespace:                                metricsdemo
(...)
Reference:                                Deployment/metricsdemo
Metrics:                                  ( current / target )
  "metricsdemo_requests_avg" on pods:  0 / 10
Min replicas:                             3
Max replicas:                             30
Deployment pods:                          24 current / 30 desired
Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    SucceededRescale  the HPA controller was able to update the target scale to 30
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from pods metric metricsdemo_requests_avg
  ScalingLimited  True    TooManyReplicas   the desired replica count is more than the maximum replica count
Events:
  Type     Reason                        Age                    From                       Message
  ----     ------                        ----                   ----                       -------
  Normal   SuccessfulRescale             11m                    horizontal-pod-autoscaler  New size: 3; reason: Current number of replicas below Spec.MinReplicas
(...)
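
To relate the numbers in these outputs to each other, here is a rough Python sketch of the basic calculation described in the Kubernetes documentation: desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), bounded by the minimum and maximum replica counts. The helper names are hypothetical, and the sketch deliberately ignores the tolerance and the scale-up rate limiting that the real controller applies. Note that kubectl prints 794791m in milli-units, that is, 794.791.

```python
import math


def parse_quantity(value: str) -> float:
    """Convert a quantity as printed by kubectl ("794791m" or "10") to a float."""
    if value.endswith("m"):
        # The "m" suffix means milli-units (thousandths).
        return int(value[:-1]) / 1000
    return float(value)


def desired_replicas(current_replicas: int, current_value: float, target_value: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Basic HPA formula, clamped to the configured replica bounds."""
    raw = math.ceil(current_replicas * current_value / target_value)
    return max(min_replicas, min(max_replicas, raw))


current = parse_quantity("794791m")
print(current)                                   # 794.791
print(desired_replicas(3, current, 10, 3, 30))   # 30
print(desired_replicas(3, 0.0, 10, 3, 30))       # 3
```

With 3 replicas and an average of 794.791 against a target of 10, the formula asks for far more than 30 replicas, so the result is capped at the maximum. The intermediate values seen above (such as 6 desired) come from the scale-up rate limit that the ScalingLimited condition reports, which this sketch does not model.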

We can achieve similar results using Datadog as an external metrics provider.


Posted on 05/04/2022