Using AWS Karpenter with spot instances

3 min read | by Jordi Prats

One of the advantages of using AWS Karpenter is that it makes using spot instances straightforward. But how do we handle the termination notices coming from AWS?

AWS Karpenter is not supposed to handle the termination notices: if we want to drain the node to gracefully relocate its resources before the instance is terminated, we will have to install the AWS Node Termination Handler.

Supposing we have configured Karpenter to be able to use spot instances by setting the karpenter.sh/capacity-type requirement as follows:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: pet2cattle-workers
spec:
  ttlSecondsUntilExpired: 2592000

  ttlSecondsAfterEmpty: 30

  labels:
    nodelabel: example

  requirements:
    - key: "node.kubernetes.io/instance-type"
      operator: In
      values: ["m5a.large", "m5a.xlarge", "m5a.2xlarge"]
    - key: "topology.kubernetes.io/zone"
      operator: In
      values: ["eu-west-1a", "eu-west-1b", "eu-west-1c"]
    - key: "kubernetes.io/arch"
      operator: In
      values: ["arm64", "amd64"]
    - key: "karpenter.sh/capacity-type"
      operator: In
      values: ["spot", "on-demand"]

  provider:
    instanceProfile: 'eks_pet2cattle_worker-instance-profile'
    tags:
      Name: 'eks_pet2cattle-worker'
      exampleTag: TagValue

  limits:
    resources:
      cpu: 1000
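With karpenter.sh/capacity-type allowing both capacity types, an individual workload can opt into spot capacity with a plain nodeSelector. A minimal sketch (the Deployment name and image are placeholders, not from the original setup):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spot-friendly-app    # hypothetical workload name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spot-friendly-app
  template:
    metadata:
      labels:
        app: spot-friendly-app
    spec:
      # Karpenter will only place these Pods on spot-backed nodes
      nodeSelector:
        karpenter.sh/capacity-type: spot
      containers:
        - name: app
          image: nginx:1.21
```

Workloads without the nodeSelector can still land on either capacity type, since the Provisioner allows both.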

We can take advantage of the fact that AWS Karpenter, by default, adds the karpenter.sh/capacity-type label to the nodes, specifying whether it is a spot instance or an on-demand instance:

$ kubectl describe node
(...)
Labels:             karpenter.sh/capacity-type=spot
                    (...)
Roles:              <none>
(...)
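Since this is a regular node label, we can also use it as a selector to list which nodes are currently running on spot capacity:

```shell
kubectl get nodes -l karpenter.sh/capacity-type=spot
```

The same selector works with any other kubectl command that accepts -l, for example to cordon or describe only the spot-backed nodes.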

We can use this label to select the nodes where we want to schedule the termination handler DaemonSet. To do so, we can install the termination handler with helm using the following settings:

helm repo add eks https://aws.github.io/eks-charts
helm upgrade --install aws-node-termination-handler eks/aws-node-termination-handler \
  --namespace termination-handler \
  --set enableSpotInterruptionDraining=true \
  --set "nodeSelector.karpenter\.sh/capacity-type=spot"

If we already have the termination handler installed, we'll have to modify its values.yaml to set the following options:

enableSpotInterruptionDraining: "true"

nodeSelector:
  karpenter.sh/capacity-type: spot
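Under the hood, the termination handler polls the EC2 instance metadata service for a spot interruption notice (the /latest/meta-data/spot/instance-action endpoint) and, when one appears, cordons and drains the node. A minimal sketch of that decision logic in Python (the function name is made up for illustration; the real handler is a Go daemon):

```python
import json
from typing import Optional

# IMDS path the handler polls for spot interruption notices
SPOT_ACTION_PATH = "/latest/meta-data/spot/instance-action"

def should_drain(imds_body: Optional[str]) -> bool:
    """Given the raw IMDS response body (or None when IMDS returns 404,
    i.e. no interruption is scheduled), decide whether to drain the node."""
    if imds_body is None:
        return False  # no interruption notice pending
    notice = json.loads(imds_body)
    # AWS sends e.g. {"action": "terminate", "time": "2022-01-21T12:00:00Z"}
    return notice.get("action") in ("terminate", "stop")

# A fake interruption notice, shaped like the IMDS response
fake_notice = json.dumps({"action": "terminate", "time": "2022-01-21T12:00:00Z"})
print(should_drain(fake_notice))  # -> True: start cordon + drain
print(should_drain(None))         # -> False: nothing to do
```

The real handler adds retries, a two-minute deadline (the warning AWS gives before reclaiming a spot instance), and the actual Kubernetes cordon/drain calls.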

Having both Karpenter and the termination handler in place, we make sure the spot instance lifecycle is handled: once we receive the notification from AWS that the node is going to be terminated, the node is cordoned and drained so the Pods can, as gracefully as possible, be rescheduled on another node (or on a new one).

Posted on 21/01/2022