Dynamically provisioned PersistentVolumes using StatefulSet

kubernetes PersistentVolume Deployment StatefulSet

6 min read | by Jordi Prats

The basic idea behind a StatefulSet is to manage stateful workloads on Kubernetes: unlike a Deployment, it creates a unique identity for each Pod while still using a common spec.

With this in mind we might just copy the Pod's template from a Deployment to a StatefulSet object to make it stateful, but it's not always quite that simple.

You'll be able to find all the object definitions on the pet2cattle/kubernetes-statefullset-vs-deployment repository on GitHub.

Given the following Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy-test
spec:
  replicas: 1
  selector:
    matchLabels:
      component: deploy-test
  template:
    metadata:
      labels:
        component: deploy-test
    spec:
      volumes:
      - name: empty-dir
        emptyDir: {}
      containers:
      - name: file-generator
        image: "alpine:latest"
        command:
        - sleep
        - '24h'
        volumeMounts:
          - mountPath: /test
            name: empty-dir

Let's blindly copy its spec.template into a StatefulSet:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: sts-test
spec:
  serviceName: default
  replicas: 1
  selector:
    matchLabels:
      component: sts-test
  volumeClaimTemplates:
  - metadata:
      name: sts-test
      labels:
        component: sts-test
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
  template:
    metadata:
      labels:
        component: sts-test
    spec:
      volumes:
      - name: empty-dir
        emptyDir: {}
      containers:
      - name: file-generator
        image: "alpine:latest"
        command:
        - sleep
        - '24h'
        volumeMounts:
          - mountPath: /test
            name: empty-dir
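
Assuming we have saved these two manifests as deployment.yaml and statefulset.yaml (hypothetical file names), we can apply both of them:

$ kubectl apply -f deployment.yaml
deployment.apps/deploy-test created
$ kubectl apply -f statefulset.yaml
statefulset.apps/sts-test created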

If we deploy both objects, we will be able to see how the Deployment creates a Pod with a hash in its name, while the StatefulSet gives it a friendlier name:

$ kubectl get pods
NAME                           READY   STATUS    RESTARTS   AGE
deploy-test-57bb4d58bf-c67ck   1/1     Running   0          79s
sts-test-0                     1/1     Running   0          37s

The friendlier name by itself doesn't really mean anything: what's important to note is that with a Deployment, Pods are meant to be interchangeable (cattle), while with a StatefulSet each Pod has its own identity (pets).
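
To make that identity more tangible, as a quick aside (we keep a single replica for the rest of the article), scaling the StatefulSet up gives us new Pods with predictable, ordinal-based names instead of random hashes; the output would look roughly like this:

$ kubectl scale statefulset sts-test --replicas=3
statefulset.apps/sts-test scaled
$ kubectl get pods -l component=sts-test
NAME         READY   STATUS    RESTARTS   AGE
sts-test-0   1/1     Running   0          5m
sts-test-1   1/1     Running   0          20s
sts-test-2   1/1     Running   0          10s
$ kubectl scale statefulset sts-test --replicas=1
statefulset.apps/sts-test scaled

With the default pod management policy the Pods are created (and, on scale down, removed) one at a time, following these ordinals.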

Another important difference is that, for the StatefulSet, we will also see a PersistentVolume being created:

$ kubectl get pv | grep sts
pvc-30131c25-2c6c-4883-9f30-58793c72b442   10Gi       RWO            Delete           Bound    test/sts-test-sts-test-0                                  ebs-gp2                 68s
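
This PersistentVolume is bound to a PersistentVolumeClaim generated from the volumeClaimTemplates we copied over, named after the claim template and the Pod (<claim template name>-<pod name>). We can check it with something along these lines:

$ kubectl get pvc sts-test-sts-test-0
NAME                  STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
sts-test-sts-test-0   Bound    pvc-30131c25-2c6c-4883-9f30-58793c72b442   10Gi       RWO            ebs-gp2        68s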

Does this mean that the data on the volume is persistent? Actually, it's not.

Let's try to write some data on the volume using the Pod created by the Deployment and then delete the Pod:

$ kubectl get pods
NAME                           READY   STATUS    RESTARTS   AGE
deploy-test-57bb4d58bf-c67ck   1/1     Running   0          2m14s
$ kubectl exec -it deploy-test-57bb4d58bf-c67ck -- ls -l /test
total 0
$ kubectl exec -it deploy-test-57bb4d58bf-c67ck -- touch /test/persistence
$ kubectl exec -it deploy-test-57bb4d58bf-c67ck -- ls -l /test
total 0
-rw-r--r--    1 root     root             0 Feb 16 22:31 persistence
$ kubectl delete pod deploy-test-57bb4d58bf-c67ck
pod "deploy-test-57bb4d58bf-c67ck" deleted

As expected, if we check the volume again it will be empty:

$ kubectl get pods
NAME                           READY   STATUS    RESTARTS   AGE
deploy-test-57bb4d58bf-878x2   1/1     Running   0          57s
sts-test-0                     1/1     Running   0          4m35s
$ kubectl exec -it deploy-test-57bb4d58bf-878x2 -- ls -l /test
total 0

How is this going to be handled by the StatefulSet? Even though a PersistentVolume has been created, the data will still be wiped on each container restart. We can repeat the same test on the Pod created by the StatefulSet:

$ kubectl exec -it sts-test-0 -- ls -l /test
total 0
$ kubectl exec -it sts-test-0 -- touch /test/persistence
$ kubectl exec -it sts-test-0 -- ls -l /test
total 0
-rw-r--r--    1 root     root             0 Feb 16 22:34 persistence
$ kubectl delete pod sts-test-0
pod "sts-test-0" deleted
$ kubectl get pods
NAME                           READY   STATUS    RESTARTS   AGE
deploy-test-57bb4d58bf-878x2   1/1     Running   0          3m27s
sts-test-0                     1/1     Running   0          17s
$ kubectl exec -it sts-test-0 -- ls -l /test
total 0

The data isn't being wiped merely because we are deleting the Pod by hand: even with a rollout restart we'll get the same result:

$ kubectl exec -it sts-test-0 -- ls -l /test
total 0
$ kubectl exec -it sts-test-0 -- touch /test/persistence
$ kubectl exec -it sts-test-0 -- ls -l /test
total 0
-rw-r--r--    1 root     root             0 Feb 16 22:39 persistence
$ kubectl rollout restart sts sts-test
statefulset.apps/sts-test restarted
$ kubectl get pods
NAME                           READY   STATUS        RESTARTS   AGE
sts-test-0                     1/1     Terminating   0          4m33s
$ kubectl get pods
NAME                           READY   STATUS    RESTARTS   AGE
sts-test-0                     1/1     Running   0          34s
$ kubectl exec -it sts-test-0 -- ls -l /test
total 0

The volume is wiped because we are still using an emptyDir: every time the Pod is deleted, restarted or rescheduled, the data in the emptyDir is permanently lost:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: sts-test
spec:
(...)
  template:
    metadata:
      labels:
        component: sts-test
    spec:
      volumes:
      - name: empty-dir
        emptyDir: {}
      containers:
      - name: file-generator
        image: "alpine:latest"
        command:
        - sleep
        - '24h'
        volumeMounts:
          - mountPath: /test
            name: empty-dir
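
We can double-check that /test is backed by the emptyDir and not by the PersistentVolumeClaim by looking at which volume the container is actually mounting, for example:

$ kubectl get pod sts-test-0 -o jsonpath='{.spec.containers[0].volumeMounts[0].name}'
empty-dir

The PersistentVolumeClaim generated from the volumeClaimTemplates is created and bound, but nothing ever writes to it: the data under /test lives in the emptyDir, whose lifetime is tied to the Pod.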

If we want data to be persistent across restarts, what we really want is a dynamically provisioned PersistentVolume (one for each replica). To accomplish this we can use volumeClaimTemplates: we only need to make sure its name matches the name of the volume we are mounting with volumeMounts (there's no need to declare the volume under spec.template.spec.volumes):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: sts-vt-test
spec:
  serviceName: default
  replicas: 1
  selector:
    matchLabels:
      component: sts-vt-test
  volumeClaimTemplates:
  - metadata:
      name: sts-vt-test
      labels:
        component: sts-vt-test
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
  template:
    metadata:
      labels:
        component: sts-vt-test
    spec:
      containers:
      - name: file-generator
        image: "alpine:latest"
        command:
        - sleep
        - '24h'
        volumeMounts:
          - mountPath: /test
            name: sts-vt-test

Once this new StatefulSet object is deployed, we can create a test file and then delete the Pod as we previously did, but this time the file will still be there:

$ kubectl apply -f sts-volume-template.yaml
statefulset.apps/sts-vt-test created
$ kubectl get pv | grep sts
pvc-1bde765e-cd2b-4de9-a1f5-92095dccc11a   10Gi       RWO            Delete           Bound    test/sts-vt-test-sts-vt-test-0                            ebs-gp2                 12s
$ kubectl exec -it sts-vt-test-0 -- ls -l /test
total 16
drwx------    2 root     root         16384 Feb 16 23:54 lost+found
$ kubectl exec -it sts-vt-test-0 -- touch /test/persistence
$ kubectl exec -it sts-vt-test-0 -- ls -l /test
total 16
drwx------    2 root     root         16384 Feb 16 23:54 lost+found
-rw-r--r--    1 root     root             0 Feb 16 23:55 persistence
$ kubectl delete pod sts-vt-test-0
pod "sts-vt-test-0" deleted
$ kubectl exec -it sts-vt-test-0 -- ls -l /test
total 16
drwx------    2 root     root         16384 Feb 16 23:54 lost+found
-rw-r--r--    1 root     root             0 Feb 16 23:55 persistence
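
In fact, the data even survives deleting the StatefulSet itself: PersistentVolumeClaims created from volumeClaimTemplates are not removed when the StatefulSet is deleted, so a recreated StatefulSet will reattach the same claim. To actually get rid of the data we need to delete the PVC explicitly (with the PersistentVolume's reclaim policy set to Delete, this also removes the underlying volume); roughly:

$ kubectl delete sts sts-vt-test
statefulset.apps "sts-vt-test" deleted
$ kubectl get pvc sts-vt-test-sts-vt-test-0
NAME                        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
sts-vt-test-sts-vt-test-0   Bound    pvc-1bde765e-cd2b-4de9-a1f5-92095dccc11a   10Gi       RWO            ebs-gp2        10m
$ kubectl delete pvc sts-vt-test-sts-vt-test-0
persistentvolumeclaim "sts-vt-test-sts-vt-test-0" deleted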

Posted on 21/02/2022