Kubernetes: Enforcing policies using the OPA gatekeeper

Kubernetes Policy enforcement OPA gatekeeper

6 min read | by Jordi Prats

Whether we call them best practices or policies, most organizations have some rules about how their applications should run, for example: do not use the latest tag. Others might even be required to meet certain compliance requirements to reach some security standard, for example: do not use NodePort services.

To be able to enforce these policies we can use a policy engine like OPA.

OPA gatekeeper is a validating webhook that enforces CRD-based policies executed by the Open Policy Agent.

To implement these policies as CRDs it uses two objects:

  • A ConstraintTemplate that defines the rule
  • The actual instance of the defined template (the constraint), which lets us control, using parameters, when and how the rule is applied

By default, the validating webhook fails open: if it returns an error (because it is down or unreachable) the request is admitted anyway. During this time constraints will not be enforced, but the audit process is expected to highlight any invalid resources that made it into the cluster.

We can also set it to fail closed by changing the failurePolicy of the ValidatingWebhookConfiguration object to Fail (with the helm chart it is controlled using the validatingWebhookFailurePolicy option). Even though you can always edit or delete a ValidatingWebhookConfiguration, since these operations are not subjected to admission webhooks, failing closed can cause circular dependencies where the actions needed to fix the webhook depend on some other action that fails because the webhook is failing: this needs to be carefully considered.
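As a quick sketch, once the chart is installed (see the next section) we could switch to fail closed through the validatingWebhookFailurePolicy value and then double-check the resulting failurePolicy on the webhook; the object name gatekeeper-validating-webhook-configuration is the usual default, but it may differ in your setup:

$ helm upgrade gatekeeper gatekeeper/gatekeeper --namespace gatekeeper-system \
    --set validatingWebhookFailurePolicy=Fail
$ kubectl get validatingwebhookconfiguration gatekeeper-validating-webhook-configuration \
    -o jsonpath='{.webhooks[*].failurePolicy}'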

OPA gatekeeper installation

To install OPA gatekeeper we can either apply the manifests or go for a helm based install, which couldn't be more straightforward:

helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts
helm install gatekeeper/gatekeeper --name-template=gatekeeper --namespace gatekeeper-system --create-namespace
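
Once the chart is deployed we can do a quick sanity check, for example making sure the ConstraintTemplate CRD got registered (a minimal check, assuming the default CRD name):

$ kubectl get crd constrainttemplates.templates.gatekeeper.sh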

Policy configuration

In order to start applying our rules we can either write them from scratch or look at the gatekeeper-library for rules that somebody has already implemented.

The gatekeeper-library provides a way of installing them all using kustomize as follows:

$ kustomize build github.com/open-policy-agent/gatekeeper-library/library | kubectl apply --filename -

But we can also hand-pick which ConstraintTemplates we want to install. For example, to prevent the usage of NodePort services we will have to install the following definition:

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sblocknodeport
  annotations:
    description: >-
      Disallows all Services with type NodePort.
      https://kubernetes.io/docs/concepts/services-networking/service/#nodeport
spec:
  crd:
    spec:
      names:
        kind: K8sBlockNodePort
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sblocknodeport
        violation[{"msg": msg}] {
          input.review.kind.kind == "Service"
          input.review.object.spec.type == "NodePort"
          msg := "User is not allowed to create service of type NodePort"
        }
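
A minimal way of installing it, assuming we save the definition above as k8sblocknodeport-template.yaml (a file name chosen just for this example), and then checking that gatekeeper picked it up:

$ kubectl apply -f k8sblocknodeport-template.yaml
$ kubectl get constrainttemplates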

Once this ConstraintTemplate is installed, we will be able to create instances of this rule by creating K8sBlockNodePort objects as follows:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sBlockNodePort
metadata:
  name: block-node-port
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Service"]    

With this in place we won't be able to create the following object:

apiVersion: v1
kind: Service
metadata:
  name: demo-nodeport
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
    name: http
  selector:
    app: pet2cattle

If we try, we will get the following error message:

$ kubectl apply -f nodeportdemo.yaml 
Error from server ([block-node-port] User is not allowed to create service of type NodePort): error when creating "nodeportdemo.yaml": admission webhook "validation.gatekeeper.sh" denied the request: [block-node-port] User is not allowed to create service of type NodePort

We can use these constraint objects to fine-tune what we want to block. For example, we can apply this restriction only to a specific namespace as follows:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sBlockNodePort
metadata:
  name: block-node-port
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Service"]
    namespaces:
      - "test"

With this configuration the constraint only applies to the namespaces included in the list, so we will still be able to create NodePort services anywhere else:

$ kubectl apply -f demonodeport.yaml -n test
Error from server ([block-node-port] User is not allowed to create service of type NodePort): error when creating "demonodeport.yaml": admission webhook "validation.gatekeeper.sh" denied the request: [block-node-port] User is not allowed to create service of type NodePort
$ kubectl apply -f demonodeport.yaml -n trial
service/demo-nodeport created

Audit existing resources

The OPA gatekeeper installation also includes a Pod that performs periodic evaluations of existing resources against the constraints, detecting pre-existing misconfigurations or any object that somehow managed to get past the validating webhook.

We have three main ways of checking the audit results:

  • Prometheus Metrics: provides an aggregated look at the number of audit violations (see the sketch after this list)
  • Constraint Status: Violations are listed in the status field of the corresponding constraint
  • Audit Logs: At each run, the audit pod emits JSON-formatted logs with information about the violations
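
As a sketch of the first option: the audit Pod exposes a Prometheus endpoint, by default on port 8888, with violation counters such as gatekeeper_violations; both the port and the metric names depend on the chart values and the gatekeeper version, so treat them as assumptions to verify against your installation:

$ kubectl port-forward -n gatekeeper-system deployment/gatekeeper-audit 8888:8888 &
$ curl -s http://localhost:8888/metrics | grep gatekeeper_violations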

For example, if we had an existing NodePort service before applying the K8sBlockNodePort constraint, we will be able to see the violations on its status:

$ kubectl describe K8sBlockNodePort block-node-port
Name:         block-node-port
Namespace:    
Labels:       <none>
Annotations:  helm.sh/hook: post-install,post-upgrade
API Version:  constraints.gatekeeper.sh/v1beta1
Kind:         K8sBlockNodePort
(...)
Spec:
  Match:
    Kinds:
      API Groups:

      Kinds:
        Service
Status:
  Audit Timestamp:  2022-03-28T21:34:21Z
  By Pod:
    Constraint UID:       3640d4af-dead-beef-a123-26bc96100f3e
    Enforced:             true
    Id:                   gatekeeper-audit-b8af6ac78c-pf6pr
    Observed Generation:  3
    Operations:
      audit
      mutation-status
      status
    Constraint UID:       3640d4af-dead-beef-a123-26bc96100f3e
    Enforced:             true
    Id:                   gatekeeper-controller-manager-b8af6ac78c-9btw4
    Observed Generation:  3
    Operations:
      mutation-webhook
      webhook
    Constraint UID:       3640d4af-dead-beef-a123-26bc96100f3e
    Enforced:             true
    Id:                   gatekeeper-controller-manager-b8af6ac78c-w9dw7
    Observed Generation:  3
    Operations:
      mutation-webhook
      webhook
    Constraint UID:       3640d4af-dead-beef-a123-26bc96100f3e
    Enforced:             true
    Id:                   gatekeeper-controller-manager-b8af6ac78c-qq847
    Observed Generation:  3
    Operations:
      mutation-webhook
      webhook
  Total Violations:  1
  Violations:
    Enforcement Action:  deny
    Kind:                Service
    Message:             User is not allowed to create service of type NodePort
    Name:                demo-nodeport
    Namespace:           trial
Events:                  <none>

And on the audit Pod's logs as well:

$ kubectl get pods -n gatekeeper-system
NAME                                             READY   STATUS    RESTARTS   AGE
gatekeeper-audit-b8af6ac78c-pf6pr                1/1     Running   0          43h
gatekeeper-controller-manager-b8af6ac78c-9btw4   1/1     Running   0          43h
gatekeeper-controller-manager-b8af6ac78c-w9dw7   1/1     Running   0          43h
gatekeeper-controller-manager-b8af6ac78c-qq847   1/1     Running   0          9h
$ kubectl logs gatekeeper-audit-b8af6ac78c-pf6pr -n gatekeeper-system
(...)
{"level":"info","ts":1648496316.0117562,"logger":"controller","msg":"starting update constraints loop","process":"audit","audit_id":"2022-03-28T21:38:36Z","constraints to update":"map[{K8sBlockNodePort constraints.gatekeeper.sh/v1beta1 block-node-port}:{map[apiVersion:constraints.gatekeeper.sh/v1beta1 kind:K8sBlockNodePort metadata:map[annotations:map[helm.sh/hook:post-install,post-upgrade] creationTimestamp:2022-03-23T22:11:42Z generation:3 managedFields:[map[apiVersion:constraints.gatekeeper.sh/v1beta1 fieldsType:FieldsV1 fieldsV1:map[f:status:map[]] manager:gatekeeper operation:Update time:2022-03-23T22:11:42Z] map[apiVersion:constraints.gatekeeper.sh/v1beta1 fieldsType:FieldsV1 fieldsV1:map[f:metadata:map[f:annotations:map[.:map[] f:helm.sh/hook:map[]]] f:spec:map[.:map[] f:match:map[.:map[] f:kinds:map[]]]] manager:terraform-provider-helm_v2.4.1_x5 operation:Update time:2022-03-23T22:11:42Z]] name:block-node-port resourceVersion:166046384 uid:0364a4df-d8e9-44fe-b131-0e602961f3bc] spec:map[match:map[kinds:[map[apiGroups:[] kinds:[Service]]]]] status:map[auditTimestamp:2022-03-28T13:32:21Z byPod:[map[constraintUID:0364a4df-d8e9-44fe-b131-0e602961f3bc enforced:true id:gatekeeper-audit-6c78cb88bf-9btw4 observedGeneration:3 operations:[audit mutation-status status]] map[constraintUID:0364a4df-d8e9-44fe-b131-0e602961f3bc enforced:true id:gatekeeper-controller-manager-b8af6ac78c-w9dw7 observedGeneration:3 operations:[mutation-webhook webhook]] map[constraintUID:0364a4df-d8e9-44fe-b131-0e602961f3bc enforced:true id:gatekeeper-controller-manager-b8af6ac78c-qq847 observedGeneration:3 operations:[mutation-webhook webhook]] map[constraintUID:0364a4df-d8e9-44fe-b131-0e602961f3bc enforced:true id:gatekeeper-controller-manager-b8af6ac78c-nr6pn observedGeneration:3 operations:[mutation-webhook webhook]]] totalViolations:1 violations:[map[enforcementAction:deny kind:Service message:User is not allowed to create service of type NodePort name:demo-nodeport namespace:trial]]]]}]"}

Posted on 29/03/2022