How to debug a Crossplane provider

crossplane kubernetes troubleshooting debug

4 min read | by Jordi Prats

Setting up a Crossplane provider requires several pieces to be aligned, for example when configuring the AWS provider to use an IAM Role for ServiceAccount. If something is misaligned, creating resources might fail with an error that doesn't make the actual cause clear:

$ kubectl describe bucket.s3.aws.crossplane.io/test-bucket
Name:         test-bucket
Namespace:    
Labels:       <none>
Annotations:  crossplane.io/external-name: pet2cattle-demo
API Version:  s3.aws.crossplane.io/v1beta1
Kind:         Bucket
Metadata:
(...)
Spec:
(...)
  Provider Config Ref:
    Name:  aws-provider
Status:
  At Provider:
    Arn:  
  Conditions:
    Last Transition Time:  2022-02-22T21:43:23Z
    Message:               observe failed: failed to query Bucket: api error MovedPermanently: Moved Permanently
    Reason:                ReconcileError
    Status:                False
    Type:                  Synced
Events:
  Type     Reason                         Age               From                                 Message
  ----     ------                         ----              ----                                 -------
  Warning  CannotObserveExternalResource  7s (x6 over 36s)  managed/bucket.s3.aws.crossplane.io  failed to query Bucket: api error MovedPermanently: Moved Permanently

Assuming we have the following configuration for the AWS provider:

apiVersion: pkg.crossplane.io/v1alpha1
kind: ControllerConfig
metadata:
  name: aws-config
spec:
  podSecurityContext:
    fsGroup: 2000
---
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-aws
spec:
  package: crossplane/provider-aws:v0.24.1
  controllerConfigRef:
    name: aws-config
---
apiVersion: aws.crossplane.io/v1beta1
kind: ProviderConfig
metadata:
  name: aws-provider
spec:
  credentials:
    source: InjectedIdentity

We can open a shell on the AWS provider pod to check which command-line flags the provider binary accepts:

$ kubectl get pods
NAME                                       READY   STATUS    RESTARTS   AGE
crossplane-c7b774b95-t8t6n                 1/1     Running   0          6h31m
crossplane-rbac-manager-6cff7fbf67-nz7bb   1/1     Running   0          6h31m
provider-aws-f78664a342f1-575cccfd-g629p   1/1     Running   0          8m21s
$ kubectl exec -it provider-aws-f78664a342f1-575cccfd-g629p -- sh
/ $ ps
PID   USER     TIME  COMMAND
    1 2000      1:14 crossplane-aws-provider
   17 2000      0:00 sh
   24 2000      0:00 ps
/ $ crossplane-aws-provider --help
usage: crossplane-aws-provider [<flags>]

AWS support for Crossplane.

Flags:
      --help             Show context-sensitive help (also try --help-long and --help-man).
  -d, --debug            Run with debug logging.
  -s, --sync=1h          Sync interval controls how often all resources will be double checked for drift.
      --poll=1m          Poll interval controls how often an individual resource should be checked for drift.
  -l, --leader-election  Use leader election for the conroller manager.

/ $ 

Since the provider's deployment is managed by the Crossplane controller, we can't modify it with kubectl edit or kubectl patch. Instead, we can use the ControllerConfig: its spec.args field lets us add the --debug flag:

apiVersion: pkg.crossplane.io/v1alpha1
kind: ControllerConfig
metadata:
  name: aws-config
spec:
  podSecurityContext:
    fsGroup: 2000
  args:
  - '--debug'
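
Once the provider pod has been recreated, we can confirm that the flag reached the container by reading the arguments back from the pod spec. This is a minimal sketch: show_provider_args is a hypothetical helper name (not part of kubectl or Crossplane), and it assumes the provider pod lives in the current namespace and that the provider is the first container in the pod.

```shell
# show_provider_args POD
# Print the args configured for the first container of the given pod.
# Assumption: the provider is the first (and usually only) container.
show_provider_args() {
  kubectl get pod "$1" -o jsonpath='{.spec.containers[0].args}'
}
```

For example, show_provider_args provider-aws-f78664a342f1-68999bf77b-xg7pt should list --debug if the ControllerConfig has been picked up.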

If we check the container's logs, we'll see more output, but it still might not be enough to understand the problem:

$ kubectl logs provider-aws-f78664a342f1-68999bf77b-xg7pt -f
(...)
2022-02-22T22:26:19.457Z  DEBUG provider-aws  Reconciling {"controller": "managed/bucket.s3.aws.crossplane.io", "request": "/test-bucket"}
2022-02-22T22:26:19.565Z  DEBUG provider-aws  Cannot observe external resource  {"controller": "managed/bucket.s3.aws.crossplane.io", "request": "/test-bucket", "uid": "2e5c0379-ce37-4546-89f1-16a833808ecc", "version": "145001855", "external-name": "pet2cattle-demo", "error": "failed to query Bucket: api error BadRequest: Bad Request", "errorVerbose": "api error BadRequest: Bad Request\nfailed to query Bucket\ngithub.com/crossplane/provider-aws/pkg/clients.Wrap\n\t/home/runner/work/provider-aws/provider-aws/pkg/clients/aws.go:976\ngithub.com/crossplane/provider-aws/pkg/controller/s3.(*external).Observe\n\t/home/runner/work/provider-aws/provider-aws/pkg/controller/s3/bucket.go:108\ngithub.com/crossplane/crossplane-runtime/pkg/reconciler/managed.(*Reconciler).Reconcile\n\t/home/runner/work/provider-aws/provider-aws/vendor/github.com/crossplane/crossplane-runtime/pkg/reconciler/managed/reconciler.go:681\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/runner/work/provider-aws/provider-aws/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/runner/work/provider-aws/provider-aws/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/work/provider-aws/provider-aws/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214\nruntime.goexit\n\t/opt/hostedtoolcache/go/1.17.6/x64/src/runtime/asm_amd64.s:1581"}
2022-02-22T22:26:19.568Z  DEBUG controller-runtime.manager.events Warning {"object": {"kind":"Bucket","name":"test-bucket","uid":"2e5c0379-ce37-4546-89f1-16a833808ecc","apiVersion":"s3.aws.crossplane.io/v1beta1","resourceVersion":"145001855"}, "reason": "CannotObserveExternalResource", "message": "failed to query Bucket: api error BadRequest: Bad Request"}
(...)
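
The debug lines are long, so it can help to reduce them to the error messages alone. The following is a rough sketch, assuming the log format shown above, where each line ends with a JSON object that contains an "error" field. extract_errors is a hypothetical helper name, and the simple pattern would cut off a message that itself contains an escaped double quote.

```shell
# extract_errors
# Read provider log lines on stdin and print the distinct values of the
# "error" field, one per line. The "errorVerbose" field is not matched
# because the pattern requires the key to be exactly "error".
extract_errors() {
  grep -o '"error": "[^"]*"' \
    | sed -e 's/^"error": "//' -e 's/"$//' \
    | sort -u
}
```

It can be fed from the logs, for example: kubectl logs provider-aws-f78664a342f1-68999bf77b-xg7pt | extract_errors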

We can also set spec.securityContext.runAsUser to 0 so that we can install the AWS CLI in the container and get the actual error. That would look like this:

apiVersion: pkg.crossplane.io/v1alpha1
kind: ControllerConfig
metadata:
  name: aws-config
spec:
  podSecurityContext:
    fsGroup: 2000
  args:
  - '--debug'
  securityContext:
    runAsUser: 0

Once the provider's pod has been recreated, we can spawn a shell on it and install awscli as follows:

$ kubectl exec -it provider-aws-f78664a342f1-5bb5d9f9d8-ch2kv -- sh 
/ # apk --no-cache add python3 py3-pip; pip3 install --upgrade pip; pip3 install --no-cache-dir awscli; aws s3 ls
(...)
An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity

This error should point us in the right direction to fix whatever is causing it.
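
Since the failing call here is AssumeRoleWithWebIdentity, one possible next step is to check which role the pod is actually trying to assume. On EKS, pods whose ServiceAccount carries an IAM role annotation get the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables injected. The sketch below prints them. irsa_env is a hypothetical helper name, and it assumes that the cluster uses IAM Roles for ServiceAccounts and that the provider image ships an env binary.

```shell
# irsa_env POD
# Print the IRSA-related environment variables of the given pod.
# Empty output suggests that no role was injected into the pod.
irsa_env() {
  kubectl exec "$1" -- env \
    | grep -E '^AWS_(ROLE_ARN|WEB_IDENTITY_TOKEN_FILE)='
}
```

If the role ARN is present but the call is still denied, the role's trust policy is a likely place to look next.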


Posted on 23/02/2022