Getting started with Argo Workflows


6 min read | by Jordi Prats

Argo Workflows is an open-source, container-native workflow engine designed to run containerized jobs in Kubernetes clusters, similar to Tekton.

Workflows are defined either as DAGs (Directed Acyclic Graphs) or as a sequence of steps, and each step runs in its own container.

Installing Argo Workflows

First, we need to choose the version we want to install. We can do so by setting the environment variable ARGO_WORKFLOWS_VERSION to the desired version:

export ARGO_WORKFLOWS_VERSION="v3.5.11"

Then we can install it to the argo namespace using the following commands:

kubectl create namespace argo
kubectl apply -n argo -f "https://github.com/argoproj/argo-workflows/releases/download/${ARGO_WORKFLOWS_VERSION}/quick-start-minimal.yaml"

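We can now wait for the pods to be up and running. Keep an eye on the argo-server pod, which in this run was the last one to become ready. The kubectl commands from here on omit -n argo, so they assume the namespace of the current kubectl context is argo; otherwise, add -n argo to each of them: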

$ kubectl get pods
NAME                                   READY   STATUS    RESTARTS   AGE
argo-server-76c65cd446-2lc82           0/1     Running   0          30s
httpbin-7b48b49985-fg7mb               1/1     Running   0          30s
minio-68dc5544c4-7lbsl                 1/1     Running   0          30s
workflow-controller-68d7b854cc-6spmj   1/1     Running   0          30s
$ kubectl get pods
NAME                                   READY   STATUS    RESTARTS   AGE
argo-server-76c65cd446-2lc82           1/1     Running   0          96s
httpbin-7b48b49985-fg7mb               1/1     Running   0          96s
minio-68dc5544c4-7lbsl                 1/1     Running   0          96s
workflow-controller-68d7b854cc-6spmj   1/1     Running   0          96s
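
Instead of re-running kubectl get pods by hand, the wait can be scripted. The following is a minimal sketch, assuming the deployments are named argo-server and workflow-controller (as the pod names above suggest) and that they live in the argo namespace. The wait_for_argo function name and its default timeout are hypothetical:

```shell
# Block until the quick-start deployments report a successful rollout.
# Assumes everything was installed into the "argo" namespace as above.
wait_for_argo() {
  local timeout="${1:-180s}"   # hypothetical default; adjust to your cluster
  local deploy
  for deploy in workflow-controller argo-server; do
    # "rollout status" returns non-zero if the timeout is reached first.
    kubectl -n argo rollout status "deployment/${deploy}" --timeout="${timeout}" || return 1
  done
}
```

Calling wait_for_argo (or wait_for_argo 60s for a shorter timeout) returns once both deployments are ready, which makes it easy to chain with the next commands.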

Argo Workflows GUI

We can access the Argo Workflows GUI by port-forwarding the argo-server service:

kubectl -n argo port-forward service/argo-server 2746:2746

With the port-forward running, the GUI is available on localhost, port 2746. Make sure to explicitly set the protocol to https: navigate to https://localhost:2746 in your browser.
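
To check from the command line that the port-forward works, we can request the same URL with curl. This is only a sketch: it assumes the server presents a self-signed certificate (hence the option to skip verification), and the argo_ui_status function name is hypothetical:

```shell
# Print the HTTP status code returned by the port-forwarded argo-server.
# --insecure skips certificate verification (assumption: self-signed cert).
argo_ui_status() {
  local url="${1:-https://localhost:2746/}"
  curl --insecure --silent --output /dev/null --write-out '%{http_code}' "${url}"
}
```

Running argo_ui_status while the port-forward is active prints the status code of the response; if nothing is listening on the port, curl reports 000 and exits with an error.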

Hello World workflow

We are going to create a simple workflow that runs a single container, which prints "hello world" using the docker/whalesay image. This is the YAML definition of the workflow:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: whalesay-
spec:
  entrypoint: whalesay
  templates:
    - name: whalesay
      container:
        image: docker/whalesay:latest
        command: [cowsay]
        args: ["hello world"]

Bear in mind that, since we want to be able to run the workflow multiple times, the manifest sets generateName instead of name in the metadata: Kubernetes appends a random suffix to that prefix, so every run gets a unique name. This also means that we need to use kubectl create instead of kubectl apply to create the workflow, because kubectl apply does not work with generateName:

$ kubectl create -f hello-world.yaml ; kubectl get workflow -w
workflow.argoproj.io/whalesay-gsmmr created
NAME             STATUS      AGE   MESSAGE
whalesay-gsmmr   Running     1s
whalesay-gsmmr   Succeeded   10s
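
Since the name is generated, a script that submits the workflow needs to capture it before it can check on the run. The sketch below does so with the -o name output of kubectl create and then polls the status.phase field of the workflow. The function names and the two-second polling interval are hypothetical choices, and the argo namespace is assumed:

```shell
# Create a workflow from a manifest and print only its generated name.
submit_workflow() {
  # "-o name" makes kubectl print "workflow.argoproj.io/<name>"; strip the prefix.
  kubectl -n argo create -f "$1" -o name | sed 's|^.*/||'
}

# Poll .status.phase until the workflow reaches a final phase.
wait_for_workflow() {
  local name="$1" phase=""
  while :; do
    phase="$(kubectl -n argo get workflow "${name}" -o jsonpath='{.status.phase}')" || return 1
    case "${phase}" in
      Succeeded)    echo "${phase}"; return 0 ;;
      Failed|Error) echo "${phase}"; return 1 ;;
    esac
    sleep 2   # still pending or running: try again shortly
  done
}
```

A typical use would be name="$(submit_workflow hello-world.yaml)" followed by wait_for_workflow "${name}", which prints the final phase and returns a non-zero exit code if the workflow did not succeed.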

As soon as it finishes, we can check the logs of the pod to see the output:

$ kubectl get pods
NAME                                   READY   STATUS      RESTARTS   AGE
argo-server-76c65cd446-2lc82           1/1     Running     0          49m
httpbin-7b48b49985-fg7mb               1/1     Running     0          49m
minio-68dc5544c4-7lbsl                 1/1     Running     0          49m
whalesay-gsmmr                         0/2     Completed   0          28s
workflow-controller-68d7b854cc-6spmj   1/1     Running     0          49m
$ kubectl logs whalesay-gsmmr
time="2024-10-21T04:04:24.975Z" level=info msg="capturing logs" argo=true
 _____________
< hello world >
 -------------
    \
     \
      \
                    ##        .
              ## ## ##       ==
           ## ## ## ##      ===
       /""""""""""""""""___/ ===
  ~~~ {~~ ~~~~ ~~~ ~~~~ ~~ ~ /  ===- ~~~
       \______ o          __/
        \    \        __/
          \____\______/
time="2024-10-21T04:04:25.976Z" level=info msg="sub-process exited" argo=true error="<nil>"
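
For a single-container workflow like this one, the pod is named after the workflow, but that is not the case once a workflow has several steps. A way around looking up pod names is to select the pods by label. The sketch below assumes that Argo labels each pod with workflows.argoproj.io/workflow set to the workflow name and that the user container is called main; the workflow_logs function name is hypothetical:

```shell
# Print the main-container logs of every pod that belongs to a workflow.
workflow_logs() {
  local name="$1"
  # --tail=-1 is needed because kubectl only shows the last lines of each
  # pod by default when a selector is used.
  kubectl -n argo logs \
    --selector "workflows.argoproj.io/workflow=${name}" \
    --container main \
    --tail=-1
}
```

For the run above, workflow_logs whalesay-gsmmr should print the same whale as kubectl logs whalesay-gsmmr.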

Multi-step workflow

We are now going to create a multi-step workflow that generates a random number and creates a ConfigMap with that number.

The step that creates the ConfigMap runs kubectl inside a workflow pod, and the pods of this workflow run as the default service account of the argo namespace. For this reason, we first need to grant that service account additional permissions. We can do so by creating the following Role and RoleBinding:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: argo
  name: configmap-manager
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create", "get", "list", "watch", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: configmap-manager-binding
  namespace: argo
subjects:
  - kind: ServiceAccount
    name: default
    namespace: argo
roleRef:
  kind: Role
  name: configmap-manager
  apiGroup: rbac.authorization.k8s.io

We can apply the above YAML file to create both objects:

$ kubectl apply -f sa-cm.yaml
role.rbac.authorization.k8s.io/configmap-manager created
rolebinding.rbac.authorization.k8s.io/configmap-manager-binding created
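
Before running the workflow, we can verify that the binding has the intended effect by asking the API server on behalf of the service account. This is a small sketch using kubectl auth can-i with impersonation, so it assumes our own user is allowed to impersonate service accounts (a cluster admin usually is). The can_default_sa function name is hypothetical:

```shell
# Ask the API server whether the default service account in "argo" may
# perform the given verb on configmaps; kubectl prints "yes" or "no".
can_default_sa() {
  local verb="$1"
  kubectl -n argo auth can-i "${verb}" configmaps \
    --as="system:serviceaccount:argo:default"
}
```

With the Role and RoleBinding in place, can_default_sa create is expected to answer yes, while a verb that is not in the Role, such as patch, is expected to answer no.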

Once we have this in place, we can create the workflow. Notice how the generate-and-create entrypoint runs two steps in sequence: generate-random-number (which uses the generate-random template) and create-configmap. The first step runs a Python script that prints a random integer between 1 and 100, both included. Argo captures the standard output of the script as its outputs.result, and the second step receives that value as the random-number parameter and creates a ConfigMap with it:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: random-configmap-
spec:
  entrypoint: generate-and-create
  templates:
    - name: generate-and-create
      steps:
        - - name: generate-random-number
            template: generate-random
        - - name: create-configmap
            template: create-configmap
            arguments:
              parameters:
                - name: random-number
                  value: "{{steps.generate-random-number.outputs.result}}"

    - name: generate-random
      script:
        image: python:3.9
        command: [python]
        source: |
          import random
          random_number = random.randint(1, 100)
          print(random_number)
      outputs:
        result: "{{outputs.result}}"

    - name: create-configmap
      inputs:
        parameters:
          - name: random-number
      container:
        image: bitnami/kubectl:latest
        command: [sh, -c]
        args:
          - |
            kubectl create configmap random-configmap \
              --from-literal=random-number={{inputs.parameters.random-number}} \
              -n argo

We can now create the workflow using the following command:

$ kubectl create -f random-configmap.yaml ; kubectl get workflow -w
workflow.argoproj.io/random-configmap-kg8xk created
NAME                     STATUS      AGE   MESSAGE
whalesay-gsmmr           Succeeded   20m
random-configmap-kg8xk   Running     0s
random-configmap-kg8xk   Running     10s
random-configmap-kg8xk   Succeeded   20s

Once it finishes, we can check the configmap to see the random number that was generated:

$ kubectl get cm random-configmap
NAME               DATA   AGE
random-configmap   1      40s
$ kubectl get cm random-configmap -o yaml
apiVersion: v1
data:
  random-number: "34"
kind: ConfigMap
metadata:
  creationTimestamp: "2024-10-21T04:24:36Z"
  name: random-configmap
  namespace: argo
  resourceVersion: "424750"
  uid: 6e00ce29-b3b9-4c43-a3e2-33f12106d743
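
If we only care about the stored value, for example to use it in a script, a jsonpath expression avoids parsing the whole YAML output. This is a minimal sketch with a hypothetical configmap_value helper; it assumes the argo namespace and a key without dots, since dots would need escaping in the expression:

```shell
# Print the value stored under a single key of a ConfigMap.
configmap_value() {
  local name="$1" key="$2"
  kubectl -n argo get configmap "${name}" -o "jsonpath={.data.${key}}"
}
```

For the run shown above, configmap_value random-configmap random-number would print the number that the first step generated.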

Checking the pods, we can see how it ran the two steps in different Pods:

$ kubectl get pods
NAME                                                 READY   STATUS      RESTARTS   AGE
argo-server-76c65cd446-2lc82                         1/1     Running     0          71m
httpbin-7b48b49985-fg7mb                             1/1     Running     0          71m
minio-68dc5544c4-7lbsl                               1/1     Running     0          71m
random-configmap-kg8xk-create-configmap-2480388875   0/2     Completed   0          2m39s
random-configmap-kg8xk-generate-random-1513880492    0/2     Completed   0          2m49s
whalesay-gsmmr                                       0/2     Completed   0          22m
workflow-controller-68d7b854cc-6spmj                 1/1     Running     0          71m

In the status field of the workflow (for example, as returned by kubectl get workflow random-configmap-kg8xk -o yaml), we can see the inputs and outputs of the steps that were run:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  annotations:
    workflows.argoproj.io/pod-name-format: v2
  creationTimestamp: "2024-10-21T04:24:23Z"
  generateName: random-configmap-
  generation: 4
  labels:
    workflows.argoproj.io/completed: "true"
    workflows.argoproj.io/phase: Succeeded
  name: random-configmap-kg8xk
  namespace: argo
  resourceVersion: "424770"
  uid: 7488c865-e03d-449a-bef0-d9d27669c529
spec:
  (...)
status:
  (...)
  finishedAt: "2024-10-21T04:24:43Z"
  nodes:
    random-configmap-kg8xk:
      (...)
    random-configmap-kg8xk-1513880492:
      (...)
      outputs:
        artifacts:
        - name: main-logs
          s3:
            key: random-configmap-kg8xk/random-configmap-kg8xk-generate-random-1513880492/main.log
        exitCode: "0"
        result: "34"
      (...)
    random-configmap-kg8xk-2480388875:
      (...)
      inputs:
        parameters:
        - name: random-number
          value: "34"
      (...)
    random-configmap-kg8xk-3320070688:
      (...)
    random-configmap-kg8xk-3387034069:
      (...)
  phase: Succeeded
  progress: 2/2
  resourcesDuration:
    cpu: 0
    memory: 4
  startedAt: "2024-10-21T04:24:23Z"
  taskResultsCompletionStatus:
    random-configmap-kg8xk-1513880492: true
    random-configmap-kg8xk-2480388875: true
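
The workflow above uses the step-based syntax. As a sketch of the DAG alternative mentioned at the beginning, the same two templates can be wired together with a dag entrypoint, where the ordering is expressed through dependencies and the output is referenced through tasks instead of steps. The file name, the generateName prefix and the random-configmap-dag ConfigMap name are hypothetical; a different ConfigMap name is used so that it does not collide with the one created earlier:

```shell
# Write a DAG-based variant of the workflow to a new file.
# The quoted 'EOF' keeps the shell from expanding anything inside the manifest.
cat > random-configmap-dag.yaml <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: random-configmap-dag-
spec:
  entrypoint: generate-and-create
  templates:
    - name: generate-and-create
      dag:
        tasks:
          - name: generate-random-number
            template: generate-random
          - name: create-configmap
            template: create-configmap
            # Run only after the first task has finished.
            dependencies: [generate-random-number]
            arguments:
              parameters:
                - name: random-number
                  value: "{{tasks.generate-random-number.outputs.result}}"

    - name: generate-random
      script:
        image: python:3.9
        command: [python]
        source: |
          import random
          print(random.randint(1, 100))

    - name: create-configmap
      inputs:
        parameters:
          - name: random-number
      container:
        image: bitnami/kubectl:latest
        command: [sh, -c]
        args:
          - |
            kubectl create configmap random-configmap-dag \
              --from-literal=random-number={{inputs.parameters.random-number}} \
              -n argo
EOF
```

The resulting file can be submitted like the previous ones, with kubectl create -f random-configmap-dag.yaml. With only two tasks the result is equivalent to the step-based version; the DAG form becomes more useful when several tasks can run in parallel and only some of them depend on each other.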

Posted on 22/10/2024