Platform Engineering on Kubernetes with Terraform, Argo, and FastAPI - Part 2

2024, Jun 16    

With the groundwork established in PART 1, Part 2 dives deeper into the implementation and integration of the key components that will power our platform’s capabilities: Argo Events for event-driven automation, Argo Workflows for CI/CD pipelines, and FastAPI as the frontend API layer.

Event-driven Automation

To add event-driven automation capabilities within our platform, we will configure Argo Events to integrate with an event source and trigger automated Argo workflows. Our aim is to trigger a Workflow upon receiving an HTTP POST request from FastAPI, such as when a developer initiates a request to provision a new environment or deploy an application. To achieve this, we need to create the following resources in Kubernetes.

EventBus

The EventBus resource acts as the transport layer within Argo Events, facilitating communication between EventSources and Sensors. It serves as a central hub where EventSources publish events and Sensors subscribe to those events to execute triggers and automated workflows. Argo Events supports three implementations of the EventBus: NATS, JetStream, and Kafka. In our case, we are using the NATS implementation, which provides a lightweight and efficient messaging system for event distribution.

apiVersion: argoproj.io/v1alpha1
kind: EventBus
metadata:
  name: eventbus-nats
  namespace: argo-events
spec:
  nats:
    native:
      # Optional; defaults to 3, which is also the minimum. Values below 3 are raised to 3.
      replicas: 3
      # Optional; authentication strategy, "none" or "token", defaults to "none"
      auth: token

In this configuration:

  • We define an EventBus resource named eventbus-nats in the argo-events namespace.
  • The spec.nats.native section specifies that we are using the NATS implementation of the EventBus.
  • The spec.nats.replicas field is set to 3, the minimum number of replicas the NATS EventBus requires.
  • The spec.nats.auth field is set to token, enabling token-based authentication for secure communication between EventSources, Sensors, and the EventBus.

To create the EventBus resource, run:

kubectl apply -f idp/core/tools/argo/events/sources/eventbus.yaml

By creating an EventBus, we have established a reliable and secure messaging system to facilitate the flow of events within our platform. EventSources, such as the Webhook we are going to create in the next section, will publish events to the EventBus. Sensors, which we will configure later, will subscribe to specific events and trigger the corresponding Argo Workflows.
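
A quick sanity check confirms the bus is healthy before we move on (the NATS pod names vary slightly between Argo Events versions, but three replicas should be running):

# Confirm the EventBus resource exists and its three NATS pods are running
kubectl get eventbus eventbus-nats -n argo-events
kubectl get pods -n argo-events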

EventSource

The EventSource resource specifies the configurations required to consume events from external sources, such as webhooks, and transform them into CloudEvents. In our case, we are configuring a Webhook EventSource to listen for incoming HTTP requests on specific endpoints. This will allow us to trigger automated workflows based on events generated by developers interacting with FastAPI.

apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: webhook
  namespace: argo-events
spec:
  eventBusName: eventbus-nats
  service:
    ports:
      - port: 12000
        targetPort: 12000
  webhook:
    # Defines the endpoints that will trigger events
    storage:
      port: "12000"
      endpoint: /storage
      method: POST
    compute:
      port: "12000"
      endpoint: /compute
      method: POST
    database:
      port: "12000"
      endpoint: /database
      method: POST

In this configuration:

  • The EventSource is named webhook and resides in the argo-events namespace.
  • It is associated with the eventbus-nats eventbus.
  • The Webhook event source will listen on port 12000.
  • The webhook section defines the specific endpoints that will trigger events:
    • /storage: This endpoint will trigger an event when an HTTP POST request is received, to provision Azure storage resources such as Blob Storage accounts and File Shares.
    • /compute: This endpoint will trigger an event when an HTTP POST request is received, to provision Azure compute resources such as virtual machines and Kubernetes clusters.
    • /database: This endpoint will trigger an event when an HTTP POST request is received, to provision Azure database resources such as SQL databases, NoSQL databases, or caches.

By defining these endpoints, we have mapped specific API calls from FastAPI to corresponding events within Argo Events. These events will then be processed and used to trigger automated workflows, such as provisioning infrastructure, deploying applications, or configuring application settings, using Argo Workflows. This configuration seamlessly integrates our platform API with Argo Events, enabling event-driven automation.

To create the EventSource resource, run:

kubectl apply -f idp/core/tools/argo/events/sources/eventsource.yaml
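
Because spec.service is set, Argo Events also creates a Kubernetes Service in front of the webhook pod; by convention it is named after the EventSource (webhook-eventsource-svc here). A quick check that both exist:

# Confirm the EventSource and its generated Service
kubectl get eventsource webhook -n argo-events
kubectl get svc -n argo-events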

Sensor

The Sensor defines a set of event dependencies (inputs) and triggers (outputs). It listens for events on the EventBus and acts as an event dependency manager, resolving dependencies and executing the corresponding triggers. A dependency is an event the Sensor is waiting for. Based on the platform capabilities described in PART 1, we are going to create the following Sensor resources in Argo Events.

  • Compute Provisioning Sensor

    The compute-provision-sensor listens for events from the /compute webhook endpoint and triggers an Argo Workflow named compute-provision-workflow. The workflow executes a series of steps to provision the requested resources using Terraform.

apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: compute-provision-sensor
  namespace: argo-events
spec:
  eventBusName: eventbus-nats
  template:
    serviceAccountName: sa-argo-workflow
  dependencies:
    - name: webhook
      eventSourceName: webhook
      # Listen for events from the /compute webhook endpoint
      eventName: compute
  triggers:
    - template:
        name: terraform
        k8s:
          # Trigger an Argo Workflow
          operation: create
          source:
            resource:
              apiVersion: argoproj.io/v1alpha1
              kind: Workflow
              metadata:
                generateName: compute-provision-
                namespace: argo-events
              spec:
                entrypoint: terraform
                serviceAccountName: sa-argo-workflow
                imagePullSecrets:
                  - name: rpspeastus2acr
                arguments:
                    # Default parameter values, overridden by the incoming webhook payload
                    parameters:
                    - name: region
                      value: "eastus2" 
                    - name: cloud_provider
                      value: "azure"
                    - name: resource_type
                      value: "kubernetes"
                    - name: environment
                      value: "dev"
                    - name: requester_name
                      value: "Jim Musana"
                    - name: requester_email
                      value: "[email protected]" 
                templates:
                - name: terraform
                  dag: 
                    tasks:
                    - name: terraform-plan
                      templateRef:
                        name: compute-provision-workflow
                        template: plan
                      arguments:
                          parameters:
                            - name: region
                              value: ""
                            - name: cloud_provider
                              value: ""
                            - name: resource_type
                              value: ""
                            - name: environment
                              value: ""
                            - name: requester_name
                              value: ""
                            - name: requester_email
                              value: ""

                    # ... removed for brevity ...

                    - name: terraform-apply
                      templateRef:
                        name: compute-provision-workflow
                        template: apply
                      arguments:
                          parameters:
                            - name: region
                              value: ""
                            - name: cloud_provider
                              value: ""
                            - name: resource_type
                              value: ""
                            - name: environment
                              value: ""
                            - name: requester_name
                              value: ""
                            - name: requester_email
                              value: ""
                      dependencies: ["terraform-plan"]

          # Extract data from the JSON payload of the incoming webhook HTTP POST
          parameters:
            - src:
                dependencyName: webhook
                dataKey: body.region
              dest: spec.arguments.parameters.0.value
            - src:
                dependencyName: webhook
                dataKey: body.cloud_provider
              dest: spec.arguments.parameters.1.value
            - src:
                dependencyName: webhook
                dataKey: body.resource_type
              dest: spec.arguments.parameters.2.value
            - src:
                dependencyName: webhook
                dataKey: body.environment
              dest: spec.arguments.parameters.3.value
            - src:
                dependencyName: webhook
                dataKey: body.requester_name
              dest: spec.arguments.parameters.4.value
            - src:
                dependencyName: webhook
                dataKey: body.requester_email
              dest: spec.arguments.parameters.5.value

The parameters section defines how the compute-provision-workflow receives and maps input data from the webhook event that triggered the compute-provision-sensor. These parameters allow the workflow to configure itself dynamically based on the request payload received from the webhook, mapping values from the payload to the input variables Terraform expects when provisioning the requested resources. For example, the parameter:

- src:
    dependencyName: webhook
    dataKey: body.region
  dest: spec.arguments.parameters.0.value

maps the value from the region field in the webhook payload’s body to the first input parameter that Terraform expects for the region variable. This allows Terraform to provision the resources in the specified region based on the value provided by the developer through the webhook request.

In addition to the Compute Provisioning Sensor, we will create the following sensors:

Storage Provisioning Sensor: This sensor listens for events from the /storage webhook endpoint and triggers an Argo Workflow named storage-provision-workflow. The workflow executes a series of steps to provision the requested storage resources using Terraform.

Database Provisioning Sensor: This Sensor listens for events from the /database webhook endpoint and triggers an Argo Workflow named database-provision-workflow. The workflow executes steps to provision the requested database resources using the database_config payload from the event.
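
These sensors follow the same pattern as the compute-provision-sensor; only the event dependency, the generateName prefix, and the referenced WorkflowTemplate change. A heavily abbreviated sketch of the Storage Provisioning Sensor, assuming the same EventBus and service account (the trigger body and parameter mapping mirror the compute sensor and are omitted):

apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: storage-provision-sensor
  namespace: argo-events
spec:
  eventBusName: eventbus-nats
  template:
    serviceAccountName: sa-argo-workflow
  dependencies:
    - name: webhook
      eventSourceName: webhook
      # Listen for events from the /storage webhook endpoint
      eventName: storage
  triggers:
    - template:
        name: terraform
        k8s:
          operation: create
          source:
            resource:
              apiVersion: argoproj.io/v1alpha1
              kind: Workflow
              metadata:
                generateName: storage-provision-
                namespace: argo-events
              # spec: entrypoint, arguments, and DAG reference the
              # storage-provision-workflow WorkflowTemplate, as in the compute sensor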

To create all the sensors described above, run:

kubectl apply -f idp/core/tools/argo/events/sensors
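
With the EventSource and sensors in place, the chain can be exercised directly, without FastAPI, by posting to the webhook service. A minimal smoke test, assuming the generated Service is named webhook-eventsource-svc and using illustrative payload values that match the body.* dataKeys mapped by the compute-provision-sensor:

# Forward the webhook EventSource service to localhost
kubectl -n argo-events port-forward svc/webhook-eventsource-svc 12000:12000 &

# Send a sample compute provisioning request
curl -X POST http://localhost:12000/compute \
  -H "Content-Type: application/json" \
  -d '{"region": "eastus2", "cloud_provider": "azure", "resource_type": "kubernetes", "environment": "dev", "requester_name": "Jim Musana", "requester_email": "requester@example.com"}'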

Workflow Orchestration

To add CI/CD and workflow orchestration capabilities within our platform, we will configure Argo Workflows to orchestrate the end-to-end processes triggered by developer requests received via the Argo Events sensors. For example, when a developer calls the /compute endpoint in FastAPI to provision a new compute resource, Argo Workflows executes the series of steps required to fulfill the request using Terraform.

  • Workflow Images

    Argo Workflows provides different types of templates to define the steps within a workflow. Two commonly used templates are the script template and the container template. Both execute containers based on specified Docker images. This means we need a Docker image that includes all the necessary packages, tools, and dependencies required to execute our workflows. This custom image should include Terraform for provisioning cloud infrastructure, the Azure CLI for interacting with Azure services, the Argo CLI for managing Argo Workflows, Ansible for configuration management, and any other utilities or libraries our workflows might require.

FROM alpine:latest

RUN apk update && apk add --no-cache \
    sudo \
    bash \
    wget \
    curl \
    git \
    unzip \
    ansible \
    python3 \
    py3-pip \
    gcc \
    python3-dev \
    musl-dev \
    linux-headers \
    shadow \
    && pip3 install --upgrade pip --break-system-packages \
    && pip3 install azure-cli --break-system-packages \
    && wget https://releases.hashicorp.com/terraform/1.8.5/terraform_1.8.5_linux_amd64.zip \
    && unzip terraform_1.8.5_linux_amd64.zip -d /usr/local/bin/ \
    && rm terraform_1.8.5_linux_amd64.zip \
    && curl -LO https://dl.k8s.io/release/v1.30.0/bin/linux/amd64/kubectl \
    && install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl \
    && rm kubectl \
    && curl -sSL -o /usr/local/bin/argocd https://github.com/argoproj/argo-cd/releases/latest/download/argocd-linux-amd64 \
    && chmod +x /usr/local/bin/argocd \
    && curl -sSL -o /tmp/kubelogin.zip https://github.com/Azure/kubelogin/releases/latest/download/kubelogin-linux-amd64.zip \
    && unzip /tmp/kubelogin.zip -d /tmp \
    && install -m 0755 /tmp/bin/linux_amd64/kubelogin /usr/local/bin/kubelogin \
    && rm -rf /tmp/kubelogin.zip /tmp/bin

RUN adduser -D -h /home/devops devops \
    && echo "devops ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers

WORKDIR /home/devops
RUN chown -R devops:devops /home/devops
USER devops

You can use this custom image directly or adapt it for your use case. Build the image and publish it to your container registry:

# Navigate to the directory containing the Dockerfile
cd idp/core/tools/argo
# Build the image
docker build -t musanaengineering/platformtools:terraform-v1.0.0 .
# Push the image to your container registry
docker push musanaengineering/platformtools:terraform-v1.0.0

  • Workflow Templates

    The compute-provision-workflow template is triggered by the compute-provision-sensor in response to events received from the /compute webhook endpoint. Upon receiving the event from the sensor, the workflow executes a series of steps to provision the requested resources using Terraform.

The workflow follows these general stages:

  • The workflow first validates the input parameters received from the webhook event, ensuring that all required information is provided, such as the cloud provider, resource type, and region.
  • Based on the requested resource type and cloud provider, the workflow retrieves the appropriate Terraform configuration files or modules from a centralized repository.
  • The workflow initializes the Terraform working directory by downloading the required provider plugins and setting up the necessary backend configuration.
  • Terraform performs a dry-run plan operation, generating an execution plan that outlines the changes required to provision the requested resources.
  • Depending on the configuration, the workflow may include an approval step where manual intervention or approval is required before proceeding with the actual resource provisioning.
  • If the plan is approved (or if approval is not required), the workflow executes the Terraform apply command, provisioning the requested resources in the specified cloud environment.
  • Upon successful provisioning, the workflow generates output artifacts containing information about the provisioned resources, such as IP addresses, resource IDs, or connection details. Additionally, it can send notifications to the requesting developer or relevant stakeholders, informing them of the provisioning completion.
  • Finally, the workflow cleans up any temporary files or directories used during the provisioning process.
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: compute-provision-workflow
  namespace: argo-events
spec:
  entrypoint: plan
  templates: 
    - name: plan
      inputs:
        # Default parameter values, overridden by the incoming webhook payload
        parameters:
          - name: region
            value: "eastus2" 
          - name: cloud_provider
            value: "azure"
          - name: resource_type
            value: "kubernetes"
          - name: environment
            value: "dev"
          - name: requester_name
            value: "Jim Musana"
          - name: requester_email
            value: "[email protected]"
      script:
        imagePullPolicy: "Always"
        image: "musanaengineering/platformtools:terraform-v1.0.0"
        command: ["/bin/bash"]
        source: |
          sudo chown devops:devops /home/devops -R 
          sudo chmod 775 /home/devops -R 
          sudo chmod 400 /home/devops/.ssh/id_rsa

      # ... removed for brevity ...

    - name: apply
      inputs:
        # Default parameter values, overridden by the incoming webhook payload
        parameters:
          - name: region
            value: "eastus2" 
          - name: cloud_provider
            value: "azure"
          - name: resource_type
            value: "kubernetes"
          - name: environment
            value: "dev"
          - name: requester_name
            value: "Jim Musana"
          - name: requester_email
            value: "[email protected]"
      script:
        imagePullPolicy: "Always"
        image: "musanaengineering/platformtools:terraform-v1.0.0"
        command: ["/bin/bash"]
        source: |
          sudo chown devops:devops /home/devops -R 
          sudo chmod 775 /home/devops -R 
          sudo chmod 400 /home/devops/.ssh/id_rsa

      # ... removed for brevity ...

    - name: approve
      suspend: {}

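The approve template uses suspend: {} to pause a run, which is how the optional approval stage between plan and apply is implemented. When the approval task is wired into the DAG, a reviewer can inspect the Terraform plan output and then resume the suspended workflow with the Argo CLI, for example:

# Resume a suspended compute provisioning run so the apply stage can proceed
argo resume -n argo-events compute-provision-<generated-suffix>
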
  • Artifact Repositories

    Some of the workflows executed on our platform will use input and output artifacts. To enable this, we will configure an Azure Storage Account as our artifact store.
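
Argo Workflows reads artifact repository settings from a ConfigMap in the workflow namespace. The sketch below shows roughly what an Azure Blob Storage entry could look like; the storage account endpoint, container name, and the azure-storage-credentials Secret are illustrative and assume the account access key is already available in the cluster:

apiVersion: v1
kind: ConfigMap
metadata:
  name: artifact-repositories
  namespace: argo-events
  annotations:
    # Marks this entry as the default artifact repository for the namespace
    workflows.argoproj.io/default-artifact-repository: azure-artifact-repository
data:
  azure-artifact-repository: |
    archiveLogs: true
    azure:
      endpoint: https://<storage-account>.blob.core.windows.net
      container: workflow-artifacts
      accountKeySecret:
        name: azure-storage-credentials
        key: account-access-key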

  • Workflow Volumes

    We leverage Kubernetes Secrets to securely store and manage sensitive information, such as credentials for infrastructure provisioners (e.g., cloud provider credentials), database server passwords, and other confidential data. The secrets are retrieved from Azure Key Vault using the External Secrets solution we created in PART 1. To inject these Secrets into our workflows, we mount them as volumes within the workflow definition:

volumes:
  - name: platformsecrets
    secret:
      secretName: platformsecrets
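
The volume defined above still has to be mounted into each step that needs it. A minimal sketch of a script template mounting the Secret (the mount path and the key name read at the end are illustrative):

script:
  image: "musanaengineering/platformtools:terraform-v1.0.0"
  command: ["/bin/bash"]
  volumeMounts:
    # Each key of the platformsecrets Secret becomes a file in this directory
    - name: platformsecrets
      mountPath: /home/devops/.secrets
      readOnly: true
  source: |
    # Example only: read a cloud credential from the mounted Secret
    export ARM_CLIENT_SECRET="$(cat /home/devops/.secrets/arm-client-secret)"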

To create all the workflow templates described above, run:

kubectl apply -f idp/core/tools/argo/events/workflows
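
Once a webhook request has been accepted, the triggered runs can be followed with the Argo CLI (or with kubectl get workflows -n argo-events):

# List workflows created by the sensors and tail the most recent one
argo list -n argo-events
argo logs @latest -n argo-events --follow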

Summary

In this second part of the series, we enhanced our internal developer platform by integrating Argo Events for event-driven automation and Argo Workflows for workflow orchestration. This enables developers to trigger automated provisioning of infrastructure and applications through a seamless API experience.

The key implementations are:

  • EventBus: A NATS-based central messaging hub for efficient event distribution between EventSources and Sensors.
  • EventSource: A Webhook EventSource configured to listen for HTTP POST requests from FastAPI.
  • Sensors:
    • Compute Provisioning Sensor listens for /compute events to trigger compute-provision-workflow for provisioning compute resources like VMs and Kubernetes clusters using Terraform.
    • Storage Provisioning Sensor listens for /storage events to trigger storage-provision-workflow for provisioning storage resources like Blob Storage and File Shares.
    • Database Provisioning Sensor listens for /database events to trigger database-provision-workflow for provisioning database resources like SQL and NoSQL databases.
  • Workflow Templates: Templates defining the stages for provisioning requested resources with Terraform, driven by dynamic parameters from the webhook payload.
  • Artifact Repositories and Volumes: An Azure Storage Account configured as an artifact store, and Kubernetes Secrets used to securely manage sensitive data like cloud provider credentials.

This integration enables event-driven automation, where developer requests received via the FastAPI frontend trigger corresponding Argo Workflows to provision and manage infrastructure and applications on Kubernetes using Terraform. Our platform now handles the intricate details of provisioning, deployment, and management in a streamlined manner. In the upcoming part of this series, we will dive deeper into the implementation of the FastAPI frontend API layer, providing developers with an intuitive interface to interact with the platform’s capabilities.