
Run some example in Kubernetes execution mode in CI#1127

Merged
pankajastro merged 64 commits into
mainfrom
kube_mode_ci
Aug 15, 2024

Conversation

Contributor

@pankajastro pankajastro commented Jul 29, 2024

Description

Migrate example from cosmos-example

The cosmos-example repository currently contains several examples, including those that run in Kubernetes execution mode. This setup makes it hard to test local changes in Kubernetes execution mode and to keep the documentation up to date. It therefore makes sense to migrate the Kubernetes examples from cosmos-example to this repository. This PR does the following:

  • Migrates the jaffle_shop_kubernetes example DAG to this repository.
  • Moves the Dockerfile from cosmos-example to this repository to build the image with the necessary DAGs and dbt projects.
    I also adjusted both the example DAG and the Dockerfile to work within this repository.

Automate running locally

I introduced some scripts to make running the Kubernetes DAG easier.

postgres-deployment.yaml: Kubernetes resource file for spinning up PostgreSQL and creating Kubernetes secrets.

integration-kubernetes.sh: Runs the Kubernetes DAG using pytest.

kubernetes-setup.sh:

  • Builds the Docker image with the Jaffle Shop dbt project and DAG, and loads the Docker image into the local registry.
  • Creates Kubernetes resources such as PostgreSQL deployment, service, and secret.
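
The two steps above could be sketched roughly as follows. This is not the actual kubernetes-setup.sh: the image tag, the Dockerfile path, the manifest path, and the DRY_RUN switch are assumptions made for illustration, and it assumes that "loading the image" is done with `kind load docker-image`.

```shell
#!/bin/sh
# Sketch only: not the real scripts/test/kubernetes-setup.sh.
set -eu

IMAGE="${IMAGE:-dbt-jaffle-shop:1.0.0}"           # hypothetical image tag
DOCKERFILE="${DOCKERFILE:-Dockerfile}"            # hypothetical Dockerfile path
MANIFEST="${MANIFEST:-postgres-deployment.yaml}"  # hypothetical manifest path

# With DRY_RUN=1, print each command instead of executing it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

setup_kubernetes() {
  # Build the image that contains the dbt project and the DAG.
  run docker build -t "$IMAGE" -f "$DOCKERFILE" .
  # Make the image available to the nodes of the KinD cluster.
  run kind load docker-image "$IMAGE"
  # Create the PostgreSQL deployment, service, and secret.
  run kubectl apply -f "$MANIFEST"
}
```

Calling `DRY_RUN=1 setup_kubernetes` prints the three commands without touching Docker or the cluster, which is handy for checking the paths before a real run.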

Run DAG locally
Prerequisites:

  • Docker Desktop
  • KinD (Kubernetes in Docker)
  • kubectl
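
A quick way to confirm the prerequisites are on the PATH before starting is sketched below. The function names are hypothetical, and it assumes Docker Desktop exposes the `docker` CLI.

```shell
# Print the name of every given tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Fail with a message if any prerequisite CLI is missing.
check_prerequisites() {
  missing="$(missing_tools docker kind kubectl)"
  if [ -n "$missing" ]; then
    echo "Missing tools:" $missing >&2
    return 1
  fi
}
```

Running `check_prerequisites` returns a non-zero status and lists the missing tools on stderr if anything is absent.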

Steps:

  1. Create a cluster: kind create cluster
  2. Create resources: scripts/test/kubernetes-setup.sh (this sets up PostgreSQL and loads the dbt project into the local registry)
  3. Run the DAG: cd dev && scripts/test/integration-kubernetes.sh. This executes the DAG with pytest. You can also run it directly with the airflow command, provided the project is installed in your virtual environment:
time AIRFLOW__COSMOS__PROPAGATE_LOGS=0 AIRFLOW__COSMOS__ENABLE_CACHE=1 AIRFLOW__COSMOS__CACHE_DIR=/tmp/ AIRFLOW_CONN_EXAMPLE_CONN="postgres://postgres:postgres@0.0.0.0:5432/postgres" PYTHONPATH=`pwd` AIRFLOW_HOME=`pwd` AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT=20000 AIRFLOW__CORE__DAG_FILE_PROCESSOR_TIMEOUT=20000 airflow dags test jaffle_shop_kubernetes `date -Iseconds`
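
For readability, the same settings can be exported one per line before invoking Airflow. The sketch below uses exactly the values from the command above; only the wrapper function name is hypothetical, and the `time` prefix is left out.

```shell
# Cosmos settings: no log propagation, caching enabled under /tmp/.
export AIRFLOW__COSMOS__PROPAGATE_LOGS=0
export AIRFLOW__COSMOS__ENABLE_CACHE=1
export AIRFLOW__COSMOS__CACHE_DIR=/tmp/

# Airflow connection named example_conn, pointing at the local PostgreSQL.
export AIRFLOW_CONN_EXAMPLE_CONN="postgres://postgres:postgres@0.0.0.0:5432/postgres"

# Use the current directory as both the import root and AIRFLOW_HOME.
export PYTHONPATH="$PWD"
export AIRFLOW_HOME="$PWD"

# Generous timeouts for DAG parsing.
export AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT=20000
export AIRFLOW__CORE__DAG_FILE_PROCESSOR_TIMEOUT=20000

# Hypothetical wrapper: run the DAG once with the current timestamp.
run_jaffle_shop_kubernetes() {
  airflow dags test jaffle_shop_kubernetes "$(date -Iseconds)"
}
```

After sourcing these lines from the repository root, `run_jaffle_shop_kubernetes` performs the same run as the one-line command.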

Run jaffle_shop_kubernetes in CI

To avoid regressions, we have automated running jaffle_shop_kubernetes in CI:

  • Set up the GitHub Actions infrastructure to run DAGs using Kubernetes execution mode.
  • Used container-tools/kind-action@v1 to create a KinD cluster.
  • Used the bash scripts to create the Kubernetes resources, build and load the image into a local registry, and execute the tests.
  • At the moment, pytest runs from a virtual environment.

Documentation changes

Given that the DAG jaffle_shop_kubernetes is now part of this repository, I have automated the example rendering for Kubernetes execution mode. This ensures that we avoid displaying outdated example code.

https://astronomer.github.io/astronomer-cosmos/getting_started/execution-modes.html#kubernetes
Screenshot 2024-08-15 at 8 03 59 PM

https://astronomer.github.io/astronomer-cosmos/getting_started/kubernetes.html#kubernetes

Screenshot 2024-08-15 at 8 04 22 PM

Future work

  • Use the hatch target to run the test. I have introduced a hatch target to run the Kubernetes example, but it currently does not work because the local and container dbt project paths differ. This requires a bit more work.
  • Remove the virtual environment step (Install packages and dependencies) from the CI configuration for Run-Kubernetes-Tests and use hatch instead.
  • Update the profile YAML to read the port from an environment variable, as it is currently hardcoded.
  • Remove the host from the Kubernetes secret, replace it with the username, and make the corresponding change in the DAG.
  • Currently, we need to export both POSTGRES_DATABASE and POSTGRES_DB in the Dockerfile because both are used in the project. Avoid exporting both by making the environment variables consistent across the repository.
  • Not a big deal in this context, but we have some hardcoded values for secrets. It would be better to parameterize them.
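
As an interim illustration of the environment-variable items above, a small shell helper could accept either database variable and default the port. This is only a sketch: the function name and the POSTGRES_PORT variable are hypothetical, and the defaults (postgres, 5432) are taken from the connection string used earlier in this description.

```shell
# Accept either POSTGRES_DB or POSTGRES_DATABASE, prefer POSTGRES_DB,
# and export both with the same value so existing consumers keep working.
normalize_postgres_env() {
  POSTGRES_DB="${POSTGRES_DB:-${POSTGRES_DATABASE:-postgres}}"
  POSTGRES_DATABASE="$POSTGRES_DB"
  # Hypothetical variable: lets the port be overridden instead of hardcoded.
  POSTGRES_PORT="${POSTGRES_PORT:-5432}"
  export POSTGRES_DB POSTGRES_DATABASE POSTGRES_PORT
}
```

Once every consumer reads a single name, the helper and the duplicate export can be dropped.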

GH issue for future improvement: #1160

Example CI Run

Related Issue(s)

closes: #535

Breaking Change?

No

Checklist

  • I have made corresponding changes to the documentation (if required)
  • I have added tests that prove my fix is effective or that my feature works


netlify Bot commented Jul 29, 2024

Deploy Preview for sunny-pastelito-5ecb04 canceled.

Name Link
🔨 Latest commit 05cedca
🔍 Latest deploy log https://app.netlify.com/sites/sunny-pastelito-5ecb04/deploys/66be350363b2af0007e7a6e4

@pankajastro pankajastro changed the title Kube mode ci @pankajastro Run some example in Kubernetes execution mode in CI Jul 29, 2024
@pankajastro pankajastro changed the title @pankajastro Run some example in Kubernetes execution mode in CI Run some example in Kubernetes execution mode in CI Jul 29, 2024
@pankajastro pankajastro changed the title Run some example in Kubernetes execution mode in CI [WIP] Run some example in Kubernetes execution mode in CI Jul 29, 2024

codecov Bot commented Jul 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.37%. Comparing base (a89389d) to head (05cedca).
Report is 3 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1127   +/-   ##
=======================================
  Coverage   96.37%   96.37%           
=======================================
  Files          64       64           
  Lines        3424     3424           
=======================================
  Hits         3300     3300           
  Misses        124      124           


Comment thread .github/workflows/test.yml Outdated
Comment thread .github/workflows/test.yml Outdated
Collaborator

@tatiana tatiana left a comment

This looks great, @pankajastro, really happy this is automated, it will save us lots of time!

@pankajkoti pankajkoti mentioned this pull request Aug 16, 2024

Labels

  • area:ci: Related to CI, Github Actions, or other continuous integration tools
  • area:execution: Related to the execution environment/mode, like Docker, Kubernetes, Local, VirtualEnv, etc
  • execution:kubernetes: Related to Kubernetes execution environment
  • lgtm: This PR has been approved by a maintainer
  • profile:postgres: Related to Postgres ProfileConfig
  • size:L: This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Create ExecutionMode.KUBERNETES example DAG & setup CI

2 participants