
Add DBT project to run benchmark for async execution mode #11

Merged: pankajastro merged 9 commits into main from run_async_test on Apr 30, 2026

Conversation

Contributor

@pankajastro pankajastro commented Oct 15, 2025

This PR introduces a DBT project setup to benchmark async execution mode performance.

  • The project currently includes 4 additional models and 1 seed, plus the jaffle_shop project resources.
  • A script, benchmark/auto_generate_models.sh, is available to generate additional models if needed for scaling benchmarks.
  • The DAG takes approximately 13–14 minutes to complete when 30 models are generated and run.
  • Benchmark runtime can be increased by adjusting the configuration in dbt/altered_jaffle_shop/seeds/model_params.csv, which allows simulation of time-consuming transformations.
[Image: Screenshot 2026-04-28 at 2 28 10 PM] [Image: Screenshot 2026-04-30 at 5 58 27 PM]

This setup will help in evaluating how async execution mode scales with larger DAGs and more resource-intensive transformations.
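
For readers who want a feel for how the model count can be scaled, here is a minimal Python sketch of the replication idea. It is not the contents of benchmark/auto_generate_models.sh (that is a shell script and is not reproduced in this conversation); the `generate_models` function, the numbered `_<i>` suffix, and the placeholder SQL are assumptions made purely for illustration.

```python
import tempfile
from pathlib import Path


def generate_models(models_dir: Path, template_name: str, count: int) -> list:
    """Copy one long-running model `count` times under numbered file names.

    dbt derives a model's name from its file name, so each copy becomes a
    separate model. The `<template>_<i>.sql` naming is a hypothetical choice.
    """
    sql = (models_dir / f"{template_name}.sql").read_text()
    created = []
    for i in range(1, count + 1):
        target = models_dir / f"{template_name}_{i}.sql"
        target.write_text(sql)
        created.append(target)
    return created


if __name__ == "__main__":
    # Demo in a throwaway directory with placeholder SQL (not the real model).
    with tempfile.TemporaryDirectory() as tmp:
        models_dir = Path(tmp)
        (models_dir / "long_model_cross_random.sql").write_text("select 1 as id")
        for path in generate_models(models_dir, "long_model_cross_random", 3):
            print(path.name)
```

Pointing `models_dir` at dbt/altered_jaffle_shop/models and raising `count` would grow the DAG in the same spirit as the script, though the real script may name and select models differently.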

closes: https://github.com/astronomer/oss-integrations-private/issues/175

Copilot AI review requested due to automatic review settings October 15, 2025 21:04
@pankajastro pankajastro requested a review from tatiana October 15, 2025 21:05
Contributor

Copilot AI left a comment


Pull Request Overview

Adds a focused dbt project and Airflow DAG to benchmark async execution performance using Cosmos and BigQuery, with seeds and “slow” models to inflate runtime and a script to scale model count.

  • Introduces altered_jaffle_shop dbt project with seeds, staging, core, and heavy benchmarking models.
  • Adds an Airflow DbtDag configured for AIRFLOW_ASYNC execution and a script to auto-generate additional long-running models.
  • Provides configurable model parameters via a seed to control workload size.
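
The seed-driven parameter lookup mentioned in the last bullet can be pictured with a small Python sketch. The real project does this in a dbt macro (get_model_param.sql) whose source is not shown here, so the column names (`model_name`, `param_name`, `param_value`), the sample values, and both helper functions below are hypothetical.

```python
import csv
import io

# Hypothetical seed contents; the real model_params.csv may use other columns.
SEED_CSV = """model_name,param_name,param_value
long_model_cross_random,row_count,50000
long_model_text_processing,row_count,20000
"""


def load_params(seed_text: str) -> dict:
    """Index seed rows by (model_name, param_name) for quick lookup."""
    params = {}
    for row in csv.DictReader(io.StringIO(seed_text)):
        params[(row["model_name"], row["param_name"])] = int(row["param_value"])
    return params


def get_model_param(params: dict, model_name: str, param_name: str, default: int) -> int:
    """Return the configured value, or `default` when the seed has no entry."""
    return params.get((model_name, param_name), default)


if __name__ == "__main__":
    params = load_params(SEED_CSV)
    print(get_model_param(params, "long_model_cross_random", "row_count", 1000))
    print(get_model_param(params, "customers_slow_query", "row_count", 1000))
```

Under these assumptions, raising a value in the seed makes the matching model process more rows and run longer, while models without an entry fall back to the default.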

Reviewed Changes

Copilot reviewed 24 out of 29 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| dbt/altered_jaffle_shop/seeds/raw_payments.csv | Seed data for payments used by staging model. |
| dbt/altered_jaffle_shop/seeds/raw_orders.csv | Seed data for orders used by staging model. |
| dbt/altered_jaffle_shop/seeds/raw_customers.csv | Seed data for customers used by staging model. |
| dbt/altered_jaffle_shop/seeds/model_params.csv | Seed controlling workload parameters for long models. |
| dbt/altered_jaffle_shop/profiles.yml | Example dbt profile for BigQuery (likely unused with Cosmos mapping). |
| dbt/altered_jaffle_shop/models/staging/stg_payments.sql | Staging model converting cents to dollars. |
| dbt/altered_jaffle_shop/models/staging/stg_orders.sql | Staging model for orders. |
| dbt/altered_jaffle_shop/models/staging/stg_customers.sql | Staging model for customers. |
| dbt/altered_jaffle_shop/models/schema.yml | Schema tests and docs for orders and customers. |
| dbt/altered_jaffle_shop/models/orders.sql | Orders fact model aggregating payments by method. |
| dbt/altered_jaffle_shop/models/long_model_text_processing.sql | Heavy text-processing benchmark model. |
| dbt/altered_jaffle_shop/models/long_model_subquery_windows.sql | Heavy windowing benchmark model. |
| dbt/altered_jaffle_shop/models/long_model_cross_random.sql | Heavy cross-join benchmark model. |
| dbt/altered_jaffle_shop/models/docs.md | Docs for order status values. |
| dbt/altered_jaffle_shop/models/customers_slow_query.sql | Heavy windowing workload over customers. |
| dbt/altered_jaffle_shop/models/customers.sql | Customers mart model. |
| dbt/altered_jaffle_shop/macros/get_model_param.sql | Macro intended to read params from a seed. |
| dbt/altered_jaffle_shop/dbt_project.yml | Project config (profile name, paths, defaults). |
| dbt/altered_jaffle_shop/README.md | Project usage notes. |
| dbt/altered_jaffle_shop/.user.yml | User metadata. |
| dbt/altered_jaffle_shop/.gitignore | Ignore dbt artifacts. |
| dags/cosmos_async_dag.py | Airflow DAG using Cosmos with AIRFLOW_ASYNC. |
| benchmark/auto_generate_models.sh | Script to replicate long models for scaling. |
| README.md | Repository-level readme mentions new project and how to scale models. |


Comment thread dags/cosmos_async_dag.py
Comment thread dbt/altered_jaffle_shop/models/orders.sql
Comment thread dbt/altered_jaffle_shop/models/customers.sql Outdated
Comment thread dbt/altered_jaffle_shop/models/docs.md Outdated
Comment thread dbt/altered_jaffle_shop/macros/get_model_param.sql Outdated
Comment thread dags/cosmos_async_dag.py Outdated
Comment thread dbt/altered_jaffle_shop/dbt_project.yml Outdated
Comment thread dbt/altered_jaffle_shop/profiles.yml Outdated
Comment thread README.md Outdated
Comment thread dbt/altered_jaffle_shop/macros/get_model_param.sql Outdated
Comment thread dbt/altered_jaffle_shop/dbt_project.yml Outdated
Comment thread dbt/altered_jaffle_shop/models/schema.yml
Comment thread dbt/altered_jaffle_shop/profiles.yml Outdated
Comment thread benchmark/auto_generate_models.sh
Comment thread dags/cosmos_async_dag.py Outdated
Comment thread dbt/altered_jaffle_shop/models/schema.yml Outdated
Comment thread dags/cosmos_async_dag.py Outdated
Comment thread dags/cosmos_async_dag.py Outdated
Comment thread README.md Outdated
pankajastro and others added 2 commits April 28, 2026 15:26
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@pankajastro pankajastro requested a review from a team as a code owner April 30, 2026 12:32
Collaborator

@pankajkoti pankajkoti left a comment


LGTM, but have some comments inline if you could check and address once before merging.

Hoping that we have tested the benchmark with these scripts.

Comment thread dags/cosmos_async_dag.py Outdated
Comment thread dbt/altered_jaffle_shop/macros/get_model_param.sql Outdated
Comment thread dbt/altered_jaffle_shop/macros/get_model_param.sql Outdated
Comment thread Dockerfile
Comment thread Dockerfile Outdated
Comment thread README.md Outdated
@pankajkoti
Collaborator

Happy to merge once we have tested benchmarking with the latest changes and the comments are addressed.

pankajastro and others added 4 commits April 30, 2026 18:44
Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
@pankajastro
Contributor Author

> Happy to merge once we have tested benchmarking with the latest changes and the comments are addressed.

Tested against the updated commit.
[Image: Screenshot 2026-04-30 at 6 50 10 PM]

@pankajastro pankajastro merged commit 31116a7 into main Apr 30, 2026
@pankajastro pankajastro deleted the run_async_test branch April 30, 2026 13:21
pankajastro added a commit that referenced this pull request May 8, 2026
<img width="1703" height="1008" alt="Screenshot 2026-04-28 at 4 05 36 PM" src="https://github.com/user-attachments/assets/ee2785fc-34e2-4cee-b092-33daf8480fec" />

<img width="1680" height="758" alt="Screenshot 2026-05-05 at 1 24 00 PM" src="https://github.com/user-attachments/assets/e8e6b3df-9e38-46e7-9278-79485b986d08" />

Depend on
- astronomer/astronomer-cosmos#2616
- #11

closes:
astronomer/oss-integrations-private#176

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>