Add DBT project to run benchmark for async execution mode #11
Merged
9 commits
93b44a1 Add DBT project to run benchmark for async execution mode (pankajastro)
6f139dd Apply review feedback (pankajastro)
fde45dc Ignore .user.yml and remove it from tracking (pankajastro)
cab79b1 Add kube script (pankajastro)
820a027 Disable dataset (pankajastro)
1dc1ab5 Delete unused macros (pankajastro)
b7faf00 Apply suggestion from @pankajkoti (pankajastro)
db3baba Apply suggestion from @pankajkoti (pankajastro)
fe9a7d0 Apply suggestion from @pankajkoti (pankajastro)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -2,3 +2,4 @@ | ||
| __pycache__ | ||
| benchmark/pre-process/key.json | ||
| dbt/logs | ||
| dbt/altered_jaffle_shop/.user.yml | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| | ||
| : "${1:?Usage: $0 <number_of_copies>}" | ||
| | ||
| slow_models=( | ||
|   "dbt/altered_jaffle_shop/models/customers_slow_query.sql" | ||
|   "dbt/altered_jaffle_shop/models/long_model_cross_random.sql" | ||
|   "dbt/altered_jaffle_shop/models/long_model_subquery_windows.sql" | ||
|   "dbt/altered_jaffle_shop/models/long_model_text_processing.sql" | ||
| ) | ||
| | ||
| for file in "${slow_models[@]}"; do | ||
|   for ((i = 1; i <= $1; i++)); do | ||
|     cp -n "$file" "${file%.sql}${i}.sql" | ||
|   done | ||
| done | ||
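
To make the naming of the generated copies concrete, here is a minimal Python sketch of the `${file%.sql}${i}.sql` expansion used in the loop above. The helper name `copy_names` is hypothetical and not part of this PR; it only computes the target paths and copies nothing.

```python
from pathlib import PurePosixPath


def copy_names(model_path: str, copies: int) -> list[str]:
    """Return the paths the shell loop would create for one model file."""
    path = PurePosixPath(model_path)
    # Mirrors "${file%.sql}${i}.sql": drop the .sql suffix, append the index.
    return [str(path.with_name(f"{path.stem}{i}.sql")) for i in range(1, copies + 1)]


if __name__ == "__main__":
    for name in copy_names("dbt/altered_jaffle_shop/models/customers_slow_query.sql", 2):
        print(name)
```

Because the script uses `cp -n`, re-running it with the same count should leave existing copies untouched; the sketch does not model that behaviour.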
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| apiVersion: batch/v1 | ||
| kind: Job | ||
| metadata: | ||
|   name: airflow-test-cosmos-async | ||
| spec: | ||
|   template: | ||
|     spec: | ||
|       containers: | ||
|         - name: airflow | ||
|           image: cosmos-benchmark:0.0.3 | ||
|           imagePullPolicy: Never | ||
|           command: ["airflow", "dags", "test", "cosmos_bq_async"] | ||
|           env: | ||
|             - name: AIRFLOW_CONN_GCP_GS_CONN | ||
|               valueFrom: | ||
|                 secretKeyRef: | ||
|                   name: gcp-credentials | ||
|                   key: airflow-conn | ||
|           resources: | ||
|             # Equivalent to Astro's A10 instance | ||
|             requests: | ||
|               cpu: "2" | ||
|               memory: "4Gi" | ||
|             limits: | ||
|               cpu: "2" | ||
|               memory: "4Gi" | ||
|       restartPolicy: Never | ||
|   backoffLimit: 0 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,40 @@ | ||
| from datetime import datetime | ||
| from pathlib import Path | ||
| | ||
| from cosmos import DbtDag, ExecutionConfig, ExecutionMode, ProfileConfig, ProjectConfig | ||
| from cosmos.profiles import GoogleCloudServiceAccountDictProfileMapping | ||
| from include.constants import BIGQUERY_DATASET, DBT_ADAPTER_VERSION, GCP_PROJECT_ID | ||
| | ||
| DBT_PROJECT_PATH = Path("/usr/local/airflow/dbt/altered_jaffle_shop") | ||
| | ||
| | ||
| | ||
| profile_config = ProfileConfig( | ||
|     profile_name="altered_jaffle_shop", | ||
|     target_name="dev", | ||
|     profile_mapping=GoogleCloudServiceAccountDictProfileMapping( | ||
|         conn_id="gcp_gs_conn", profile_args={"dataset": BIGQUERY_DATASET, "project": GCP_PROJECT_ID} | ||
|     ), | ||
| ) | ||
| | ||
| | ||
| cosmos_bq_async = DbtDag( | ||
|     # dbt/cosmos-specific parameters | ||
|     project_config=ProjectConfig(DBT_PROJECT_PATH), | ||
|     profile_config=profile_config, | ||
|     execution_config=ExecutionConfig( | ||
|         execution_mode=ExecutionMode.AIRFLOW_ASYNC, | ||
|         async_py_requirements=[f"dbt-bigquery=={DBT_ADAPTER_VERSION}"], | ||
|     ), | ||
|     # normal dag parameters | ||
|     schedule=None, | ||
|     start_date=datetime(2026, 1, 1), | ||
|     catchup=False, | ||
|     dag_id="cosmos_bq_async", | ||
|     tags=["simple"], | ||
|     operator_args={ | ||
|         "location": "US", | ||
|         "install_deps": True, | ||
|         "full_refresh": True, | ||
|     }, | ||
| ) | ||
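
The Job manifest in this PR supplies the connection through the `AIRFLOW_CONN_GCP_GS_CONN` environment variable, while the DAG refers to `conn_id="gcp_gs_conn"`. The two line up because Airflow resolves a connection ID from an environment variable named `AIRFLOW_CONN_` followed by the upper-cased ID. A small sketch of that mapping, where `conn_env_var` is a hypothetical helper used only for illustration:

```python
def conn_env_var(conn_id: str) -> str:
    """Return the environment variable Airflow checks for a connection ID."""
    # Airflow looks up AIRFLOW_CONN_<CONN_ID>, with the ID upper-cased.
    return f"AIRFLOW_CONN_{conn_id.upper()}"


if __name__ == "__main__":
    print(conn_env_var("gcp_gs_conn"))  # AIRFLOW_CONN_GCP_GS_CONN
```

If the `conn_id` in the DAG is renamed, the `env` entry in the Job manifest would presumably need the matching rename.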
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| | ||
| target/ | ||
| dbt_packages/ | ||
| logs/ |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| Welcome to your new dbt project! | ||
| | ||
| ### Using the starter project | ||
| | ||
| Try running the following commands: | ||
| - dbt run | ||
| - dbt test | ||
| | ||
| | ||
| ### Resources: | ||
| - Learn more about dbt [in the docs](https://docs.getdbt.com/docs/introduction) | ||
| - Check out [Discourse](https://discourse.getdbt.com/) for commonly asked questions and answers | ||
| - Join the [chat](https://community.getdbt.com/) on Slack for live discussions and support | ||
| - Find [dbt events](https://events.getdbt.com) near you | ||
| - Check out [the blog](https://blog.getdbt.com/) for the latest news on dbt's development and best practices |
Empty file.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| | ||
| # Name your project! Project names should contain only lowercase characters | ||
| # and underscores. A good package name should reflect your organization's | ||
| # name or the intended use of these models | ||
| name: 'altered_jaffle_shop' | ||
| version: '1.0.0' | ||
| | ||
| # This setting configures which "profile" dbt uses for this project. | ||
| profile: 'altered_jaffle_shop' | ||
| | ||
| # These configurations specify where dbt should look for different types of files. | ||
| # For example, models are found in the "models/" directory. You probably won't need to change these! | ||
| model-paths: ["models"] | ||
| analysis-paths: ["analyses"] | ||
| test-paths: ["tests"] | ||
| seed-paths: ["seeds"] | ||
| macro-paths: ["macros"] | ||
| snapshot-paths: ["snapshots"] | ||
| | ||
| clean-targets: # directories to be removed by `dbt clean` | ||
| - "target" | ||
| - "dbt_packages" |
Empty file.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,71 @@ | ||
| {{ config(tags=["customers"]) }} | ||
| | ||
| with customers as ( | ||
| | ||
|     select * from {{ ref('stg_customers') }} | ||
| | ||
| ), | ||
| | ||
| orders as ( | ||
| | ||
|     select * from {{ ref('stg_orders') }} | ||
| | ||
| ), | ||
| | ||
| payments as ( | ||
| | ||
|     select * from {{ ref('stg_payments') }} | ||
| | ||
| ), | ||
| | ||
| customer_orders as ( | ||
| | ||
|     select | ||
|         customer_id, | ||
| | ||
|         min(order_date) as first_order, | ||
|         max(order_date) as most_recent_order, | ||
|         count(order_id) as number_of_orders | ||
|     from orders | ||
| | ||
|     group by customer_id | ||
| | ||
| ), | ||
| | ||
| customer_payments as ( | ||
| | ||
|     select | ||
|         orders.customer_id, | ||
|         sum(amount) as total_amount | ||
| | ||
|     from payments | ||
| | ||
|     left join orders on | ||
|         payments.order_id = orders.order_id | ||
| | ||
|     group by orders.customer_id | ||
| | ||
| ), | ||
| | ||
| final as ( | ||
| | ||
|     select | ||
|         customers.customer_id, | ||
|         customers.first_name, | ||
|         customers.last_name, | ||
|         customer_orders.first_order, | ||
|         customer_orders.most_recent_order, | ||
|         customer_orders.number_of_orders, | ||
|         customer_payments.total_amount as total_order_amount | ||
| | ||
|     from customers | ||
| | ||
|     left join customer_orders | ||
|         on customers.customer_id = customer_orders.customer_id | ||
| | ||
|     left join customer_payments | ||
|         on customers.customer_id = customer_payments.customer_id | ||
| | ||
| ) | ||
| | ||
| select * from final |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| {{ config( | ||
|     materialized = "table" | ||
| ) }} | ||
| | ||
| WITH params AS ( | ||
|     SELECT array_x, array_y | ||
|     FROM {{ ref('model_params') }} | ||
|     WHERE model_name = 'customers_slow_query' | ||
|     LIMIT 1 | ||
| ), | ||
| base AS ( | ||
|     SELECT * FROM {{ ref('customers') }} | ||
| ), | ||
| expanded AS ( | ||
|     SELECT | ||
|         b.*, | ||
|         x AS extra_x, | ||
|         y AS extra_y | ||
|     FROM base b | ||
|     CROSS JOIN params p | ||
|     CROSS JOIN UNNEST(GENERATE_ARRAY(1, p.array_x)) AS x | ||
|     CROSS JOIN UNNEST(GENERATE_ARRAY(1, p.array_y)) AS y | ||
| ), | ||
| windowed AS ( | ||
|     SELECT | ||
|         *, | ||
|         AVG(LENGTH(CAST(customer_id AS STRING))) OVER ( | ||
|             PARTITION BY customer_id | ||
|             ORDER BY extra_x, extra_y | ||
|         ) AS avg_len | ||
|     FROM expanded | ||
| ) | ||
| SELECT | ||
|     customer_id, | ||
|     SUM(avg_len) AS sum_len | ||
| FROM windowed | ||
| GROUP BY customer_id |
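
The cost of this model is driven by the two `CROSS JOIN UNNEST(GENERATE_ARRAY(...))` clauses, which multiply every base row by `array_x * array_y`. The sketch below estimates the size of the `expanded` CTE under the assumption that `params` yields exactly one row. The function name `expanded_rows` and the example numbers are hypothetical; the actual values come from the `model_params` relation, which is not shown here.

```python
def expanded_rows(base_rows: int, array_x: int, array_y: int) -> int:
    """Estimate the row count of `expanded`, assuming one row in `params`."""
    # GENERATE_ARRAY(1, n) has n elements for n >= 1 and is empty for n < 1,
    # so each cross join multiplies the row count by max(n, 0).
    return base_rows * max(array_x, 0) * max(array_y, 0)


if __name__ == "__main__":
    # Hypothetical inputs: 100 customers, array_x = 10, array_y = 20.
    print(expanded_rows(100, 10, 20))  # 20000
```

Raising either parameter scales the intermediate table linearly, which is presumably how the benchmark tunes how long each model runs.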
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| {% docs orders_status %} | ||
| | ||
| Orders can be one of the following statuses: | ||
| | ||
| | status | description | | ||
| |----------------|------------------------------------------------------------------------------------------------------------------------| | ||
| | placed | The order has been placed but has not yet left the warehouse | | ||
| | shipped | The order has been shipped to the customer and is currently in transit | | ||
| | completed | The order has been received by the customer | | ||
| | return_pending | The customer has indicated that they would like to return the order, but it has not yet been received at the warehouse | | ||
| | returned | The order has been returned by the customer and received at the warehouse | | ||
| | ||
| | ||
| {% enddocs %} |
30 changes: 30 additions & 0 deletions
dbt/altered_jaffle_shop/models/long_model_cross_random.sql
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,30 @@ | ||
| {{ config( | ||
|     materialized = "table" | ||
| ) }} | ||
| | ||
| WITH params AS ( | ||
|     SELECT array_x, array_y | ||
|     FROM {{ ref('model_params') }} | ||
|     WHERE model_name = 'long_model_cross_random' | ||
|     LIMIT 1 | ||
| ), | ||
| base AS ( | ||
|     SELECT * FROM {{ ref('customers') }} | ||
| ), | ||
| inflated AS ( | ||
|     SELECT | ||
|         b.customer_id, | ||
|         x AS x_val, | ||
|         y AS y_val, | ||
|         RAND() * x * y AS random_val | ||
|     FROM base b | ||
|     CROSS JOIN params p | ||
|     CROSS JOIN UNNEST(GENERATE_ARRAY(1, p.array_x)) AS x | ||
|     CROSS JOIN UNNEST(GENERATE_ARRAY(1, p.array_y)) AS y | ||
| ) | ||
| SELECT | ||
|     customer_id, | ||
|     COUNT(*) AS row_count, | ||
|     AVG(random_val) AS avg_val | ||
| FROM inflated | ||
| GROUP BY customer_id |
55 changes: 55 additions & 0 deletions
dbt/altered_jaffle_shop/models/long_model_subquery_windows.sql
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,55 @@ | ||
| {{ config( | ||
|     materialized = "table" | ||
| ) }} | ||
| | ||
| WITH params AS ( | ||
|     SELECT array_x | ||
|     FROM {{ ref('model_params') }} | ||
|     WHERE model_name = 'long_model_subquery_windows' | ||
|     LIMIT 1 | ||
| ), | ||
| | ||
| -- Inflate base rows by duplicating each customer 100x | ||
| base AS ( | ||
|     SELECT | ||
|         c.customer_id, | ||
|         d AS duplication_id | ||
|     FROM {{ ref('customers') }} c | ||
|     CROSS JOIN UNNEST(GENERATE_ARRAY(1, 100)) AS d | ||
| ), | ||
| | ||
| -- Generate large expansion using params | ||
| expanded AS ( | ||
|     SELECT | ||
|         b.customer_id, | ||
|         x AS factor, | ||
|         POW(x, 0.5) AS sqrt_x, | ||
|         SAFE_DIVIDE(x, NULLIF(MOD(x, 10), 0)) AS ratio, | ||
|         LOG(x + 1) + SIN(x) + COS(x) AS expensive_math | ||
|     FROM base b | ||
|     CROSS JOIN params p | ||
|     CROSS JOIN UNNEST(GENERATE_ARRAY(1, p.array_x)) AS x | ||
| ), | ||
| | ||
| -- Complex windowing across a large range | ||
| windowed AS ( | ||
|     SELECT | ||
|         *, | ||
|         AVG(expensive_math) OVER ( | ||
|             PARTITION BY customer_id | ||
|             ORDER BY factor | ||
|             ROWS BETWEEN 10000 PRECEDING AND CURRENT ROW | ||
|         ) AS moving_avg | ||
|     FROM expanded | ||
| ), | ||
| | ||
| aggregated AS ( | ||
|     SELECT | ||
|         customer_id, | ||
|         SUM(moving_avg) AS total_avg, | ||
|         COUNT(*) AS cnt | ||
|     FROM windowed | ||
|     GROUP BY customer_id | ||
| ) | ||
| | ||
| SELECT * FROM aggregated |
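
The `windowed` CTE computes a trailing average over the current row and up to 10000 preceding rows within each `customer_id` partition. As a rough illustration of what `ROWS BETWEEN n PRECEDING AND CURRENT ROW` means for a single, already ordered partition, here is a small Python sketch. The function `trailing_average` is hypothetical, and the example uses a tiny window instead of 10000 so the result is easy to follow.

```python
from collections import deque


def trailing_average(values: list[float], preceding: int) -> list[float]:
    """Average of each value and up to `preceding` values before it."""
    window: deque[float] = deque()
    total = 0.0
    result = []
    for value in values:
        window.append(value)
        total += value
        # Keep at most `preceding` earlier rows plus the current row.
        if len(window) > preceding + 1:
            total -= window.popleft()
        result.append(total / len(window))
    return result


if __name__ == "__main__":
    print(trailing_average([1, 2, 3, 4], preceding=1))  # [1.0, 1.5, 2.5, 3.5]
```

The sketch ignores partitioning and assumes the input is already sorted by the window's `ORDER BY` column; in the model, rows duplicated by `base` share the same `factor`, so their relative order inside the frame is not guaranteed.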