Add sample dbt_packages to validate incremental dbt deps #1669
Merged
Copilot reviewed 65 out of 76 changed files in this pull request and generated 1 comment.
Files not reviewed (11)
- dev/dags/dbt/simple/dbt_packages/dbt_date/.gitignore: Language not supported
- dev/dags/dbt/simple/dbt_packages/dbt_date/LICENSE: Language not supported
- dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/docker-start.sh: Language not supported
- dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/docker-stop.sh: Language not supported
- dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/docker/hive-site.xml: Language not supported
- dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/docker/spark-defaults.conf: Language not supported
- dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/docker/trino/catalog/memory.properties: Language not supported
- dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/macros/expression_is_true.sql: Language not supported
- dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/macros/get_custom_schema.sql: Language not supported
- dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/macros/get_test_dates.sql: Language not supported
- dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/models/dates.sql: Language not supported
Deploying astronomer-cosmos with
| Latest commit: | 4c49f42 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://1f1295ed.astronomer-cosmos.pages.dev |
| Branch Preview URL: | https://issue-1630-simple-dbt-packag.astronomer-cosmos.pages.dev |
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #1669 +/- ##
==========================================
- Coverage 97.43% 97.07% -0.37%
==========================================
Files 80 80
Lines 4991 4991
==========================================
- Hits 4863 4845 -18
- Misses 128 146 +18
☔ View full report in Codecov by Sentry.
pankajastro approved these changes Apr 16, 2025
tatiana added a commit that referenced this pull request Apr 16, 2025

…` during DAG parsing (#1668)

Support running `dbt deps` incrementally on top of a pre-defined `dbt_packages` directory during DAG parsing. This was a use case requested by an Astro customer.

Before this change, Cosmos supported two types of configuration:

* If users chose `RenderConfig.dbt_deps=False` or `ProjectConfig.install_dbt_deps=False`, Cosmos would create a symbolic link to the user's pre-defined `dbt_packages` (background: #488, #600, #730).
* If users chose `RenderConfig.dbt_deps=True` or `ProjectConfig.install_dbt_deps=True` (default), Cosmos would ignore any user-predefined `dbt_packages` and run `dbt deps` from scratch in a temporary folder.

An Astronomer customer requested the ability to reuse the initially defined `dbt_packages` directory and run `dbt deps` on top of it (incrementally).

We do not run dbt commands directly in the original dbt project folder with Cosmos because some users use read-only filesystems (#414). We also decided to use symbolic links instead of copying the directory due to performance issues (#488).

Since we did not want to introduce a breaking change in a minor Cosmos release by changing the existing Cosmos 1.x behaviour to meet this new use case, this PR supports:

* Copying the dbt deps related files (dbt packages folder and symbolic link) to the Cosmos temporary folder; and
* Running `dbt deps`.

To avoid a breaking change, users must opt into this behaviour by either:

- Changing individual `DbtDag` or `DbtTaskGroup` instances, using `ProjectConfig.copy_dbt_packages=True` (new configuration) and `RenderConfig.dbt_deps=True` or `ProjectConfig.install_dbt_deps=True`; or
- Changing the behaviour globally, by setting the Airflow configuration either via the environment variable `AIRFLOW__COSMOS__DEFAULT_COPY_DBT_PACKAGES_VALUE=True` or via the `airflow.cfg`:

```
[cosmos]
default_copy_dbt_packages_value=True
```

The following two missing parts are being added as part of a separate PR:

- Mimic this behaviour during task execution;
- Update documentation to be representative of both changes.

Depends on #1669
Related to: #1630
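The global opt-in described above can be sketched in a few lines. This is a minimal illustration, not Cosmos or Airflow code: the helper names `airflow_env_var`, `parse_bool` and `default_copy_dbt_packages` are hypothetical, and the `False` default is an assumption based on the commit message saying the behaviour is opt-in. It relies on Airflow's documented `AIRFLOW__{SECTION}__{KEY}` convention, under which an environment variable overrides the matching `airflow.cfg` entry.

```python
import configparser
import os
from typing import Mapping, Optional

SECTION = "cosmos"
KEY = "default_copy_dbt_packages_value"


def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name Airflow maps to [section] key."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


def parse_bool(raw: str) -> bool:
    """Interpret common truthy spellings; anything else counts as False."""
    return raw.strip().lower() in {"true", "t", "1"}


def default_copy_dbt_packages(
    cfg_text: str = "", environ: Optional[Mapping[str, str]] = None
) -> bool:
    """Resolve the global default (hypothetical helper, not the Cosmos API)."""
    environ = os.environ if environ is None else environ
    env_name = airflow_env_var(SECTION, KEY)
    if env_name in environ:
        # The environment variable takes precedence over airflow.cfg.
        return parse_bool(environ[env_name])
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    raw = parser.get(SECTION, KEY, fallback=None)
    # Assumed default: disabled unless the user opts in.
    return parse_bool(raw) if raw is not None else False


if __name__ == "__main__":
    cfg = "[cosmos]\ndefault_copy_dbt_packages_value=True\n"
    print(airflow_env_var(SECTION, KEY))
    print(default_copy_dbt_packages(cfg, {}))
```

Passing `environ` explicitly keeps the sketch easy to exercise without touching the real process environment.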
tatiana added a commit that referenced this pull request Apr 16, 2025

…` during task execution (#1670)

Support running `dbt deps` incrementally on top of a pre-defined `dbt_packages` directory during task execution. This was a use case requested by an Astro customer.

Before this change, Cosmos supported two types of configuration:

* If users chose `operator_args={"install_deps": False}` or `ProjectConfig.install_dbt_deps=False`, Cosmos would create a symbolic link to the user's pre-defined `dbt_packages` (background: #488, #600, #730).
* If users chose `operator_args={"install_deps": True}` or `ProjectConfig.install_dbt_deps=True` (default), Cosmos would ignore any user-predefined `dbt_packages` and run `dbt deps` from scratch in a temporary folder.

An Astronomer customer requested the ability to reuse the initially defined `dbt_packages` directory and run `dbt deps` on top of it (incrementally).

# Implementation

We do not run dbt commands directly in the original dbt project folder with Cosmos because some users use read-only filesystems (#414). We also decided to use symbolic links instead of copying the directory due to performance issues (#488).

Since we did not want to introduce a breaking change in a minor Cosmos release by changing the existing Cosmos 1.x behaviour to meet this new use case, this PR supports:

* Copying the dbt deps related files (dbt packages folder and symbolic link) to the Cosmos temporary folder; and
* Running `dbt deps`.

To avoid a breaking change, users must opt into this behaviour by keeping dependency installation enabled (`operator_args={"install_deps": True}` or `ProjectConfig.install_dbt_deps=True`) and enabling package copying in one of the following ways:

- Changing the operator to receive the argument `copy_dbt_packages=True`
- Changing individual `DbtDag` or `DbtTaskGroup` instances to use the new configuration `ProjectConfig.copy_dbt_packages=True`
- Changing the behaviour globally, by setting the Airflow configuration either via the environment variable `AIRFLOW__COSMOS__DEFAULT_COPY_DBT_PACKAGES_VALUE=True` or via the `airflow.cfg`:

```
[cosmos]
default_copy_dbt_packages_value=True
```

# How this was tested

To validate the end-to-end behaviour, we ran the following DAG from `dev/dags`:

```
airflow dags test dbt_deps_example
```

# Related tickets

This is a follow-up to #1668 and #1669. I'll make a follow-up PR covering the documentation.

Closes: #1630
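The three behaviours described in this commit message (symbolic link, fresh install, copy then install) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the Cosmos implementation: `stage_dbt_packages` and its return labels are hypothetical names, and the sketch only prepares the temporary folder rather than invoking `dbt deps`.

```python
import shutil
import tempfile
from pathlib import Path

DBT_PACKAGES = "dbt_packages"


def stage_dbt_packages(
    project_dir: Path, tmp_dir: Path, install_deps: bool, copy_dbt_packages: bool
) -> str:
    """Prepare tmp_dir/dbt_packages and report which strategy was used.

    Hypothetical helper: the original project folder is never written to,
    which matters when it lives on a read-only filesystem.
    """
    source = project_dir / DBT_PACKAGES
    target = tmp_dir / DBT_PACKAGES
    tmp_dir.mkdir(parents=True, exist_ok=True)

    if install_deps and copy_dbt_packages and source.is_dir():
        # Real copy: a later `dbt deps` only has to fetch what is missing.
        shutil.copytree(source, target, symlinks=True)
        return "copy"
    if install_deps:
        # Nothing staged: `dbt deps` would install every package from scratch.
        return "scratch"
    if source.is_dir():
        # Cheap link to the pre-installed packages; `dbt deps` is not run.
        target.symlink_to(source, target_is_directory=True)
        return "symlink"
    return "missing"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        project = Path(root) / "project"
        (project / DBT_PACKAGES / "dbt_date").mkdir(parents=True)
        mode = stage_dbt_packages(project, Path(root) / "tmp", True, True)
        print(mode)
```

Copying costs more than linking, which is consistent with the commit message keeping it opt-in rather than changing the existing default.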
Merged
tatiana added a commit that referenced this pull request May 1, 2025

Features

* Airflow 3 support
* Support running ``dbt deps`` incrementally to pre-defined ``dbt_packages`` by @tatiana in #1668 and #1670
* Add ``DuckDB`` profile mapping by @prithvijitguha and @pankajastro in #1553
* Implement DBT exposure selector by @ghjklw in #1717

Bug Fixes

* Fix ``test_indirect_selection`` flag to be propagated in case of ``TestBehavior.BUILD`` by @corsettigyg in #1663
* Fix ``select`` clause in the case of detached tests by @anyapriya in #1680
* Operator argument fixes by @johnhoran in #1648

Airflow 3 Support

* Support rendering DbtDag in Airflow 3 by @tatiana and @ashb in #1657
* Refactor Rendered Task Instance Fields (RTIF) handling for Airflow 2.x and 3.x by @pankajkoti in #1661
* Run cosmos operator in Airflow 3 by @pankajastro in #1642
* Fix ``python_virtualenv.prepare_env`` top-level import for Airflow 3 by @pankajkoti in #1678
* Fix Variable not found issue in Airflow 3 by @tatiana in #1684
* Disable CosmosPlugin on Airflow 3 setup by @pankajkoti in #1692, #1698
* Use ``schedule`` param in example DAGs instead of the 2.10 deprecated and 3.0 removed ``schedule_interval`` by @pankajkoti in #1701
* Ensure ``virtualenv_dir`` path exists by @pankajkoti in #1724
* Support emitting Assets with Airflow 3 by @tatiana in #1713
* Add docs on Airflow 3 compatibility by @pankajkoti and @tatiana in #1731
* Introduce, test and document asset/dataset breaking change by @tatiana in #1672
* Improve dataset/asset driven scheduling documentation by @tatiana in #1729

Enhancements

* Allow multiple callbacks by @corsettigyg in #1693
* Refactor kubernetes warning callback handling by @canbekley in #1681

Documentation

* Add documentation related to ``copy_dbt_packages`` by @tatiana in #1671
* Make wording and command consistent in the contributing doc by @pankajkoti in #1697
* Add MonteCarlo callback example for importing dbt artifacts by @corsettigyg in #1695
* Change async feature to be non-experimental by @tatiana in #1732

Others

* Add sample ``dbt_packages`` to validate incremental ``dbt deps`` by @tatiana in #1669
* Add kubernetes execution mode example in Airflow 3 by @pankajastro in #1667
* Check only major version until Airflow 3 stable release by @pankajastro in #1665
* Install Airflow from main branch by @pankajastro in #1660
* Add dev tool for Airflow 3 by @pankajastro and @tatiana in #1627
* Improve Airflow 3 tooling by @pankajastro in #1656
* Skip associating ``openlineage_events_completes`` to ``ti`` in Airflow 3 by @pankajkoti in #1662
* Add .gitignore file for the scripts/airflow3 directory by @pankajkoti in #1658
* Remove ``original_jaffle_shop`` dbt project by @pankajkoti in #1676
* Fix or ignore type check error by @pankajastro in #1687
* Run virtualenv example with Airflow 3 tooling by @pankajastro in #1686
* Enable running setup/teardown tasks with Async execution DAG with Airflow 3 tooling by @pankajastro in #1696
* Enable integration tests for the DuckDB adapter by @pankajastro in #1699
* Add Airflow 3 tests matrix entries in CI by @pankajkoti in #1646
* Use a different way to get tasks count for asserting test_perf_dag by @pankajkoti in #1714
* Reinstall Airflow 3 dependency on ``pydantic>=2.11`` for dbt adapter versions 1.6 & 1.9 by @pankajkoti in #1715
* Fix outdated ``echo`` in Airflow 3 tooling script #1700
* Add files not needed for git tracking to .gitignore by @pankajkoti in #1723
* Use latest minor versions for dbt adapters to get in compatibility fixes by @pankajkoti in #1719
* Fix Airflow 3 tests raising generate_run_id() takes 0 positional arguments by @tatiana in #1725
* Fix dataset tests failing in Airflow 3 by @tatiana in #1716
* Enable example DAGs to run in CI that were disabled in PR #1646 by @pankajkoti in #1726
* Pre-commit updates: #1666, #1653, #1641, #1682, #1720

Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
Co-authored-by: Pankaj Singh <98807258+pankajastro@users.noreply.github.com>
The files in `dbt_packages/dbt_date` were automatically generated by running:

From within the `dev/dags/dbt/simple` dbt project folder.

Dependency for #1668
Related to: #1630