
Add sample dbt_packages to validate incremental dbt deps#1669

Merged
tatiana merged 1 commit into
mainfrom
issue-1630-simple-dbt-packages
Apr 16, 2025

Conversation

@tatiana
Collaborator

@tatiana tatiana commented Apr 16, 2025

The files in `dbt_packages/dbt_date` were automatically generated by running:

dbt deps

from within the `dev/dags/dbt/simple` dbt project folder.

Dependency for #1668

Related to: #1630

Copilot AI review requested due to automatic review settings April 16, 2025 08:56
@dosubot dosubot Bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Apr 16, 2025
@netlify

netlify Bot commented Apr 16, 2025

Deploy Preview for sunny-pastelito-5ecb04 canceled.

Name Link
🔨 Latest commit 4c49f42
🔍 Latest deploy log https://app.netlify.com/sites/sunny-pastelito-5ecb04/deploys/67ff70a3a407d70008e49f20

@dosubot dosubot Bot added area:dependencies Related to dependencies, like Python packages, library versions, etc area:testing Related to testing, like unit tests, integration tests, etc dbt:deps Primarily related to dbt deps command or functionality labels Apr 16, 2025
Contributor

Copilot AI left a comment

Copilot reviewed 65 out of 76 changed files in this pull request and generated 1 comment.

Files not reviewed (11)
  • dev/dags/dbt/simple/dbt_packages/dbt_date/.gitignore: Language not supported
  • dev/dags/dbt/simple/dbt_packages/dbt_date/LICENSE: Language not supported
  • dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/docker-start.sh: Language not supported
  • dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/docker-stop.sh: Language not supported
  • dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/docker/hive-site.xml: Language not supported
  • dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/docker/spark-defaults.conf: Language not supported
  • dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/docker/trino/catalog/memory.properties: Language not supported
  • dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/macros/expression_is_true.sql: Language not supported
  • dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/macros/get_custom_schema.sql: Language not supported
  • dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/macros/get_test_dates.sql: Language not supported
  • dev/dags/dbt/simple/dbt_packages/dbt_date/integration_tests/models/dates.sql: Language not supported

Comment thread dev/dags/dbt/simple/dbt_packages/dbt_date/.circleci/config.yml
@cloudflare-workers-and-pages

Deploying astronomer-cosmos with Cloudflare Pages

Latest commit: 4c49f42
Status: ✅  Deploy successful!
Preview URL: https://1f1295ed.astronomer-cosmos.pages.dev
Branch Preview URL: https://issue-1630-simple-dbt-packag.astronomer-cosmos.pages.dev


@codecov

codecov Bot commented Apr 16, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.07%. Comparing base (f7fc8ef) to head (4c49f42).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1669      +/-   ##
==========================================
- Coverage   97.43%   97.07%   -0.37%     
==========================================
  Files          80       80              
  Lines        4991     4991              
==========================================
- Hits         4863     4845      -18     
- Misses        128      146      +18     


@tatiana tatiana merged commit 8704e28 into main Apr 16, 2025
71 of 72 checks passed
@tatiana tatiana deleted the issue-1630-simple-dbt-packages branch April 16, 2025 09:54
tatiana added a commit that referenced this pull request Apr 16, 2025
…` during DAG parsing (#1668)

Support running `dbt deps` incrementally on top of a pre-populated
`dbt_packages` folder during DAG parsing. This use case was requested by
an Astro customer.

Before this change, Cosmos supported two types of configuration:
* If users chose `RenderConfig.dbt_deps=False` or
`ProjectConfig.install_dbt_deps=False`, Cosmos would create a symbolic
link to the user's pre-defined `dbt_packages` (background: #488, #600,
#730)
* If users chose `RenderConfig.dbt_deps=True` or
`ProjectConfig.install_dbt_deps=True` (default), Cosmos would ignore any
user-predefined `dbt_packages` and run `dbt deps` from scratch
in a temporary folder.

An Astronomer customer requested the ability to reuse the initially
defined `dbt_packages` directory and run `dbt deps` on top of it
(incrementally).

We do not run dbt commands directly in the original dbt project folder
with Cosmos because some users use read-only filesystems (#414). We also
decided to use symbolic links instead of copying the directory due to
performance issues (#488). Since we did not want to introduce a breaking
change in a minor Cosmos release by changing the existing Cosmos 1.x
behaviour to meet this new use case, this PR supports:
* Copying the dbt deps related files (dbt packages folder and symbolic
link) to the Cosmos temporary folder; and
* Running `dbt deps`.
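
The copy step described above could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the Cosmos implementation: the helper name `copy_dbt_packages` and its parameters are hypothetical, and the sketch assumes the packages live in a folder named `dbt_packages` directly under the project directory.

```python
import shutil
from pathlib import Path
from typing import Optional


def copy_dbt_packages(
    project_dir: Path, tmp_dir: Path, folder_name: str = "dbt_packages"
) -> Optional[Path]:
    """Copy a pre-installed packages folder into tmp_dir, if one exists.

    Hypothetical helper for illustration only. Returns the copied folder's
    path, or None when the project has no packages folder to reuse.
    """
    source = project_dir / folder_name
    if not source.exists():
        return None
    target = tmp_dir / folder_name
    # symlinks=True keeps symbolic links found inside the folder as links
    # instead of copying the files they point to.
    shutil.copytree(source, target, symlinks=True)
    return target


# After the copy, `dbt deps` would be run against tmp_dir (not executed here),
# so the original project folder is never written to.
```

Copying rather than linking means `dbt deps` can add or update packages in the temporary folder while the original project folder, which may be read-only, stays untouched.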

To avoid a breaking change, users must opt into this behaviour by
either:
- Changing individual `DbtDag` or `DbtTaskGroup` instances, using
`ProjectConfig.copy_dbt_packages=True` (new configuration) and
`RenderConfig.dbt_deps=True` or `ProjectConfig.install_dbt_deps=True`;
or
- Changing the behaviour globally, by setting the Airflow configuration
either via the environment variable
`AIRFLOW__COSMOS__DEFAULT_COPY_DBT_PACKAGES_VALUE=True` or via the
`airflow.cfg`:
```
[cosmos]
default_copy_dbt_packages_value=True
```
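
The environment variable and the `airflow.cfg` entry name the same option: Airflow reads `AIRFLOW__{SECTION}__{KEY}` as an override for `key` in `[section]`. The sketch below illustrates that mapping with the standard library only; the helper `airflow_env_var_name` is a hypothetical name for illustration, and the option name is the one used in this commit.

```python
import configparser

# The airflow.cfg fragment from the commit message above.
AIRFLOW_CFG_SNIPPET = """
[cosmos]
default_copy_dbt_packages_value=True
"""


def airflow_env_var_name(section: str, key: str) -> str:
    """Build the environment variable name Airflow reads for a config option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


parser = configparser.ConfigParser()
parser.read_string(AIRFLOW_CFG_SNIPPET)

# Parse the option as a boolean, the way a consumer of the setting might.
copy_packages = parser.getboolean("cosmos", "default_copy_dbt_packages_value")
env_name = airflow_env_var_name("cosmos", "default_copy_dbt_packages_value")
```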

The following two missing parts are being added as part of a separate
PR:
- Mimic this behaviour during task execution;
- Update documentation to be representative of both changes.

Depends on #1669 

Related to: #1630
tatiana added a commit that referenced this pull request Apr 16, 2025
…` during task execution (#1670)

Support running `dbt deps` incrementally on top of a pre-populated
`dbt_packages` folder during task execution. This use case was requested
by an Astro customer.

Before this change, Cosmos supported two types of configuration:
* If users chose `operator_args={"install_deps": False}` or
`ProjectConfig.install_dbt_deps=False`, Cosmos would create a symbolic
link to the user's pre-defined `dbt_packages` (background: #488, #600,
#730)
* If users chose `operator_args={"install_deps": True}` or
`ProjectConfig.install_dbt_deps=True` (default), Cosmos would ignore any
user-predefined `dbt_packages` and run `dbt deps` from scratch
in a temporary folder.

An Astronomer customer requested the ability to reuse the initially
defined `dbt_packages` directory and run `dbt deps` on top of it
(incrementally).

# Implementation

We do not run dbt commands directly in the original dbt project folder
with Cosmos because some users use read-only filesystems (#414). We also
decided to use symbolic links instead of copying the directory due to
performance issues (#488). Since we did not want to introduce a breaking
change in a minor Cosmos release by changing the existing Cosmos 1.x
behaviour to meet this new use case, this PR supports:
* Copying the dbt deps related files (dbt packages folder and symbolic
link) to the Cosmos temporary folder; and
* Running `dbt deps`.

To avoid a breaking change, users must opt into this behaviour by
keeping dependency installation enabled
(`operator_args={"install_deps": True}` or
`ProjectConfig.install_dbt_deps=True`) and enabling the copy in one of
the following ways:
- Changing the operator to receive the argument `copy_dbt_packages=True`
- Changing individual `DbtDag` or `DbtTaskGroup` instances to use the
new configuration `ProjectConfig.copy_dbt_packages=True`
- Changing the behaviour globally, by setting the Airflow configuration
either via the environment variable
`AIRFLOW__COSMOS__DEFAULT_COPY_DBT_PACKAGES_VALUE=True` or via the
`airflow.cfg`:
```
[cosmos]
default_copy_dbt_packages_value=True
```
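
Taken together, the two existing behaviours and the new opt-in one can be summarised as a small decision function. This is only a sketch of the logic described in this message: `PackagesStrategy` and `choose_strategy` are hypothetical names, not Cosmos APIs. The message does not say what happens when the copy is enabled while dependency installation is disabled, so the sketch assumes the symbolic-link behaviour is kept in that case.

```python
from enum import Enum


class PackagesStrategy(Enum):
    SYMLINK = "symlink the existing dbt_packages"
    FRESH_INSTALL = "run dbt deps from scratch"
    COPY_THEN_INSTALL = "copy dbt_packages, then run dbt deps"


def choose_strategy(install_deps: bool, copy_dbt_packages: bool = False) -> PackagesStrategy:
    """Pick how dbt_packages is prepared in the temporary folder (illustrative)."""
    if not install_deps:
        # Assumption: without dependency installation the copy flag has no
        # effect and the pre-existing symlink behaviour applies.
        return PackagesStrategy.SYMLINK
    if copy_dbt_packages:
        # New opt-in behaviour: reuse the packages already present, then let
        # `dbt deps` install only what is missing or outdated.
        return PackagesStrategy.COPY_THEN_INSTALL
    # Default behaviour: ignore pre-defined packages and install everything.
    return PackagesStrategy.FRESH_INSTALL
```

Because `copy_dbt_packages` defaults to `False`, callers that do not pass it get exactly the two behaviours that existed before, which is why the change is not breaking.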

# How this was tested

To validate the end-to-end behaviour, we ran the following DAG from
`dev/dags`:
```
airflow dags test dbt_deps_example
```

# Related tickets

This is a follow-up to #1668 and #1669.

I'll make a follow-up PR covering the documentation.

Closes: #1630
@tatiana tatiana mentioned this pull request Apr 16, 2025
tatiana added a commit that referenced this pull request Apr 17, 2025
Rename user-facing global configuration
`default_copy_dbt_packages_value` to `default_copy_dbt_packages`.

This is a follow-up to #1668, #1669 and #1670.
@tatiana tatiana added this to the Cosmos 1.10.0 milestone Apr 17, 2025
tatiana added a commit that referenced this pull request May 1, 2025
Features

* Airflow 3 support
* Support running ``dbt deps`` incrementally to pre-defined
``dbt_packages`` by @tatiana in #1668 and #1670
* Add ``DuckDB`` profile mapping by @prithvijitguha and @pankajastro in
#1553
* Implement DBT exposure selector by @ghjklw in #1717

Bug Fixes

* Fix ``test_indirect_selection`` flag to be propagated in case of
``TestBehavior.BUILD`` by @corsettigyg in #1663
* Fix ``select`` clause in the case of detached tests by @anyapriya in
#1680
* Operator argument fixes by @johnhoran in #1648


Airflow 3 Support

* Support rendering DbtDag in Airflow 3 by @tatiana and @ashb in #1657
* Refactor Rendered Task Instance Fields (RTIF) handling for Airflow 2.x
and 3.x by @pankajkoti in #1661
* Run cosmos operator in Airflow 3 by @pankajastro in #1642
* Fix ``python_virtualenv.prepare_env`` top-level import for Airflow 3
by @pankajkoti in #1678
* Fix Variable not found issue in Airflow 3 by @tatiana in #1684
* Disable CosmosPlugin on Airflow 3 setup by @pankajkoti in #1692, #1698
* Use ``schedule`` param in example DAGs instead of the 2.10 deprecated
and 3.0 removed ``schedule_interval`` by @pankajkoti in #1701
* Ensure ``virtualenv_dir`` path exists by @pankajkoti in #1724
* Support emitting Assets with Airflow 3 by @tatiana in #1713
* Add docs on Airflow 3 compatibility by @pankajkoti and @tatiana in
#1731
* Introduce, test and document asset/dataset breaking change by @tatiana
in #1672
* Improve dataset/asset driven scheduling documentation by @tatiana in
#1729

Enhancements

* Allow multiple callbacks by @corsettigyg in #1693
* Refactor kubernetes warning callback handling by @canbekley in #1681

Documentation

* Add documentation related to ``copy_dbt_packages`` by @tatiana in
#1671
* Make wording and command consistent in the contributing doc by
@pankajkoti in #1697
* Add MonteCarlo callback example for importing dbt artifacts by
@corsettigyg in #1695
* Change async feature to be non-experimental by @tatiana in #1732

Others

* Add sample ``dbt_packages`` to validate incremental ``dbt deps`` by
@tatiana in #1669
* Add kubernetes execution mode example in Airflow 3 by @pankajastro in
#1667
* Check only major version until Airflow 3 stable release by
@pankajastro in #1665
* Install Airflow from main branch by @pankajastro in #1660
* Add dev tool for Airflow 3 by @pankajastro and @tatiana in #1627
* Improve Airflow 3 tooling by @pankajastro in #1656
* Skip associating ``openlineage_events_completes`` to ``ti`` in Airflow
3 by @pankajkoti in #1662
* Add .gitignore file for the scripts/airflow3 directory by @pankajkoti
in #1658
* Remove ``original_jaffle_shop`` dbt project by @pankajkoti in #1676
* Fix or ignore type check error by @pankajastro in #1687
* Run virtualenv example with Airflow 3 tooling by @pankajastro in #1686
* Enable running setup/teardown tasks with Async execution DAG with
Airflow 3 tooling by @pankajastro in #1696
* Enable integration tests for the DuckDB adapter by @pankajastro in
#1699
* Add Airflow 3 tests matrix entries in CI by @pankajkoti in #1646
* Use a different way to get tasks count for asserting test_perf_dag by
@pankajkoti in #1714
* Reinstall Airflow 3 dependency on ``pydantic>=2.11`` for dbt adapter
versions 1.6 & 1.9 by @pankajkoti in #1715
* Fix outdated ``echo`` in Airflow 3 tooling script #1700
* Add files not needed for git tracking to .gitignore by @pankajkoti in
#1723
* Use latest minor versions for dbt adapters to get in compatibility
fixes by @pankajkoti in #1719
* Fix Airflow 3 tests raising generate_run_id() takes 0 positional
arguments by @tatiana in #1725
* Fix dataset tests failing in Airflow 3 by @tatiana in #1716
* Enable example DAGs to run in CI that were disabled in PR #1646 by
@pankajkoti in #1726
* Pre-commit updates: #1666, #1653, #1641, #1682, #1720


Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
Co-authored-by: Pankaj Singh
<98807258+pankajastro@users.noreply.github.com>

---------

Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>

Labels

area:dependencies Related to dependencies, like Python packages, library versions, etc area:testing Related to testing, like unit tests, integration tests, etc dbt:deps Primarily related to dbt deps command or functionality size:XXL This PR changes 1000+ lines, ignoring generated files.


3 participants