Disable dbt static parser during Airflow task execution using dbt runner #1760

Merged
tatiana merged 10 commits into main from disable-static-parser-dbtrunner-task-execution on May 20, 2025

Conversation

Contributor

@pankajkoti pankajkoti commented May 9, 2025

This PR adds support for conditionally applying the --no-static-parser dbt flag in Cosmos operators, ensuring it is included only when InvocationMode.DBT_RUNNER is used during task execution.

Rationale

Static Parser Issue: User reports and investigation revealed that, starting with Cosmos 1.9.0 (see PR #1484), using dbtRunner for both DAG parsing and task execution in Airflow 2.x can cause task hangs. This is due to dbt's static parser interacting poorly with Cosmos's use of temporary project directories, especially when the temp paths differ between parsing and execution.
Workaround: Adding the --no-static-parser flag when invoking dbtRunner during task execution avoids these hangs and ensures reliable operation. This flag is not needed (and should not be added) when using the subprocess invocation mode.
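
The conditional behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual Cosmos code: the `InvocationMode` enum is a local stand-in defined only for this example, and `build_dbt_flags` is a hypothetical helper name.

```python
from enum import Enum


class InvocationMode(Enum):
    """Local stand-in for the Cosmos enum, defined here only for illustration."""

    SUBPROCESS = "subprocess"
    DBT_RUNNER = "dbt_runner"


def build_dbt_flags(base_flags: list[str], invocation_mode: InvocationMode) -> list[str]:
    """Hypothetical helper: add --no-static-parser only when dbtRunner is used."""
    flags = list(base_flags)  # copy so the caller's list is not mutated
    if invocation_mode is InvocationMode.DBT_RUNNER and "--no-static-parser" not in flags:
        flags.append("--no-static-parser")
    return flags
```

With `InvocationMode.SUBPROCESS` the flags come back unchanged, which matches the note that the flag should not be added in that mode.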

closes: #1751
related: #1750


Co-authored-by: Tatiana Al-Chueyr <tatiana.alchueyr@gmail.com>

codecov Bot commented May 9, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.71%. Comparing base (81e248a) to head (b00ae62).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1760   +/-   ##
=======================================
  Coverage   97.71%   97.71%           
=======================================
  Files          84       84           
  Lines        5250     5252    +2     
=======================================
+ Hits         5130     5132    +2     
  Misses        120      120           

Comment thread cosmos/__init__.py Outdated
pre-commit-ci Bot and others added 6 commits May 20, 2025 15:15
<!--pre-commit.ci start-->
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.11.8 →
v0.11.9](astral-sh/ruff-pre-commit@v0.11.8...v0.11.9)
<!--pre-commit.ci end-->

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…ing local directory (#1740)

Ensure the remote target directory is created when copying files to a
local directory.

When configuring a remote target directory that points to a local path
while using AIRFLOW ASYNC, like so:

```bash
AIRFLOW__COSMOS__REMOTE_TARGET_PATH=/usr/local/airflow/cosmos
AIRFLOW__COSMOS__REMOTE_TARGET_PATH_CONN_ID=file_default
```

We might face this issue:

```bash
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/airflow/cosmos/simple_dag_async__dbt_async/run/jaffle_shop/models/example/my_second_dbt_model.sql'
```


Closes #1739
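
One way to avoid this kind of `FileNotFoundError` is to create the destination's parent directories before copying. The sketch below uses only the Python standard library; `copy_with_parents` is a hypothetical helper name, and this is not necessarily how the fix in #1740 is implemented.

```python
import shutil
from pathlib import Path


def copy_with_parents(source: Path, destination: Path) -> Path:
    """Hypothetical helper: copy a file, creating missing parent directories first."""
    # parents=True creates every missing intermediate directory;
    # exist_ok=True makes the call safe when the directory already exists.
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)
    return destination
```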

Co-authored-by: Giovanni Corsetti <155465603+corsettigyg@users.noreply.github.com>
The feature introduced in #1670 (Support running `dbt deps`
incrementally to pre-defined `dbt_packages` during task execution) did
not work as expected if users had defined a custom path for
`packages-install-path`. It only worked if the default (`dbt_packages`)
was being used. This PR aims to solve the issue.
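
As an illustration of the idea, the packages directory can be resolved from the project configuration with a fallback to the default. This sketch assumes the parsed `dbt_project.yml` is already available as a dictionary; `resolve_packages_dir` is a hypothetical name and does not reflect the actual Cosmos implementation.

```python
from pathlib import Path

DEFAULT_PACKAGES_DIR = "dbt_packages"


def resolve_packages_dir(project_dir: Path, project_config: dict) -> Path:
    """Hypothetical helper: honour a custom packages-install-path, else use the default."""
    install_path = project_config.get("packages-install-path") or DEFAULT_PACKAGES_DIR
    return project_dir / install_path
```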
<!--pre-commit.ci start-->
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.11.9 →
v0.11.10](astral-sh/ruff-pre-commit@v0.11.9...v0.11.10)
<!--pre-commit.ci end-->

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…DEBUG` (#1764)

Recently, there have been some concerns that Cosmos may modify the
`packages.yml` content, leading to errors.

If users set `AIRFLOW__LOGGING__LOGGING_LEVEL=DEBUG`, they should now be
able to confirm the content of the file.

Example of log output:

```
[2025-05-12T10:52:38.183+0100] {local.py:481} DEBUG - Checking for the packages.yml dependencies file.
[2025-05-12T10:52:38.184+0100] {local.py:484} DEBUG - Contents of the </var/folders/td/522y78v91d1f5wgh67mj3p0m0000gn/T/tmp_4q53rv2/packages.yml> dependencies file:
packages:
  - package: dbt-labs/dbt_utils
    version: "1.1.1"

```
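
A rough sketch of how such debug logging could be produced with the standard `logging` module is shown below. The function name `log_dependencies_file` is hypothetical and the message wording only mirrors the example output above; the actual code in `local.py` may differ.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def log_dependencies_file(project_dir: Path, filename: str = "packages.yml") -> bool:
    """Hypothetical helper: log the dependencies file contents at DEBUG level."""
    path = project_dir / filename
    logger.debug("Checking for the %s dependencies file.", filename)
    if not path.is_file():
        return False
    # Only emitted when the logger is configured at DEBUG level.
    logger.debug("Contents of the <%s> dependencies file:\n%s", path, path.read_text())
    return True
```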
Comment thread cosmos/operators/local.py
@pankajkoti pankajkoti marked this pull request as ready for review May 20, 2025 10:33
@dosubot dosubot Bot added the size:M This PR changes 30-99 lines, ignoring generated files. label May 20, 2025
@pankajkoti pankajkoti requested review from pankajastro and tatiana May 20, 2025 10:33
@dosubot dosubot Bot added area:execution Related to the execution environment/mode, like Docker, Kubernetes, Local, VirtualEnv, etc area:parsing Related to parsing DAG/DBT improvement, issues, or fixes execution:local Related to Local execution environment parsing:custom Related to custom parsing, like custom DAG parsing, custom DBT parsing, etc labels May 20, 2025
Collaborator

@tatiana tatiana left a comment

@pankajkoti it's great that you found this alternative to setting SUBPROCESS as the standard InvocationMode during task execution. I'm glad we have a path to avoid the hanging behaviour some customers were facing.

Would it be worth adding a follow-up ticket so we can revisit this in the future and look for alternatives? According to dbt, enabling the static parser can bring significant performance improvements, and we may be able to find a different solution to this problem later.

@dosubot dosubot Bot added the lgtm This PR has been approved by a maintainer label May 20, 2025
Contributor Author

pankajkoti commented May 20, 2025

> Would it be worth adding a follow-up ticket so we can revisit this in the future and look for alternatives? According to dbt, enabling the static parser can bring significant performance improvements, and we may be able to find a different solution to this problem later.

Yes @tatiana. Although I have not seen any significant improvement from static parsing, nor any degradation from disabling it :), I have logged follow-up ticket #1771 to investigate.

@tatiana tatiana merged commit 15a8d91 into main May 20, 2025
177 of 178 checks passed
@tatiana tatiana deleted the disable-static-parser-dbtrunner-task-execution branch May 20, 2025 13:56
pankajkoti added a commit that referenced this pull request May 21, 2025
…ner (#1760)

This PR adds support for conditionally applying the `--no-static-parser`
dbt flag in Cosmos operators, ensuring it is included only when
InvocationMode.DBT_RUNNER is used during task execution.

**Static Parser Issue**: User reports and investigation revealed that,
starting with Cosmos 1.9.0 (see PR #1484), using dbtRunner for both DAG
parsing and task execution in Airflow 2.x can cause task hangs. This is
due to dbt's static parser interacting poorly with Cosmos's use of
temporary project directories, especially when the temp paths differ
between parsing and execution.
**Workaround**: Adding the `--no-static-parser` flag when invoking
dbtRunner during task execution avoids these hangs and ensures reliable
operation. This flag is not needed (and should not be added) when using
the subprocess invocation mode.

closes: #1751
related: #1750

Co-authored-by: Tatiana Al-Chueyr <tatiana.alchueyr@gmail.com>
(cherry picked from commit 15a8d91)
pankajkoti added a commit that referenced this pull request May 21, 2025
Bug Fixes

* Fix ``full_refresh`` parameter in ``AIRFLOW_ASYNC``
``ExecutionConfig`` mode by @tuantran0910 in #1738
* Fix dbt ls invocation method log message by @tatiana and @dstandish in
#1749
* Ensure remote target directory is created when copying files when
using local directory by @tuantran0910 and @corsettigyg in #1740
* Support custom ``packages-install-path`` by @tatiana in #1768
* Disable dbt static parser during Airflow task execution using dbt
runner by @pankajkoti and @tatiana in #1760
* Fix ``ExecutionMode.LOCAL`` to leverage
``ProjectConfig.manifest_path`` by @tatiana in #1772
* Refactor ``AIRFLOW_ASYNC`` so that the path in the remote object store
is specific per DAG run by @tuantran0910 in #1741
* Optimise memory usage with optional explicit imports by @pankajkoti
and @tatiana in #1769

Documentation

* Fix documentation rendering for ``use_dataset_airflow3_uri_standard``
by @pankajastro in #1742
* Correct custom callback example by @walter9388 in #1747

Others

* Re-enable integration tests durations to troubleshoot performance
degradation by @tatiana in #1735
* Run listener tests for Airflow 3 by @pankajastro in #1743
* Add Airflow 3 db files to ignore from git tracking by @pankajkoti in
#1755
* Log contents of ``packages.yml`` when
``AIRFLOW__LOGGING__LOGGING_LEVEL=DEBUG`` by @tatiana in #1764
* Fix Airflow dependencies in the CI by @tatiana in #1773
* Pre-commit updates: #1744, #1765, #1770
pankajkoti added a commit that referenced this pull request May 21, 2025
---------

(cherry picked from commit 430be00)
pankajkoti added a commit that referenced this pull request May 26, 2026
Reference #1751 (original report of dbt_runner task hangs) and #1760
(the fix that introduced the --no-static-parser injection) so readers
can trace when and how the workaround was reached.

Labels

area:execution: Related to the execution environment/mode, like Docker, Kubernetes, Local, VirtualEnv, etc
area:parsing: Related to parsing DAG/DBT improvement, issues, or fixes
execution:local: Related to Local execution environment
lgtm: This PR has been approved by a maintainer
parsing:custom: Related to custom parsing, like custom DAG parsing, custom DBT parsing, etc
size:M: This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Task execution hanging on Airflow 2 when using dbt runner in DAG processing

3 participants