Disable dbt static parser during Airflow task execution using dbt runner #1760
Codecov Report: All modified and coverable lines are covered by tests ✅

Coverage diff, main vs. #1760:
| Metric | main | #1760 | +/- |
| Coverage | 97.71% | 97.71% | |
| Files | 84 | 84 | |
| Lines | 5250 | 5252 | +2 |
| Hits | 5130 | 5132 | +2 |
| Misses | 120 | 120 | |
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.11.8 → v0.11.9](astral-sh/ruff-pre-commit@v0.11.8...v0.11.9)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…ing local directory (#1740)

Ensure the remote target directory is created when copying files to a local directory.

When configuring a remote target directory that points to a local path while using `AIRFLOW_ASYNC`, like so:

```bash
AIRFLOW__COSMOS__REMOTE_TARGET_PATH=/usr/local/airflow/cosmos
AIRFLOW__COSMOS__REMOTE_TARGET_PATH_CONN_ID=file_default
```

we might face this issue:

```bash
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/airflow/cosmos/simple_dag_async__dbt_async/run/jaffle_shop/models/example/my_second_dbt_model.sql'
```

Closes #1739

Co-authored-by: Giovanni Corsetti <155465603+corsettigyg@users.noreply.github.com>
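The `FileNotFoundError` above is what a plain file copy raises when the destination's parent directories do not exist yet. A minimal sketch of the general fix, assuming a purely local copy, follows; `copy_with_parents` is a hypothetical helper for illustration and is not the function Cosmos uses.

```python
import shutil
from pathlib import Path


def copy_with_parents(source: Path, destination: Path) -> Path:
    """Copy a file, creating any missing parent directories of the destination first.

    Hypothetical helper: it only illustrates the idea of creating the
    target directory tree before copying, not Cosmos's actual code.
    """
    # Without this call, copying into a non-existent directory raises
    # FileNotFoundError, as in the traceback quoted above.
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)
    return destination
```

`exist_ok=True` keeps the call idempotent, so copying several files into the same directory does not fail on the second file.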
The feature introduced in #1670 (Support running `dbt deps` incrementally to pre-defined `dbt_packages` during task execution) did not work as expected if users had defined a custom path for `packages-install-path`. It only worked if the default (`dbt_packages`) was being used. This PR aims to solve the issue.
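To illustrate the distinction between the default and a custom install path, here is a small sketch that resolves the packages directory from an already-parsed `dbt_project.yml`. It assumes the project file has been loaded into a dictionary; `resolve_packages_dir` and `DEFAULT_PACKAGES_DIR` are hypothetical names, not part of Cosmos or dbt.

```python
from pathlib import Path

# dbt installs packages into "dbt_packages" unless the project overrides it.
DEFAULT_PACKAGES_DIR = "dbt_packages"


def resolve_packages_dir(project_dir: Path, dbt_project: dict) -> Path:
    """Return the directory where `dbt deps` would install packages.

    `dbt_project` is assumed to be the parsed content of dbt_project.yml.
    Hypothetical helper for illustration only.
    """
    install_path = dbt_project.get("packages-install-path") or DEFAULT_PACKAGES_DIR
    return project_dir / install_path
```

Hard-coding `dbt_packages` instead of reading the key is the kind of assumption that would break for projects with a custom path.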
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.11.9 → v0.11.10](astral-sh/ruff-pre-commit@v0.11.9...v0.11.10)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…DEBUG` (#1764)

Recently, there have been some concerns that Cosmos may modify the `packages.yml` content, leading to errors. If users set `AIRFLOW__LOGGING__LOGGING_LEVEL=DEBUG`, they should now be able to confirm the content of the file.

Example of log output:

```
[2025-05-12T10:52:38.183+0100] {local.py:481} DEBUG - Checking for the packages.yml dependencies file.
[2025-05-12T10:52:38.184+0100] {local.py:484} DEBUG - Contents of the </var/folders/td/522y78v91d1f5wgh67mj3p0m0000gn/T/tmp_4q53rv2/packages.yml> dependencies file:
packages:
  - package: dbt-labs/dbt_utils
    version: "1.1.1"
```
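A rough sketch of how such debug output could be produced with the standard `logging` module is shown below. The function name `log_packages_file` and the logger name are assumptions made for illustration; the message wording mirrors the log sample above, but this is not the code from Cosmos's `local.py`.

```python
import logging
from pathlib import Path

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("cosmos_sketch")


def log_packages_file(project_dir: Path) -> bool:
    """Log the contents of packages.yml at DEBUG level, if the file exists.

    Returns True when the file was found and logged, False otherwise.
    """
    packages_file = project_dir / "packages.yml"
    logger.debug("Checking for the packages.yml dependencies file.")
    if not packages_file.is_file():
        return False
    # Lazy %-style formatting avoids building the message when DEBUG is off.
    logger.debug(
        "Contents of the <%s> dependencies file:\n%s",
        packages_file,
        packages_file.read_text(),
    )
    return True
```

Because the messages are emitted at DEBUG level, they stay silent unless the logging level is lowered, which matches the opt-in behaviour described above.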
@pankajkoti it's great you found this alternative to setting SUBPROCESS as the standard InvocationMode during task execution. I'm glad we have a path to avoid the hanging behaviour some customers were facing.
Would it be worth adding a follow-up ticket so we can review, in the future, whether there are alternatives to this? According to dbt, enabling this setting can bring significant performance improvements, and we may be able to find another solution to this problem later.
Yes @tatiana. Although I have not found any significant improvement from using static parsing, nor a degradation upon disabling it :), I have logged a follow-up ticket, #1771, to investigate that.
…ner (#1760)

This PR adds support for conditionally applying the `--no-static-parser` dbt flag in Cosmos operators, ensuring it is included only when InvocationMode.DBT_RUNNER is used during task execution.

**Static Parser Issue**: User reports and investigation revealed that, starting with Cosmos 1.9.0 (see PR #1484), using dbtRunner for both DAG parsing and task execution in Airflow 2.x can cause task hangs. This is due to dbt's static parser interacting poorly with Cosmos's use of temporary project directories, especially when the temp paths differ between parsing and execution.

**Workaround**: Adding the `--no-static-parser` flag when invoking dbtRunner during task execution avoids these hangs and ensures reliable operation. This flag is not needed (and should not be added) when using the subprocess invocation mode.

closes: #1751
related: #1750

Co-authored-by: Tatiana Al-Chueyr <tatiana.alchueyr@gmail.com>

(cherry picked from commit 15a8d91)
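The conditional behaviour described above can be sketched as follows. The `InvocationMode` enum here is a local stand-in defined for the example, and `build_dbt_flags` is a hypothetical helper; neither is copied from the Cosmos code base, and the real operators may assemble their command differently.

```python
from enum import Enum
from typing import List, Sequence


class InvocationMode(Enum):
    """Local stand-in for the Cosmos invocation modes, for illustration only."""

    SUBPROCESS = "subprocess"
    DBT_RUNNER = "dbt_runner"


NO_STATIC_PARSER_FLAG = "--no-static-parser"


def build_dbt_flags(invocation_mode: InvocationMode, base_flags: Sequence[str]) -> List[str]:
    """Return the dbt flags, adding --no-static-parser only for the dbt runner.

    Hypothetical helper: the flag is appended when dbt runs in-process and is
    left out for the subprocess mode, as the PR description states.
    """
    flags = list(base_flags)  # copy so the caller's sequence is not mutated
    if invocation_mode is InvocationMode.DBT_RUNNER and NO_STATIC_PARSER_FLAG not in flags:
        flags.append(NO_STATIC_PARSER_FLAG)
    return flags
```

Keeping the decision in one small function makes it easy to test that the subprocess path is left unchanged.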
Bug Fixes

* Fix ``full_refresh`` parameter in ``AIRFLOW_ASYNC`` ``ExecutionConfig`` mode by @tuantran0910 in #1738
* Fix dbt ls invocation method log message by @tatiana and @dstandish in #1749
* Ensure remote target directory is created when copying files when using local directory by @tuantran0910 and @corsettigyg in #1740
* Support custom ``packages-install-path`` by @tatiana in #1768
* Disable dbt static parser during Airflow task execution using dbt runner by @pankajkoti and @tatiana in #1760
* Fix ``ExecutionMode.LOCAL`` to leverage ``ProjectConfig.manifest_path`` by @tatiana in #1772
* Refactor ``AIRFLOW_ASYNC`` so that the path in the remote object store is specific per DAG run by @tuantran0910 in #1741
* Optimise memory usage with optional explicit imports by @pankajkoti and @tatiana in #1769

Documentation

* Fix documentation rendering for ``use_dataset_airflow3_uri_standard`` by @pankajastro in #1742
* Correct custom callback example by @walter9388 in #1747

Others

* Re-enable integration tests durations to troubleshoot performance degradation by @tatiana in #1735
* Run listener tests for Airflow 3 by @pankajastro in #1743
* Add Airflow 3 db files to ignore from git tracking by @pankajkoti in #1755
* Log contents of ``packages.yml`` when ``AIRFLOW__LOGGING__LOGGING_LEVEL=DEBUG`` by @tatiana in #1764
* Fix Airflow dependencies in the CI by @tatiana in #1773
* Pre-commit updates: #1744, #1765, #1770
(cherry picked from commit 430be00)