Add deferrable support for Watcher Sensor #2084

Merged: pankajastro merged 11 commits into main from async_watch_2059 on Nov 7, 2025

@pankajastro
Contributor

@pankajastro pankajastro commented Nov 5, 2025

This PR adds async/deferrable support to the Watcher sensor.

Benefits

  • Reduces resource usage for long-running watcher tasks.
  • Improves task scheduling and throughput.

Limitations

  • Supports deferrable execution for model or run commands only.
  • Retries still run synchronously; deferral happens only on the first attempt (try number 0).
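
A minimal sketch of the retry limitation above, written as a plain function so it runs without Airflow. The name `choose_path`, the `first_try` default of 0 (taken from the "try no 0" wording in this description), and the returned labels are all illustrative assumptions, not the code merged in this PR.

```python
def choose_path(try_number: int, first_poke_ok: bool, first_try: int = 0) -> str:
    """Pick how a watcher-style sensor proceeds (illustrative only)."""
    if try_number != first_try:
        # Retries stay on the synchronous path, as the limitation states.
        return "sync"
    if first_poke_ok:
        # Nothing left to wait for, so no deferral is needed.
        return "done"
    # First attempt and still waiting: hand off to the triggerer.
    return "defer"
```

For example, `choose_path(0, False)` picks the deferred path, while any later attempt falls back to synchronous execution.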

Future

closes: #2059

[Screenshots: 2025-11-06 at 12:28:27 AM and 2025-11-06 at 2:44:31 PM]

@codecov

codecov Bot commented Nov 6, 2025

Codecov Report

❌ Patch coverage is 98.87640% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 97.81%. Comparing base (a95c81b) to head (288274d).
⚠️ Report is 4 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| cosmos/_triggers/watcher.py | 98.57% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2084      +/-   ##
==========================================
+ Coverage   97.80%   97.81%   +0.01%     
==========================================
  Files          91       92       +1     
  Lines        5871     5948      +77     
==========================================
+ Hits         5742     5818      +76     
- Misses        129      130       +1     

@pankajastro pankajastro marked this pull request as ready for review November 6, 2025 11:06
Copilot AI review requested due to automatic review settings November 6, 2025 11:06
@pankajastro pankajastro marked this pull request as draft November 6, 2025 11:07
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds deferrable (async) execution support to the DbtConsumerWatcherSensor by introducing a new WatcherTrigger class. When the sensor's initial poke returns False, it defers execution to the trigger, which asynchronously polls XCom for model status updates.

  • Implemented WatcherTrigger class for async polling of dbt model status
  • Added execute and execute_complete methods to DbtConsumerWatcherSensor for deferrable behavior
  • Refactored XCom decompression logic into a reusable _parse_compressed_xcom utility function
  • Added comprehensive test coverage for the trigger and sensor deferral behavior
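
The poke, defer, and resume flow summarised above can be sketched with plain `asyncio`, with no Airflow imports. Everything here is a stand-in: `FAKE_XCOM`, `ToyWatcherTrigger`, `ToyWatcherSensor`, and `simulate` are hypothetical names, and the in-memory dict only imitates XCom. It illustrates the pattern, not the `WatcherTrigger` implementation in this PR.

```python
from __future__ import annotations

import asyncio
from typing import Any, AsyncIterator

# Hypothetical in-memory stand-in for XCom (assumption for this sketch).
FAKE_XCOM: dict[str, str] = {}


class ToyWatcherTrigger:
    """Polls a key until a terminal status appears, then emits one event."""

    def __init__(self, key: str, poke_interval: float = 0.01) -> None:
        self.key = key
        self.poke_interval = poke_interval

    async def run(self) -> AsyncIterator[dict[str, Any]]:
        while True:
            status = FAKE_XCOM.get(self.key)
            if status in ("success", "failed"):
                yield {"status": status, "key": self.key}
                return
            # Non-blocking sleep: the event loop can serve other triggers.
            await asyncio.sleep(self.poke_interval)


class ToyWatcherSensor:
    def __init__(self, key: str) -> None:
        self.key = key

    def poke(self) -> bool:
        return FAKE_XCOM.get(self.key) == "success"

    def execute(self) -> ToyWatcherTrigger | None:
        """Return None when already done, else the trigger to defer to."""
        return None if self.poke() else ToyWatcherTrigger(self.key)

    def execute_complete(self, event: dict[str, Any]) -> None:
        if event["status"] != "success":
            raise RuntimeError(f"{event['key']} finished as {event['status']}")


async def simulate(key: str, final_status: str) -> str:
    """Drive the sensor while a fake producer publishes a status later."""
    FAKE_XCOM.pop(key, None)
    sensor = ToyWatcherSensor(key)
    trigger = sensor.execute()
    if trigger is None:
        return "finished without deferring"

    async def producer() -> None:
        await asyncio.sleep(0.03)
        FAKE_XCOM[key] = final_status

    producer_task = asyncio.create_task(producer())
    async for event in trigger.run():
        await producer_task
        sensor.execute_complete(event)
    return "finished after deferring"
```

Running `asyncio.run(simulate("model.a", "success"))` walks through the deferred path, and passing `"failed"` makes `execute_complete` raise, which mirrors how a resumed sensor would surface a failure.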

Reviewed Changes

Copilot reviewed 5 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| cosmos/_triggers/watcher.py | New trigger class for async polling of dbt model status from XCom |
| cosmos/_triggers/__init__.py | Empty init file for the _triggers module |
| cosmos/operators/watcher.py | Added execute/execute_complete methods and refactored to use shared decompression utility |
| tests/_triggers/watcher.py | Comprehensive test coverage for WatcherTrigger async behavior |
| tests/_triggers/__init__.py | Empty init file for test triggers module |
| tests/operators/test_watcher.py | Tests for sensor deferral and execute_complete method |
| pyproject.toml | Added pytest-asyncio dependency for async test support |
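
As an illustration of the shared decompression utility mentioned in the summary, here is a round-trip sketch. The encoding (JSON, then zlib, then base64) is an assumption made for this example, and `compress_payload` and `parse_compressed_payload` are hypothetical names. The actual `_parse_compressed_xcom` in Cosmos may differ in signature and format.

```python
import base64
import json
import zlib
from typing import Any


def compress_payload(payload: dict[str, Any]) -> str:
    """Serialise to JSON, compress, and base64-encode so it fits a text field."""
    raw = json.dumps(payload).encode("utf-8")
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def parse_compressed_payload(value: str) -> dict[str, Any]:
    """Reverse compress_payload: base64-decode, decompress, parse JSON."""
    raw = zlib.decompress(base64.b64decode(value))
    return json.loads(raw)
```

Keeping the decode step in one helper means the sensor and the trigger can share it instead of each repeating the three-step unpacking.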

Comment thread cosmos/operators/watcher.py Outdated
@pankajastro pankajastro changed the title from "Add async support for watcher sensor" to "Add deferrable support for Watcher Sensor" Nov 6, 2025
@pankajastro pankajastro marked this pull request as ready for review November 6, 2025 12:47
Contributor

@pankajkoti pankajkoti left a comment

It's looking great to me. I have a couple of comments inline.

Also, it looks like a few tests are failing in CI.

Comment thread cosmos/_triggers/watcher.py
Comment thread cosmos/_triggers/watcher.py Outdated
@pankajkoti pankajkoti requested a review from tatiana November 6, 2025 13:09
Collaborator

@tatiana tatiana left a comment

@pankajastro, this is really exciting, thanks for finding a solution for it!
Do you see any downsides or risks with always using the deferrable implementation? What happens if the end-user does not have a triggerer in their Airflow deployment?
Should we also support people opting out of using the deferrable mode?
Finally, are there any performance improvements when switching to deferrable, when running ExecutionMode.WATCHER with our benchmark project?

@tatiana tatiana added this to the Cosmos 1.12.0 milestone Nov 6, 2025
@pankajastro
Contributor Author

pankajastro commented Nov 6, 2025

> Do you see any downsides or risks with always using the deferrable implementation? What happens if the end-user does not have a triggerer in their Airflow deployment?

I’m a bit concerned about the number of database accesses from the trigger.

> Should we also support people opting out of using the deferrable mode?

I’ve thought about it, and I think we should expose a parameter at the sensor level to conditionally defer. However, it would be better to do this once #1973 is solved.

> Finally, are there any performance improvements when switching to deferrable, when running ExecutionMode.WATCHER with our benchmark project?

I have not done any performance testing.

Collaborator

@tatiana tatiana left a comment

Looks great, thank you very much, @pankajastro !

@pankajastro pankajastro merged commit 975ebae into main Nov 7, 2025
81 checks passed
@pankajastro pankajastro deleted the async_watch_2059 branch November 7, 2025 17:22
pankajkoti added a commit that referenced this pull request Nov 24, 2025
Following PR #2089, do the same for the Airflow version constant in the
cosmos/_triggers/watcher.py module that was added in PR #2084 (it was
probably worked on in parallel with PR #2089 and hence was not identified
then).
pankajkoti added a commit that referenced this pull request Dec 18, 2025
Breaking changes

* Introduced in PR #2080. The following functions are intended for
internal use within Cosmos, so we hope these changes won't impact
end-users, but we are documenting them just in case:
- ``generate_task_or_group`` receives ``render_config`` instead of its
individual configurations, such as ``test_behavior``,
``source_rendering_behavior`` and ``enable_owner_inheritance``
- ``create_task_metadata`` receives ``render_config`` instead of its
individual configurations, such as ``test_behavior``,
``source_rendering_behavior`` and ``enable_owner_inheritance``
- ``create_task_metadata`` now expects the ``node_converters`` argument
* Drop Python 3.9 support by @pankajastro in #2118
* Drop Airflow 2.4 support by @pankajastro in #2161
* Drop Airflow 2.5 support by @pankajastro in #2165

Features

* Support applying ``node_converter`` at a task level instead of task
group level by @anyapriya in #1759
* Allow overriding ``DbtProducerWatcherOperator`` parameters via
``ExecutionConfig.setup_operator_args`` by @pankajastro in #2133
* Use deferrable sensors by default in ``ExecutionMode.WATCHER`` by
@pankajastro in #2084
* Support real-time consumer updates when using
``ExecutionMode.WATCHER`` and ``InvocationMode.SUBPROCESS`` by
@pankajastro in #2152
* Update telemetry to v3 format with query parameters by @pankajkoti in
#2192
* Add initial set of telemetry task listener metrics for Cosmos
operators by @pankajkoti in #2195

Enhancements

* Unify Airflow version handling into ``constants.py`` by @tatiana in
#2089
* Refactor ``airflow/graph.py`` to simplify the code base by @tatiana in
#2080
* Force watcher producer retries to zero by @pankajkoti in #2114
* Fail ``ExecutionMode.WATCHER`` consumer sensors immediately when the
producer fails using Airflow context by @pankajkoti in #2126
* ``ExecutionMode.WATCHER``: fetch producer status asynchronously from
the Airflow runtime so deferrable sensors fail immediately when the
producer task fails by @pankajkoti in #2144
* Refactor ``ExecutionMode.WATCHER`` ``InvocationMode.SUBPROCESS`` log
parser by @tatiana in #2183
* Replace map_index with is_mapped_task boolean in task telemetry
metrics by @pankajkoti in #2210
* Collect cosmos profile metrics in task telemetry metrics by
@pankajastro in #2198
* Remove unnecessary information from telemetry by @tatiana in #2211

Bug fixes

* Clarify ``ExecutionMode.WATCHER`` deferrable failure messaging by
@pankajkoti in #2124
* Remove empty test tasks when all tests are detached by @anyapriya in
#2010
* Fix forwarding ``DbtProducerWatcherOperator`` ``dbt build`` flags by
@michal-mrazek in #2127
* Add databricks oauth mock profile by @fjmacagno in #2164
* Register listeners in Airflow 3 plugin implementation by @pankajastro
in #2187
* Fix resolution of ``packages-install-path`` when it uses ``env_var``
by @tatiana in #2194
* Fix ``template_fields`` in ``DbtConsumerWatcherSensor`` to include
``DbtRunLocalOperator`` ``template_fields`` by @tiovader and @emanuel-luis
in #2209
* Emit asset events in ExecutionMode.AIRFLOW_ASYNC mode by @pankajastro
in #2184
* Remove dag_run_id from telemetry tests by @tatiana in #2213

Docs

* Document dataset-event limitation when using
``ExecutionMode.AIRFLOW_ASYNC`` by @varaprasadregani in #2143
* Expand ``ExecutionMode.KUBERNETES`` guidance by @tatiana in #2139
* Add docs for deferrable ``DbtConsumerWatcherSensor`` by @pankajastro
in #2115
* Fix reStructuredText formatting by @dnskr in #2132
* Add docs for ``setup_operator_args`` param by @pankajastro in #2136
* Remove experimental flag for ``ExecutionMode.AIRFLOW_ASYNC`` by
@pankajastro in #2153
* Clarify ``ExecutionMode.AIRFLOW_ASYNC`` dataset limits by @pankajkoti
in #2167
* Update PRIVACY_NOTICE.rst by @tatiana in #2212

Others

* Drop Python 3.9 support by @pankajastro in #2118
* Drop Airflow 2.4 support by @pankajastro in #2161
* Drop Airflow 2.5 support by @pankajastro in #2165
* Improve example DAG ``jaffle_shop_kubernetes.py`` by @tatiana in #2140
* Enable tests for Python 3.13 by @pankajastro in #2154
* Add Python 3.12 to CI integration tests matrix by @pankajastro in
#2168
* Retry flaky Telemetry success test to stabilise CI by @pankajkoti in
#2138
* Drop unused producer state xcom handling in ``ExecutionMode.WATCHER``
by @pankajkoti in #2145
* Remove unused Python 3.9 usage from GitHub Actions CI by @pankajastro in
#2117
* Run pre-commit on ``ExecutionMode.WATCHER`` modules by @pankajkoti in
#2150
* Refactor: Use shared airflow version constant by @pankajkoti in #2157
* Pin ``pydantic<2.0`` for Airflow 2.6 compatibility by @pankajastro in
#2172
* Remove duplicate ``dbt-duckdb`` dependency by @pankajastro in #2170
* Add targeted ``type: ignore`` for untyped decorators to fix ``mypy``
errors by @pankajastro in #2174
* Replace Legacy typing Aliases with Built-in Types for Python 3.10+ by
@pankajastro in #2175
* Refactor to reuse ``load_method_from_module`` from
``_utils/importer.py`` by @pankajastro in #2176
* Remove try except block for cache import and unused python_version
variable by @pankajastro in #2186
* Unpin Airflow to satisfy GitHub Security tab requirements by
@pankajastro in #2171
* Update Python version for ``pyupgrade`` in ``pre-commit`` config by
@pankajastro in #2190
* Add cooldown config in ``dependabot`` config by @pankajastro in #2189
* Adjust pre-commit so Python 3.10 or higher can be used by @tatiana in
#2196
* Remove empty variables emission from telemetry metrics by @pankajkoti
in #2197
* Reformat documented comments for historical URL formats by @pankajkoti
in #2199
* Bump ``actions/checkout`` from ``5.0.0`` to ``5.0.1`` by @dependabot
in #2135
* Bump ``actions/checkout`` to ``6.0.0`` in GitHub workflows by
@dependabot in #2147
* Bump ``zizmorcore/zizmor-action`` from ``0.2.0`` to ``0.3.0`` by
@dependabot in #2156
* Bump ``actions/checkout`` from ``5.0.1`` to ``6.0.0`` by @dependabot
in #2155
* Bump ``actions/checkout`` from ``6.0.0`` to ``6.0.1`` by @dependabot
in #2178
* Bump ``codecov/codecov-action`` from ``5.5.1`` to ``5.5.2`` by
@dependabot in #2208
* pre-commit autoupdate by @pre-commit-ci[bot] in #2134, #2162, #2173,
#2191, #2202

closes:
astronomer/oss-integrations-private#275
tatiana added a commit that referenced this pull request Dec 29, 2025
Since #2084, the `ExecutionMode.WATCHER` logic has been spreading
throughout the Cosmos code base.

When implementing Airflow providers, it is common practice to split
Airflow code into operators and triggerers folders. However, in the
specific case of Cosmos, I believe this approach makes it harder to read
and maintain the project and that it is better to centralise the
execution mode code.

Historically, Cosmos execution modes were defined in the operators
folder because the code for each execution mode was operator-only. With
the introduction of the async and watcher execution modes, the execution
mode implementation became more complex, requiring a custom triggerer
for the watcher. As a result, the watcher execution mode code began
spreading throughout the Cosmos codebase, leaving it more scattered
than necessary.

This PR brings the watcher triggerer logic closer to the operator, so
that most of the watcher-specific logic lives near the execution mode
implementation.

In Cosmos 2.0, we can review how we name the Cosmos execution mode
folder, potentially renaming this folder from operators to
execution_modes.
Development

Successfully merging this pull request may close these issues.

[Enhancement] Support using deferrable sensors on ExecutionMode.WATCHER

4 participants