Improve watcher producer task priority both in scheduling and in the UI#2237
Conversation
Some users mentioned that the producer task would be listed after consumer tasks in the Airflow UI. We also received the report that sometimes the producer task was scheduled after the consumer tasks, even though it has a higher priority weight. I discussed these issues with @ashb and he advised we instantiated the producer task before other Airflow tasks while building the task. This PR aims to accomplish this.
✅ Deploy Preview for astronomer-cosmos canceled.
|
There was a problem hiding this comment.
Pull request overview
This PR refactors the watcher producer task creation to ensure it is instantiated before consumer tasks, addressing both UI display order and task scheduling priority issues. The main change splits the _add_producer_watcher_and_dependencies function into two separate methods for clearer separation of concerns.
Key Changes:
- Split task creation from dependency configuration by breaking
_add_producer_watcher_and_dependenciesinto_add_watcher_producer_taskand_add_watcher_dependencies - Move producer task instantiation to occur before the loop that creates consumer tasks in
build_airflow_graph
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2237 +/- ##
=======================================
Coverage 97.98% 97.98%
=======================================
Files 95 95
Lines 6192 6197 +5
=======================================
+ Hits 6067 6072 +5
Misses 125 125 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
pankajkoti
left a comment
There was a problem hiding this comment.
Looks solid to me. Should not harm us and only benefit, I believe, if our tests on DAG rendering/running conclude that
Features * Support cross-referencing models across dbt projects using dbt-loom by @pankajkoti in #2271 * Support use of YAML selectors when using ``LoadMode.DBT_MANIFEST`` by @YourRoyalLinus in #2261 * Introduce ``ExecutionMode.WATCHER_KUBERNETES`` to use the watcher with ``KubernetesPodOperator`` by @tatiana in #2207 * Add support for StarRocks profile mapping by @kurkim0661 in #2256 * Allow pushing URIs as XComs for Cosmos tasks by @corsettigyg in #2275 * Support defining custom callbacks alongside the ``WATCHER_KUBERNETES`` callback by @johnhoran in #2307 Enhancements * Refactor: remove duplicate ``_construct_dest_file_path`` by @jx2lee in #2077 * Leverage Airflow ``::group::`` to group logs associated with DAG parsing by @tatiana in #2235 * Refactor ``DbtConsumerWatcherSensor`` for reusability by @tatiana in #2245 * Restore plain text output when using ``ExecutionMode.WATCHER`` by @tiovader in #2241 Bug Fixes * Fix running empty models or ephemeral nodes in ``ExecutionMode.WATCHER`` by @tatiana in #2279 * Improve watcher producer task priority in scheduling and the UI by @tatiana in #2237 * Fix typos and formatting issues in documentation by @pankajkoti in #2259 * Allow watcher producer retries without erroring by @tatiana in #2283 * Fix ``TestBehavior.AFTER_ALL`` is missing project_name information when loading project using manifest file by @tuantran0910 in #2242 * Fix duplicate log lines in watcher subprocess execution and format timestamps by @pankajkoti in #2301 Docs * Add Watcher Kubernetes documentation by @tatiana in #2303 * Document newly added telemetry metrics in the privacy notice by @pankajkoti in #2249 * Add compatibility policy document by @pankajastro in #2251 * Improve watcher documentation related to dbt threads by @tatiana in #2273 * Fix link in watcher execution mode documentation by @jedcunningham in #2277 * Update Apache Airflow minimum compatibility policy by @tatiana in #2285 * Clarify Cosmos runtime support until "End of Basic Support" by @jedcunningham in #2286 * Update watcher docs by @tatiana in #2298 * Update watcher kubernetes documentation by @tatiana in #2306 Others * Add Airflow 3 DAG versioning tests for Cosmos by @michal-mrazek in #2177 * Add dbt Core 1.11 to the test matrix by @tatiana in #2230 * Add integration tests using InvocationMode.SUBPROCESS and validate output by @tatiana in #2287 * Fix main branch failing tests by @tatiana in #2296 * Update pre-commit hooks to the latest versions by @jedcunningham in #2289 * Pre-commit autoupdates by @pre-commit in #2222, #2264, #2274 and #2290 * Dependabot updates by @dependabot in #2218, #2219, #2220, #2280 and #2284 * Add Scarf metrics to understand Cosmos feature usage patterns - Add telemetry tracking for dbt docs plugin usage by @pankajkoti in #2240 - Add DAG run telemetry metrics for load mode, invocation, and render_config parameters by @pankajkoti in #2223 - Collect profile metrics for DAG runs by @pankajastro in #2228 - Compress telemetry metadata to reduce serialized DAG size by @pankajkoti in #2252 - Skip storing telemetry metadata when emission is disabled by @pankajkoti in #2278 - Hide telemetry metadata parameters from the Airflow trigger UI by @pankajkoti in #2247 closes: astronomer/oss-integrations-private#317 --------- Co-authored-by: Tatiana Al-Chueyr <tatiana.alchueyr@gmail.com>
Some users noted that the producer task would appear after the consumer tasks in the Airflow UI. We also received a report that the producer task was sometimes scheduled after the consumer tasks, even though it has a higher priority weight.
I discussed these issues with @ashb, and he advised that we instantiate the producer task before the consumer tasks. This PR aims to accomplish this by breaking down the method
_add_producer_watcher_and_dependenciesinto two methods:_add_watcher_producer_task_add_watcher_dependencies