From c0c395931899aaebd58ae365a555758fac954d5c Mon Sep 17 00:00:00 2001
From: Tatiana Al-Chueyr
Date: Thu, 29 Jan 2026 14:52:30 +0000
Subject: [PATCH 1/2] Add Watcher Kubernetes documentation

---
 docs/getting_started/execution-modes.rst      |  15 ++
 docs/getting_started/index.rst                |   2 +
 .../watcher-kubernetes-execution-mode.rst     | 160 ++++++++++++++++++
 3 files changed, 177 insertions(+)
 create mode 100644 docs/getting_started/watcher-kubernetes-execution-mode.rst

diff --git a/docs/getting_started/execution-modes.rst b/docs/getting_started/execution-modes.rst
index 58a0f3421e..ea6a03f283 100644
--- a/docs/getting_started/execution-modes.rst
+++ b/docs/getting_started/execution-modes.rst
@@ -15,6 +15,7 @@ Cosmos can run ``dbt`` commands using several different approaches, called ``exe
 8. **aws_ecs**: Run ``dbt`` commands from AWS ECS instances managed by Cosmos (requires a pre-existing Docker image)
 9. **airflow_async**: (stable since Cosmos 1.9.0) Run the dbt resources from your dbt project asynchronously, by submitting the corresponding compiled SQLs to Apache Airflow's `Deferrable operators `__
 10. **watcher**: (experimental since Cosmos 1.11.0) Run a single ``dbt build`` command from a producer task and have sensor tasks to watch the progress of the producer, with improved DAG run time while maintaining the tasks lineage in the Airflow UI, and ability to retry failed tasks. Check the :ref:`watcher-execution-mode` for more details.
+11. **watcher_kubernetes**: (experimental since Cosmos 1.13.0) Combines the speed of the watcher execution mode with the isolation of Kubernetes. Check the :ref:`watcher-kubernetes-execution-mode` for more details.

 The choice of the ``execution mode`` can vary based on each user's needs and concerns. For more details, check each execution mode described below.
 
@@ -68,6 +69,10 @@ The choice of the ``execution mode`` can vary based on each user's needs and con
      - Very Fast
      - None
      - Yes
+   * - Watcher Kubernetes
+     - Fast
+     - High
+     - No

 Local
 -----
@@ -328,6 +333,16 @@ It is designed to improve DAG run time while maintaining the tasks lineage in th

 Check the :ref:`watcher-execution-mode` for more details.

+Watcher Kubernetes Execution Mode (Experimental)
+------------------------------------------------
+
+.. versionadded:: 1.13.0
+
+The ``watcher_kubernetes`` execution mode combines the speed of the ``watcher`` execution mode with the isolation of the ``kubernetes`` execution mode. It runs a single ``dbt build`` command from a producer task inside a Kubernetes pod and has sensor tasks to watch the progress of the producer.
+
+Check the :ref:`watcher-kubernetes-execution-mode` for more details.
+
+
 .. _invocation_modes:

 Invocation Modes
diff --git a/docs/getting_started/index.rst b/docs/getting_started/index.rst
index d5880b0ffb..2bb43dfa3f 100644
--- a/docs/getting_started/index.rst
+++ b/docs/getting_started/index.rst
@@ -16,6 +16,7 @@
     GCP Cloud Run Job Execution Mode
     Airflow Async Execution Mode
     Watcher Execution Mode
+    Watcher Kubernetes Execution Mode
     dbt and Airflow Similar Concepts
     Operators
     Custom Airflow Properties
@@ -46,6 +47,7 @@ For specific guides, see the following:

 - `Executing dbt DAGs with Docker Operators `__
 - `Executing dbt DAGs with KubernetesPodOperators `__
+- `Executing dbt DAGs with Watcher Kubernetes Mode `__
 - `Executing dbt DAGs with AzureContainerInstancesOperators `__
 - `Executing dbt DAGs with GcpCloudRunExecuteJobOperators `__

diff --git a/docs/getting_started/watcher-kubernetes-execution-mode.rst b/docs/getting_started/watcher-kubernetes-execution-mode.rst
new file mode 100644
index 0000000000..da1d972beb
--- /dev/null
+++ b/docs/getting_started/watcher-kubernetes-execution-mode.rst
@@ -0,0 +1,160 @@
+.. _watcher-kubernetes-execution-mode:
+
+``ExecutionMode.WATCHER_KUBERNETES``: High-Performance dbt Execution in Kubernetes
+===================================================================================
+
+.. versionadded:: 1.13.0
+
+The ``ExecutionMode.WATCHER_KUBERNETES`` combines the **speed of the** :ref:`watcher-execution-mode` **with the isolation of** :ref:`kubernetes`.
+
+This execution mode is ideal for users who:
+
+* Want to leverage the performance benefits of the watcher execution mode
+* Need to run dbt in isolated Kubernetes pods
+* Prefer not to install dbt in their Airflow deployment
+
+-------------------------------------------------------------------------------
+
+Background
+----------
+
+The :ref:`watcher-execution-mode` introduced in Cosmos 1.11.0 significantly reduces dbt pipeline run times by running dbt as a single command while maintaining model-level observability in Airflow.
+
+However, the original ``ExecutionMode.WATCHER`` requires dbt to be installed alongside Airflow. The ``ExecutionMode.WATCHER_KUBERNETES`` removes this limitation by running the dbt command inside Kubernetes pods, similar to ``ExecutionMode.KUBERNETES``.
+
+For more details on the watcher concept and how it works, please refer to the :ref:`watcher-execution-mode` documentation.
+
+-------------------------------------------------------------------------------
+
+How to Use
+----------
+
+Users previously using ``ExecutionMode.KUBERNETES`` can simply replace the ``execution_mode`` to use ``ExecutionMode.WATCHER_KUBERNETES``.
+
+The following example shows how to configure a ``DbtDag`` with ``ExecutionMode.WATCHER_KUBERNETES``:
+
+.. code-block:: python
+
+    from cosmos import DbtDag
+    from cosmos.config import ExecutionConfig
+    from cosmos.constants import ExecutionMode
+
+    dag = DbtDag(
+        dag_id="jaffle_shop_watcher_kubernetes",
+        # ... other DAG parameters ...
+        execution_config=ExecutionConfig(
+            execution_mode=ExecutionMode.WATCHER_KUBERNETES,
+            dbt_project_path=K8S_PROJECT_DIR,
+        ),
+        operator_args=operator_args,
+    )
+
+**Key differences from** ``ExecutionMode.KUBERNETES``:
+
+* The ``execution_mode`` is set to ``ExecutionMode.WATCHER_KUBERNETES`` instead of ``ExecutionMode.KUBERNETES``
+* The producer task runs the entire ``dbt build`` command in a single Kubernetes pod
+* Consumer tasks (sensors) watch for the completion of their corresponding dbt models
+
+For the complete setup including Kubernetes secrets, Docker image configuration, and profile setup, refer to the :ref:`kubernetes` documentation.
+
+-------------------------------------------------------------------------------
+
+Performance Gains
+-----------------
+
+Early benchmarks using the ``jaffle_shop_watcher_kubernetes`` DAG show significant improvements:
+
++-----------------------------------------------+------------------+
+| Execution Mode                                | Total Runtime    |
++===============================================+==================+
+| ``ExecutionMode.KUBERNETES``                  | 00:00:32.155     |
++-----------------------------------------------+------------------+
+| ``ExecutionMode.WATCHER_KUBERNETES``          | 00:00:11.783     |
++-----------------------------------------------+------------------+
+
+This represents approximately a **63% reduction** in total DAG runtime.
+
+The performance improvement comes from:
+
+* Running dbt as a single command (reducing Kubernetes pod startup overhead)
+* Leveraging dbt's native threading capabilities
+* Eliminating repeated dbt initialization for each model
+
+-------------------------------------------------------------------------------
+
+Known Limitations
+-----------------
+
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Kubernetes Provider Version Compatibility
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+``ExecutionMode.WATCHER_KUBERNETES`` does not work with older versions of the ``apache-airflow-providers-cncf-kubernetes`` provider (<=10.7.0).
+
+Please ensure you have a compatible version installed:
+
+.. code-block:: bash
+
+    pip install "apache-airflow-providers-cncf-kubernetes>10.7.0"
+
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Producer watcher does not support deferrable mode
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Similar to ``ExecutionMode.WATCHER``, the ``ExecutionMode.WATCHER_KUBERNETES`` producer task, ``DbtProducerWatcherKubernetesSensor``,runs using synchronous mode (``deferrable=False``).
+
+This was a limitation in the Airflow kubernetes provider, which was fixed in `this PR `_, and we'll be updating Cosmos once it is released.
+
+Conversely, the consumer tasks, ``DbtConsumerWatcherKubernetesSensor``, run in deferrable mode by default when they operate as sensors.
+
+
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Other Inherited Limitations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The following limitations from ``ExecutionMode.WATCHER`` also apply to ``ExecutionMode.WATCHER_KUBERNETES``:
+
+* **Individual dbt Operators**: Only ``DbtSeedWatcherKubernetesOperator``, ``DbtSnapshotWatcherKubernetesOperator``, and ``DbtRunWatcherKubernetesOperator`` are implemented. The ``DbtTestWatcherKubernetesOperator`` is currently a placeholder.
+
+* **Test behavior**: The ``TestBehavior.AFTER_EACH`` is not supported. Tests are run as part of the ``dbt build`` command by the producer task.
+
+* **Source freshness nodes**: The ``dbt build`` command does not run source freshness checks.
+
+For more details on these limitations, refer to the :ref:`watcher-execution-mode` documentation.
+
+-------------------------------------------------------------------------------
+
+Example DAG
+-----------
+
+Below is a complete example of a DAG using ``ExecutionMode.WATCHER_KUBERNETES``:
+
+.. literalinclude:: ../../dev/dags/jaffle_shop_watcher_kubernetes.py
+   :language: python
+
+-------------------------------------------------------------------------------
+
+Prerequisites
+-------------
+
+Before using ``ExecutionMode.WATCHER_KUBERNETES``, ensure you have:
+
+1. A Kubernetes cluster configured and accessible from your Airflow deployment
+2. A Docker image containing your dbt project and profile
+3. The ``apache-airflow-providers-cncf-kubernetes`` provider installed (version >10.7.0)
+
+For detailed setup instructions, refer to the :ref:`kubernetes` documentation.
+
+-------------------------------------------------------------------------------
+
+Summary
+-------
+
+``ExecutionMode.WATCHER_KUBERNETES`` provides:
+
+* ✅ **~63% faster** dbt DAG runs compared to ``ExecutionMode.KUBERNETES``
+* ✅ **Isolation** between dbt and Airflow dependencies
+* ✅ **Model-level visibility** in Airflow
+* ✅ **Easy migration** from ``ExecutionMode.KUBERNETES``
+
+This execution mode is ideal for teams who want the performance benefits of the watcher mode while maintaining the isolation provided by Kubernetes execution.
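
The ~63% figure quoted in the Summary and in the Performance Gains section can be reproduced from the two runtimes in the benchmark table. The following is a minimal sketch of that arithmetic; the helper names `parse_runtime` and `reduction_percent` are illustrative only and are not part of Cosmos.

```python
from datetime import timedelta


def parse_runtime(text: str) -> timedelta:
    """Parse an HH:MM:SS.mmm runtime string, the format used in the benchmark table."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


def reduction_percent(baseline: timedelta, candidate: timedelta) -> float:
    """Return how much shorter `candidate` is than `baseline`, as a percentage."""
    return (baseline - candidate) / baseline * 100


# Runtimes copied from the benchmark table in the new documentation page.
kubernetes = parse_runtime("00:00:32.155")
watcher_kubernetes = parse_runtime("00:00:11.783")

print(f"runtime reduction: {reduction_percent(kubernetes, watcher_kubernetes):.1f}%")
print(f"speed-up factor: {kubernetes / watcher_kubernetes:.2f}x")
```

Expressed as a ratio instead of a reduction, the same two measurements correspond to a speed-up of roughly 2.7x, which may be the clearer reading of "faster" in the Summary bullet.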
From 62160efac459b40996d199d5d7ab605c888a87dc Mon Sep 17 00:00:00 2001
From: Tatiana Al-Chueyr
Date: Thu, 29 Jan 2026 15:43:01 +0000
Subject: [PATCH 2/2] Apply suggestion from @tatiana

---
 docs/getting_started/watcher-kubernetes-execution-mode.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/getting_started/watcher-kubernetes-execution-mode.rst b/docs/getting_started/watcher-kubernetes-execution-mode.rst
index da1d972beb..a09efb165a 100644
--- a/docs/getting_started/watcher-kubernetes-execution-mode.rst
+++ b/docs/getting_started/watcher-kubernetes-execution-mode.rst
@@ -101,7 +101,7 @@ Please ensure you have a compatible version installed:
 Producer watcher does not support deferrable mode
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-Similar to ``ExecutionMode.WATCHER``, the ``ExecutionMode.WATCHER_KUBERNETES`` producer task, ``DbtProducerWatcherKubernetesSensor``,runs using synchronous mode (``deferrable=False``).
+Similar to ``ExecutionMode.WATCHER``, the ``ExecutionMode.WATCHER_KUBERNETES`` producer task, ``DbtProducerWatcherKubernetesSensor``, runs using synchronous mode (``deferrable=False``).

 This was a limitation in the Airflow kubernetes provider, which was fixed in `this PR `_, and we'll be updating Cosmos once it is released.
 
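
The provider constraint documented in this series (``apache-airflow-providers-cncf-kubernetes`` newer than 10.7.0) could also be checked at runtime before switching a DAG to the new mode. The sketch below is one possible way to do that using only the standard library. `release_tuple`, `is_supported_provider_version`, and `installed_provider_is_supported` are hypothetical helpers, not Cosmos or Airflow APIs, and the parsing deliberately ignores pre-release suffixes.

```python
import re
from importlib.metadata import PackageNotFoundError, version

PROVIDER = "apache-airflow-providers-cncf-kubernetes"
# Versions up to and including this one are documented as incompatible.
LAST_UNSUPPORTED = (10, 7, 0)


def release_tuple(raw: str) -> tuple[int, ...]:
    """Extract the leading numeric release segment, e.g. '10.8.0rc1' -> (10, 8, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", raw)
    if match is None:
        raise ValueError(f"unrecognised version string: {raw!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_supported_provider_version(raw: str) -> bool:
    """Hypothetical helper: True when the version is strictly newer than 10.7.0."""
    return release_tuple(raw) > LAST_UNSUPPORTED


def installed_provider_is_supported() -> bool:
    """Hypothetical helper: check the provider installed in the current environment."""
    try:
        return is_supported_provider_version(version(PROVIDER))
    except PackageNotFoundError:
        # The provider is not installed at all, so the mode cannot work.
        return False
```

Comparing integer tuples instead of raw strings matters here because a plain string comparison would rank "10.10.0" below "10.7.0". A project that already depends on the `packaging` library may prefer its version parser over this hand-rolled one.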