Merged
29 commits
a70f72b
align redirects
lzdanski Mar 6, 2026
a658998
update airflow-worker execution modes
lzdanski Mar 6, 2026
f842942
Add container execution modes
lzdanski Mar 6, 2026
0d3498d
add local conflicts
lzdanski Mar 6, 2026
bbf98bb
exec mode updates
lzdanski Mar 6, 2026
9ab652c
update index
lzdanski Mar 6, 2026
f2cc135
fix formatting
lzdanski Mar 6, 2026
987994b
refactor index structures
lzdanski Mar 6, 2026
e3cb083
🎨 [pre-commit.ci] Auto format from pre-commit.com hooks
pre-commit-ci[bot] Mar 6, 2026
b3cd655
remove unecessary numbers
lzdanski Mar 6, 2026
35508d8
fix spelling
lzdanski Mar 6, 2026
0aa2482
Merge branch 'astronomer:main' into update-execution-modes
lzdanski Mar 9, 2026
e59c689
copyediting
lzdanski Mar 9, 2026
acdd9f7
Fix page location
lzdanski Mar 9, 2026
e0baf46
fix links
lzdanski Mar 9, 2026
f6eae40
address page-level feedback
lzdanski Mar 10, 2026
656e45d
Address content review feedback
lzdanski Mar 10, 2026
3ddb849
fix copyediting
lzdanski Mar 10, 2026
aa4ce5a
Remove diagram
lzdanski Mar 10, 2026
5d5f9f6
Add missing invocation mode content
lzdanski Mar 10, 2026
cf78aaf
🎨 [pre-commit.ci] Auto format from pre-commit.com hooks
pre-commit-ci[bot] Mar 10, 2026
0b604af
Merge branch 'main' into update-execution-modes
lzdanski Mar 11, 2026
e207c1c
🎨 [pre-commit.ci] Auto format from pre-commit.com hooks
pre-commit-ci[bot] Mar 11, 2026
6b8e5ac
Accept review feedback, fix refs
lzdanski Mar 11, 2026
e29cdb9
fix build errors
lzdanski Mar 11, 2026
fe520d2
Apply suggestion from @tatiana
tatiana Mar 12, 2026
2c98f68
Apply suggestion from @tatiana
tatiana Mar 12, 2026
a96e087
Apply suggestion from @tatiana
tatiana Mar 12, 2026
53b4a9d
Apply suggestion from @tatiana
tatiana Mar 12, 2026
23 changes: 1 addition & 22 deletions docs/getting_started/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,7 @@ The recommended way to install and run Cosmos depends on how you run Airflow. Fo
- `Getting Started on MWAA <mwaa.html>`__
- `Getting Started on GCC <gcc.html>`__

You might require a different setup depending on your particular configuration. See :ref:`exec-methods`.
You might require a different setup depending on your particular configuration. See :ref:`execution-modes`.

Example Demo: Jaffle Shop Project
__________________________________
Expand Down Expand Up @@ -76,24 +76,3 @@ as ``max_active_tasks``, ``max_active_runs``, and ``default_args``.

With Cosmos, transitioning from a dbt workflow to an Airflow Dag is seamless, giving you the best of both tools
for managing and scaling your data workflows.

.. _exec-methods:

Execution Methods
-----------------

For more customization, check out the different execution modes that Cosmos supports on the `Execution Modes <execution-modes.html>`__ page.

For specific guides, see the following:

- `Executing dbt DAGs with DockerOperators <../../guides/run_dbt/container/docker.html>`__
- `Executing dbt DAGs with KubernetesPodOperators <../../guides/run_dbt/container/kubernetes.html>`__
- `Executing dbt DAGs with Watcher Kubernetes Mode <../../guides/run_dbt/container/watcher-kubernetes-execution-mode.html>`__
- `Executing dbt DAGs with AzureContainerInstancesOperators <../../guides/run_dbt/container/azure-container-instance.html>`__
- `Executing dbt DAGs with GcpCloudRunExecuteJobOperators <../../guides/run_dbt/container/gcp-cloud-run-job.html>`__


Concepts Overview
-----------------

How do dbt and Airflow concepts map to each other? Learn more `in this link <dbt-airflow-concepts.html>`__.
Original file line number Diff line number Diff line change
@@ -1,19 +1,17 @@
:orphan:

.. _execution-modes-local-conflicts:

Airflow and dbt dependency conflicts
======================================

When using the `Local Execution Mode <execution-modes.html#local>`__, users may face dependency conflicts between
`Apache Airflow® <https://airflow.apache.org/>`_ and dbt. The conflicts may increase depending on the Airflow providers and dbt adapters being used.
When using the :ref:`local-execution` without defining a custom ``ExecutionConfig.dbt_executable_path``, you might have dependency conflicts between
`Apache Airflow® <https://airflow.apache.org/>`_ and dbt. The number of conflicts can increase depending on the Airflow providers and dbt adapters you use.

If you encounter errors, we recommend isolating the dbt installation from the Airflow installation.
With the `Local Execution Mode <execution-modes.html#local>`__, this can be accomplished by installing dbt in a separate
Python virtualenv and setting the `ExecutionConfig.dbt_executable_path <../guides/execution-config.html>`_ and
`RenderConfig.dbt_executable_path <../guides/render-config.html>`_ parameters.
With the ``local`` execution mode, this can be accomplished by installing dbt in a separate
Python virtualenv and setting the `ExecutionConfig.dbt_executable_path <../../reference/configs/execution-config.html>`_ and
`RenderConfig.dbt_executable_path <../../guides/translate_dbt_to_airflow/render-config.html>`_ parameters.
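
As a rough illustration of this isolation approach, the sketch below computes where a Python virtual environment places the ``dbt`` entry point, so that the same value can be passed to both parameters. The helper name ``dbt_executable_in_venv`` and the directory ``/usr/local/airflow/dbt_venv`` are assumptions made for this example and are not part of Cosmos.

```python
import sys
from pathlib import Path


def dbt_executable_in_venv(venv_dir: str) -> str:
    """Return the path where a virtualenv places the ``dbt`` entry point.

    Hypothetical helper for illustration; it is not part of Cosmos.
    """
    # Virtualenvs keep console scripts in "Scripts" on Windows and "bin" elsewhere.
    on_windows = sys.platform == "win32"
    scripts_dir = "Scripts" if on_windows else "bin"
    executable = "dbt.exe" if on_windows else "dbt"
    return str(Path(venv_dir) / scripts_dir / executable)


# Assumed location of the isolated environment; adjust it to your deployment.
DBT_EXECUTABLE_PATH = dbt_executable_in_venv("/usr/local/airflow/dbt_venv")

# The same value would then be given to both configs, for example:
#   ExecutionConfig(dbt_executable_path=DBT_EXECUTABLE_PATH)
#   RenderConfig(dbt_executable_path=DBT_EXECUTABLE_PATH)
```

Using one constant for both configs keeps Dag rendering and task execution pointed at the same dbt installation.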

The page `execution modes <execution-modes.html>`__ describes many other methods that support isolating dbt from Airflow.
The :ref:`execution-modes` page describes other methods that isolate dbt from Airflow.

In the following table, ``x`` represents combinations that lead to conflicts (vanilla ``apache-airflow`` and ``dbt-core`` packages):

Expand Down
57 changes: 50 additions & 7 deletions docs/guides/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -3,14 +3,33 @@
Guides
======

Cosmos offers a number of configuration options to customize its behavior. For more info, check out the links on the left or the table of contents below.
.. toctree::
:maxdepth: 0
:hidden:

self

Cosmos offers a number of configuration options to customize how Airflow Dags and dbt commands run.

Setting up a project follows the same general steps, described in the sections below.


Set up dbt with Airflow
~~~~~~~~~~~~~~~~~~~~~~~~~~

Make your dbt projects available to Airflow and install dbt into the environment where your dbt code runs.

.. toctree::
:maxdepth: 1
:hidden:
:caption: Set up dbt with Airflow

dbt_setup/dbt-fusion
dbt_setup/execution-modes-local-conflicts

Connect to your dbt database
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Configure your Cosmos project so that Airflow Dags can initiate dbt commands and apply data transformations and updates in your data warehouses. You can create these connections by using the ``profiles.yml`` file in your dbt project, by using profile mappings, or by customizing ``ProfileConfig`` per dbt configuration.

.. toctree::
:maxdepth: 1
Expand All @@ -22,6 +41,12 @@ Cosmos offers a number of configuration options to customize its behavior. For m
connect_database/use-profile-mapping
connect_database/profile-customise-per-node


Translate your dbt code into Airflow Dags
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can customize how Cosmos parses your dbt workflows into Airflow Dags. Choosing how you want your dbt nodes to map to Airflow tasks within Dags can affect the time required for Cosmos to parse the dbt workflows and for Airflow to execute the resulting Dags.

.. toctree::
:maxdepth: 1
:hidden:
Expand All @@ -34,9 +59,14 @@ Cosmos offers a number of configuration options to customize its behavior. For m
translate_dbt_to_airflow/render-config
Customize node conversion <translate_dbt_to_airflow/dag-customization>


Run dbt
~~~~~~~~~~~~~

Specify more details about how Cosmos runs both dbt commands and Airflow Dags. This includes choosing one of the :ref:`execution-modes`, either one that runs dbt on an Airflow worker node or one that runs dbt in a container. You can also customize additional aspects of how your dbt code runs, such as using particular operators that correspond to dbt commands, and you can use Airflow's scheduling capabilities in your Cosmos Dags.

.. toctree::
:maxdepth: 3
:hidden:
:maxdepth: 1
:caption: How Cosmos runs dbt

run_dbt/execution-modes
Expand All @@ -46,24 +76,37 @@ Cosmos offers a number of configuration options to customize its behavior. For m
run_dbt/operators/operators
run_dbt/customization/index

Multi-project Setups
~~~~~~~~~~~~~~~~~~~~

If you have a multi-project architecture in which multiple dbt projects reference each other's models, you can set up ``dbt-loom`` with Cosmos to handle cross-project references.

.. toctree::
:maxdepth: 1
:hidden:
:caption: Multi-project Setups

Handle cross-project references <multi_project/multi-project>

Add your dbt documentation
~~~~~~~~~~~~~~~~~~~~~~~~~~

Cosmos supports dbt's documentation capabilities.

.. toctree::
:maxdepth: 1
:hidden:
:caption: dbt Documentation

dbt_docs/generating-docs
dbt_docs/hosting-docs


Cosmos DevEx
~~~~~~~~~~~~

You can configure Cosmos to improve your development experience.

.. toctree::
:maxdepth: 1
:hidden:
:caption: Cosmos DevEx

cosmos_devex/lineage
Expand Down
47 changes: 25 additions & 22 deletions docs/guides/run_dbt/airflow-worker/async-execution-mode.rst
Original file line number Diff line number Diff line change
@@ -1,20 +1,23 @@
.. _async-execution-mode:

Airflow Async Execution Mode
Airflow async execution mode
============================

This execution mode can reduce the runtime by 35% in comparison to Cosmos LOCAL execution mode, but is currently only available for BigQuery. While this mode was introduced in Cosmos 1.9, we strongly encourage users to use Cosmos 1.11, which has significant performance improvements.
This execution mode can reduce the runtime by 35% in comparison to Cosmos ``LOCAL`` execution mode, but is currently only available for BigQuery. While this mode was introduced in Cosmos 1.9, we strongly encourage users to use the latest version of Cosmos, which has significant performance improvements.

It can be particularly useful for long-running transformations, since it leverages Airflow's `deferrable operators <https://airflow.apache.org/docs/apache-airflow/stable/authoring-and-scheduling/deferring.html>`__.
The ``airflow_async`` execution mode is a way to run the dbt resources from your dbt project using Apache Airflow's
`Deferrable operators <https://airflow.apache.org/docs/apache-airflow/stable/authoring-and-scheduling/deferring.html>`__.
This execution mode is well-suited for when you have long-running resources and you want to run them asynchronously by
leveraging Airflow's deferrable operators. With deferrable operators, you can potentially observe higher throughput of tasks
because more dbt nodes run in parallel, since they won't be blocking Airflow's worker slots.

In this mode, there is a ``SetupAsyncOperator`` that will pre-generate the SQL files for the dbt project and upload them to Airflow XCom or a remote location. A remote location will only be used if users set ``AIRFLOW__COSMOS__REMOTE_TARGET_PATH`` and ``AIRFLOW__COSMOS__REMOTE_TARGET_PATH_CONN_ID``. This operator is run before the remaining pipeline.
All the pipeline dbt model transformations will be run using ``DbtRunAirflowAsyncOperator`` which, instead of running the ``dbt run`` command for each model. They will download the SQL files from the Airflow XCom or remote location and execute them directly leveraging the Airflow ``BigQueryInsertJobOperator``.

Users can leverage other existing ``BigQueryInsertJobOperator`` features, such as the UI controls to link to the job in the BigQuery UI.
In this mode, there is a ``SetupAsyncOperator`` that pre-generates the SQL files for the dbt project and uploads them to Airflow XCom or a remote location. Airflow only uses a remote location if you set ``AIRFLOW__COSMOS__REMOTE_TARGET_PATH`` and ``AIRFLOW__COSMOS__REMOTE_TARGET_PATH_CONN_ID``. This operator runs before the remaining pipeline.
All the pipeline dbt model transformations run using ``DbtRunAirflowAsyncOperator`` instead of running the ``dbt run`` command for each model. They download the SQL files from the Airflow XCom or remote location, and then execute them directly using the Airflow ``BigQueryInsertJobOperator``.
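
The rule for choosing the upload destination can be sketched as follows. This is only an illustration of the behavior described above, not the Cosmos implementation; the function name ``sql_upload_destination``, the bucket ``gs://my-bucket/cosmos``, and the connection ID ``gcp_conn`` are hypothetical.

```python
import os
from typing import Mapping

REMOTE_TARGET_PATH = "AIRFLOW__COSMOS__REMOTE_TARGET_PATH"
REMOTE_TARGET_PATH_CONN_ID = "AIRFLOW__COSMOS__REMOTE_TARGET_PATH_CONN_ID"


def sql_upload_destination(env: Mapping[str, str] = os.environ) -> str:
    """Describe where the pre-generated SQL files would go under the rule above.

    Hypothetical helper for illustration; it does not call Cosmos.
    """
    # A remote location applies only when both settings are present and non-empty.
    if env.get(REMOTE_TARGET_PATH) and env.get(REMOTE_TARGET_PATH_CONN_ID):
        return f"remote:{env[REMOTE_TARGET_PATH]}"
    # Otherwise the SQL files are stored in Airflow XCom.
    return "xcom"


# Nothing configured, so XCom is used.
default_destination = sql_upload_destination({})

# Both settings present (hypothetical values), so the remote location is used.
remote_destination = sql_upload_destination(
    {
        REMOTE_TARGET_PATH: "gs://my-bucket/cosmos",
        REMOTE_TARGET_PATH_CONN_ID: "gcp_conn",
    }
)
```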

You can also use other existing ``BigQueryInsertJobOperator`` features, such as the UI controls to link to the job in the BigQuery UI.

Advantages of Airflow Async Mode
++++++++++++++++++++++++++++++++
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- **Improved Task Throughput:** Async tasks free up Airflow workers by leveraging the Airflow Trigger framework. While long-running SQL transformations are executing in the data warehouse, the worker is released and can handle other tasks, increasing overall task throughput.
- **Better Resource Utilization:** By minimizing idle time on Airflow workers, async tasks allow more efficient use of compute resources. Workers aren't blocked waiting for external systems and can be reused for other work while waiting on async operations.
Expand All @@ -34,18 +37,18 @@ We have `observed <https://github.com/astronomer/astronomer-cosmos/pull/1934>`_


Getting Started with Airflow Async Mode
+++++++++++++++++++++++++++++++++++++++
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This guide walks you through setting up an Astro CLI project and running a Cosmos-based DAG with a deferrable operator, enabling asynchronous task execution in Apache Airflow.

Prerequisites
+++++++++++++
-------------

- `Astro CLI <https://www.astronomer.io/docs/astro/cli/install-cli>`_
- Airflow>=2.9

1. Create Astro-CLI Project
+++++++++++++++++++++++++++
---------------------------

Run the following command in your terminal:

Expand All @@ -70,7 +73,7 @@ This will create an Astro project with the following structure:


2. Update Dockerfile
++++++++++++++++++++
--------------------

Edit your Dockerfile to ensure all necessary requirements are included.

Expand All @@ -80,7 +83,7 @@ Edit your Dockerfile to ensure all necessary requirements are included.


3. Add astronomer-cosmos Dependency
+++++++++++++++++++++++++++++++++++
-----------------------------------

In your ``requirements.txt``, add:

Expand All @@ -90,7 +93,7 @@ In your ``requirements.txt``, add:


4. Create Airflow DAG
+++++++++++++++++++++
---------------------

1. Create a new DAG file: ``dags/cosmos_async_dag.py``

Expand Down Expand Up @@ -152,8 +155,8 @@ In your ``requirements.txt``, add:
- Add a valid dbt project inside your Airflow project under ``dags/dbt/``.


5. Start the Project
++++++++++++++++++++
5. Start the project
--------------------

Launch the Airflow project locally:

Expand All @@ -166,8 +169,8 @@ This will:
- Spin up the scheduler, webserver, and triggerer (needed for deferrable operators)
- Expose Airflow UI at http://localhost:8080

6. Create Airflow Connection
++++++++++++++++++++++++++++
6. Create Airflow connection
----------------------------

Create an Airflow connection with the following configuration:

Expand Down Expand Up @@ -196,7 +199,7 @@ Create an Airflow connection with following configurations


7. Execute the DAG
++++++++++++++++++
------------------

1. Visit the Airflow UI at ``http://localhost:8080``
2. Enable the DAG: ``cosmos_async_dag``
Expand All @@ -209,8 +212,8 @@ Create an Airflow connection with following configurations
The ``run`` tasks will run asynchronously via the deferrable operator, freeing up worker slots while waiting on I/O or long-running tasks.


Control of where to upload the SQL files
++++++++++++++++++++++++++++++++++++++++
Control where to upload the SQL files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For optimal performance, we encourage you to keep the default Cosmos behaviour (introduced in 1.11), which uploads the SQL files to XCom instead of a remote object location.

Expand All @@ -225,7 +228,7 @@ However, if you want to upload the SQL files to a remote object location instead


Limitations
+++++++++++
~~~~~~~~~~~


1. **Limited to dbt models**: Only dbt resource type models are run asynchronously using Airflow deferrable operators. Other resource types are executed synchronously, similar to the local execution mode.
Expand Down
30 changes: 30 additions & 0 deletions docs/guides/run_dbt/airflow-worker/cosmos-managed-venv.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
.. _cosmos-managed-venv:

Cosmos-managed virtual environment execution mode
========================================================

The ``virtualenv`` mode runs dbt commands from Python virtual environments created and managed by Cosmos. This mode removes the need to create a virtual environment at build time, unlike ``ExecutionMode.LOCAL``, while avoiding package conflicts. It is intended for cases where:

- You can't install dbt directly in the Airflow environment, either in the same environment or a dedicated one.
- Multiple dbt installations are required, and you prefer Cosmos to manage them without modifying the Airflow deployment.
- Speed is not a concern, and you can afford for Cosmos to create and update the Python virtual environment during the execution of each dbt node.

In most cases, the local execution mode with ``ExecutionConfig.dbt_executable_path`` is the preferred option, as it allows you to manage the dbt environment during the Airflow deployment process, instead of per-dbt node execution.

When you use ``virtualenv`` mode, you are responsible for declaring which version of ``dbt`` to use by passing the ``py_requirements`` argument. Set this argument directly in operator instances, or as part of ``operator_args`` when you instantiate ``DbtDag`` or ``DbtTaskGroup``.
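
A minimal sketch of the ``operator_args`` value described above follows. The pinned package ``dbt-postgres==1.8.2`` is only an example; choose the adapter and version that your project needs.

```python
# Example pin only: replace it with the dbt adapter and version your project uses.
operator_args = {
    "py_requirements": ["dbt-postgres==1.8.2"],
}

# This dictionary would be passed as ``operator_args=operator_args`` when
# instantiating DbtDag or DbtTaskGroup with the ``virtualenv`` execution mode.
```

Pinning an exact version keeps the environments that Cosmos creates reproducible from one task run to the next.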

Similar to the ``local`` execution mode, Cosmos converts Airflow Connections into a form that ``dbt`` understands by creating a ``dbt`` profiles file (``profiles.yml``).
Also like the ``local`` execution mode, Cosmos by default attempts to use a ``partial_parse.msgpack`` file, if one exists, to speed up parsing.

Some drawbacks of the ``virtualenv`` approach:

- It is slower than ``local`` because it may create and update a new Python virtual environment for each Cosmos dbt task run, depending on the Airflow executor and on whether you set the ``ExecutionConfig.virtualenv_dir`` configuration.
- If dbt is unavailable in the Airflow scheduler, the default ``LoadMode.DBT_LS`` will not work. In this scenario, you must use a :ref:`parsing-methods` that does not rely on dbt, such as ``LoadMode.DBT_MANIFEST``.
- Only ``InvocationMode.SUBPROCESS`` is currently supported. Attempting to use ``InvocationMode.DBT_RUNNER`` raises an error.

Example of how to use:

.. literalinclude:: ../../../../dev/dags/example_virtualenv.py
:language: python
:start-after: [START virtualenv_example]
:end-before: [END virtualenv_example]
4 changes: 3 additions & 1 deletion docs/guides/run_dbt/airflow-worker/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -5,5 +5,7 @@ Run dbt in an Airflow worker
:maxdepth: 1
:caption: Run dbt in an Airflow worker

async-execution-mode
local-execution-mode
watcher-execution-mode
cosmos-managed-venv
async-execution-mode
23 changes: 23 additions & 0 deletions docs/guides/run_dbt/airflow-worker/local-execution-mode.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
.. _local-execution:

Local execution mode
====================

By default, Cosmos uses the ``local`` execution mode. It is the fastest way to run Cosmos operators, since it runs dbt either as a library or as a local subprocess.
If dbt and Airflow dependencies conflict (see :ref:`execution-modes-local-conflicts`), you can usually pre-install dbt in an isolated Python virtual environment, either as part of the container image or as part of a pre-start script.

The ``local`` execution mode assumes that the Airflow worker node can access a ``dbt`` binary. If ``dbt`` was not installed alongside Cosmos, you can create a dedicated virtual environment and define a custom path to ``dbt`` by declaring the argument ``ExecutionConfig.dbt_executable_path``.
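
One way to pick the value for ``ExecutionConfig.dbt_executable_path`` is sketched below: use a ``dbt`` binary found on ``PATH`` if there is one, and otherwise fall back to a dedicated virtual environment. The helper name ``resolve_dbt_executable`` and the fallback path are assumptions made for this example and are not part of Cosmos.

```python
import shutil


def resolve_dbt_executable(fallback_path: str) -> str:
    """Return a ``dbt`` binary found on PATH, or the given fallback path.

    Hypothetical helper for illustration; it is not part of Cosmos.
    """
    # shutil.which returns None when no ``dbt`` executable is on PATH.
    found = shutil.which("dbt")
    return found or fallback_path


# Assumed location of a dedicated virtual environment that contains dbt.
dbt_executable_path = resolve_dbt_executable("/usr/local/airflow/dbt_venv/bin/dbt")

# The result would then be passed as:
#   ExecutionConfig(dbt_executable_path=dbt_executable_path)
```

Note that the lookup happens on the machine that evaluates this code, so the sketch is only meaningful when that machine and the Airflow worker share the same image.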

.. note::
    Starting in version 1.4, Cosmos tries to use dbt partial parsing (``partial_parse.msgpack``) to speed up task execution.
    This feature is subject to `dbt partial parsing limitations <https://docs.getdbt.com/reference/parsing#known-limitations>`_.
    Learn more: :ref:`partial-parsing`.

When using the ``local`` execution mode, Cosmos converts Airflow Connections into a native ``dbt`` profiles file (``profiles.yml``).

For example, when ``dbt`` is installed together with Cosmos:

.. literalinclude:: ../../../../dev/dags/basic_cosmos_dag.py
:language: python
:start-after: [START local_example]
:end-before: [END local_example]
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
.. _watcher-execution-mode:

Introducing ``ExecutionMode.WATCHER``: Experimental High-Performance dbt Execution in Cosmos
============================================================================================
Watcher execution mode (Experimental)
======================================

With the release of **Cosmos 1.11.0**, we are introducing an experimental execution mode, ``ExecutionMode.WATCHER``, designed to significantly reduce dbt pipeline run times in Airflow.

Expand Down Expand Up @@ -149,7 +149,7 @@ This approach is best when your Airflow DAG is fully dedicated to a dbt project.
:start-after: [START example_watcher]
:end-before: [END example_watcher]

As it can be observed, the only difference with the default ``ExecutionMode.LOCAL`` is the addition of the ``execution_config`` parameter with the ``execution_mode`` set to ``ExecutionMode.WATCHER``. The ``ExecutionMode`` enum can be imported from ``cosmos.constants``. For more information on the ``ExecutionMode.LOCAL``, please, check the `dedicated page <execution-modes.html#local>`__
As shown, the only difference from the default ``ExecutionMode.LOCAL`` is the addition of the ``execution_config`` parameter with ``execution_mode`` set to ``ExecutionMode.WATCHER``. The ``ExecutionMode`` enum can be imported from ``cosmos.constants``. For more information on ``ExecutionMode.LOCAL``, see the :ref:`local-execution` documentation.

**How it works:**

Expand Down