Merged
2 changes: 1 addition & 1 deletion docs/conf.py
Original file line number Diff line number Diff line change
@@ -74,7 +74,7 @@
"configuration/logging": "../guides/cosmos_devex/logging.html",
"configuration/memory_optimization": "../optimize_performance/memory_optimization.html",
"configuration/multi-project": "../run/multi_project/multi-project.html",
"configuration/operator-args": "../guides/run_dbt/customization/operator-args.html",
"configuration/operator-args": "../guides/run_dbt/operators/operator-args.html",
"configuration/parsing-methods": "../guides/translate_dbt_to_airflow/parsing-methods.html",
"configuration/partial-parsing": "../guides/run_dbt/customization/partial-parsing.html",
"configuration/profile-config": "../reference/configs/profile-config.html",
3 changes: 2 additions & 1 deletion docs/guides/index.rst
@@ -29,11 +29,12 @@ Cosmos offers a number of configuration options to customize its behavior. For m
:hidden:
:caption: How Cosmos runs dbt

run_dbt/index
run_dbt/execution-modes
run_dbt/airflow-worker/index
run_dbt/container/index
run_dbt/callbacks/callbacks
run_dbt/operators/operators
run_dbt/operators/index
run_dbt/customization/index

.. toctree::
1 change: 0 additions & 1 deletion docs/guides/run_dbt/customization/index.rst
@@ -6,6 +6,5 @@ Additional Customization
:caption: Additional Customization

scheduling
operator-args
partial-parsing
custom-airflow-properties
34 changes: 34 additions & 0 deletions docs/guides/run_dbt/index.rst
@@ -0,0 +1,34 @@
.. _how-cosmos-runs-dbt:

How Cosmos runs dbt
===================

Cosmos can run dbt commands directly through individual operators, or it can parse a dbt project, turn it into an Airflow Dag or task group, and execute that.

In many execution modes, Cosmos’ ``DbtDag`` and ``DbtTaskGroup`` create a separate task for each dbt node (model, seed, snapshot).
This leads to improved visibility and the
possibility of fine-grained control over your dbt commands. For example, you can set task parameters like pool
or retries on individual Cosmos tasks. Or, you can make downstream tasks run as soon as a specific Cosmos task has finished successfully.
Running one dbt command per task can bring performance challenges, since each invocation of a dbt command incurs overhead. To improve performance, newer versions of Cosmos have introduced alternatives that offer the same level of granularity while centralising the execution of the dbt command in a single task. Check :ref:`watcher-execution-mode` and :ref:`async-execution-mode` for more information.

Cosmos uses different kinds of configurations to control how the dbt nodes are executed within the Airflow Dag or task group, which you can customize based on your project and needs.

Execution modes
~~~~~~~~~~~~~~~~

Execution modes are defined by the ``ExecutionConfig`` class in your Cosmos Dag.
Which mode you choose depends on your dbt project architecture and on whether you want to run your dbt commands in the cloud or in a container separate from your Airflow environment.

Check out the available :ref:`execution-modes` and the detailed :ref:`execution-config` for more information about how to set up your Cosmos execution.


Running dbt commands
~~~~~~~~~~~~~~~~~~~~

In addition to specifying where you want Cosmos to run dbt commands, you can also configure the following:

- :ref:`callbacks`: Tell Cosmos how to handle artifacts produced by dbt while executing dbt code.
- ``interceptor``: (new in v1.14) Optional list of callables run before building the dbt command. See :ref:`operator-args` for more information.
- :ref:`operator-args`: Pass operator arguments, ``operator_args``, in your Dag to set dbt command options, configure Cosmos operations, or define Airflow behavior.
- :ref:`scheduling`: Leverage Airflow to schedule your dbt workflows with cron-based scheduling, timetables, and data-aware scheduling.
- :ref:`partial-parsing`: Configure Cosmos to use dbt's partial parsing capabilities, which speeds up dbt and Dag parsing and shortens execution times.
15 changes: 15 additions & 0 deletions docs/guides/run_dbt/operators/index.rst
@@ -0,0 +1,15 @@

.. _operator-index:

Operators
=========

Learn how to use operators with Cosmos.

.. toctree::
:maxdepth: 1
:caption: Operators

operators
operator-args
overriding-operator-args
@@ -36,47 +36,6 @@ Example of setting a Cosmos-specific operator argument:
)


.. _operator-args-per-node:

Overriding operator arguments per dbt node (or group of nodes)
--------------------------------------------------------------

.. versionadded:: 1.8.0

Cosmos 1.8 introduced the capability for users to customise the operator arguments per dbt node, or per group of dbt nodes.
This can be done by defining the arguments via a dbt meta property alongside other dbt project configurations.

Let's say there is a DbtTaskGroup that sets a default pool to run all the dbt tasks, but a user would like the model expensive
to run a separate pool.

Users could either use ``operator_args`` or ``default args`` for defining the default behavior:

.. code-block:: python

dbt_task_group = DbtTaskGroup(
# ...
profile_config=ProfileConfig,
default_args={"pool": "default_pool"},
)

While configuring in the ``dbt_project.yml`` a different behaviour for the model "expensive", that should use the "expensive-pool":

.. code-block::

version: 2
models:
- name: expensive
description: description
meta:
cosmos:
operator_kwargs:
pool: expensive-pool


More information about this feature can be found in :ref:`custom-airflow-properties`.

To learn how to customise the profile per dbt model or Cosmos task, check :ref:`profile-customise-per-node`.

Summary of Cosmos-specific arguments
------------------------------------

@@ -193,15 +152,15 @@ Example usage of templated ``dbt_cmd_flags`` for microbatch models with event-ti
},
)

The following template fields are only selectable when using the operators in a standalone context (starting in Cosmos 1.4):
The following template fields are only selectable when using the operators in a standalone context via the ``operator_args`` parameter (starting in Cosmos 1.4):

- ``select``
- ``exclude``
- ``selector``
- ``models``

Since Airflow resolves template fields during Airflow DAG execution and not DAG parsing, the args above cannot be templated via ``DbtDag`` and ``DbtTaskGroup`` because both need to select dbt nodes during DAG parsing.
Since Airflow resolves template fields during Airflow DAG execution and not DAG parsing, the args above cannot be templated via ``DbtDag`` and ``DbtTaskGroup`` because both need to select dbt nodes during DAG parsing.

Additionally, the SQL for compiled dbt models is stored in the template fields, which is viewable in the Airflow UI for each task run.
This is provided for telemetry on task execution, and is not an operator arg.
For more information about this, see the `Compiled SQL <compiled-sql.html>`_ docs.
For more information about this, see the `Compiled SQL <../../cosmos_devex/compiled-sql.html>`_ docs.
8 changes: 5 additions & 3 deletions docs/guides/run_dbt/operators/operators.rst
@@ -1,10 +1,12 @@
.. _operators:

Operators
=========
dbt command operators
=====================

Cosmos exposes individual operators that correspond to specific dbt commands, which can be used just like traditional
`Apache Airflow® <https://airflow.apache.org/>`_ operators. Cosmos names these operators using the format ``Dbt<dbt-command><execution-mode>Operator``. For example, ``DbtBuildLocalOperator``.
`Apache Airflow® <https://airflow.apache.org/>`_ operators. Cosmos names these operators using the format ``Dbt<dbt-command><execution-mode>Operator``.
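
The naming format above can be sketched in plain Python. This is only an illustration of the ``Dbt<dbt-command><execution-mode>Operator`` pattern: ``operator_class_name`` is a hypothetical helper, not part of Cosmos, and composing a name does not guarantee that Cosmos ships an operator for that command and execution mode.

```python
import re


def operator_class_name(dbt_command: str, execution_mode: str) -> str:
    """Compose a class name following Dbt<dbt-command><execution-mode>Operator.

    Hypothetical helper for illustration only; it is not a Cosmos API.
    """

    def camel(value: str) -> str:
        # Split on hyphens, underscores, or whitespace and capitalise each piece.
        parts = re.split(r"[-_\s]+", value)
        return "".join(part.capitalize() for part in parts if part)

    return f"Dbt{camel(dbt_command)}{camel(execution_mode)}Operator"


print(operator_class_name("clone", "local"))
print(operator_class_name("seed", "local"))
```

Reading the pattern this way makes it easier to guess which class to look for in the Cosmos API reference for a given dbt command and execution mode.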

The following examples show ``DbtCloneLocalOperator`` and ``DbtSeedLocalOperator``. You can see the full ``example_operator`` Dag in the `dev/dags directory <https://github.com/astronomer/astronomer-cosmos/blob/main/dev/dags/example_operators.py>`_.

Clone
-----
40 changes: 40 additions & 0 deletions docs/guides/run_dbt/operators/overriding-operator-args.rst
@@ -0,0 +1,40 @@
.. _operator-args-per-node:

Overriding operator arguments per dbt node (or group of nodes)
==============================================================

.. versionadded:: 1.8.0

Cosmos 1.8 introduced the capability for users to customise the operator arguments per dbt node, or per group of dbt nodes.
This can be done by defining the arguments via a dbt meta property alongside other dbt project configurations.

Let's say there is a ``DbtTaskGroup`` that sets a default pool for all of its dbt tasks, but a user would like the model ``expensive``
to run in a separate pool.

Users could use either ``operator_args`` or ``default_args`` to define the default behavior:

.. code-block:: python

    dbt_task_group = DbtTaskGroup(
        # ...
        profile_config=ProfileConfig,
        default_args={"pool": "default_pool"},
    )

Then configure a different behaviour in ``dbt_project.yml`` for the model ``expensive``, which should use the ``expensive-pool`` pool:

.. code-block:: yaml

    version: 2
    models:
      - name: expensive
        description: description
        meta:
          cosmos:
            operator_kwargs:
              pool: expensive-pool


More information about this feature can be found in :ref:`custom-airflow-properties`.
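
The precedence described above can be pictured as a dictionary merge in which node-level values win over the defaults. The following is a minimal sketch under that assumption. ``resolve_operator_kwargs`` is a hypothetical helper written for this illustration, not Cosmos' actual implementation.

```python
def resolve_operator_kwargs(default_args: dict, node_meta: dict) -> dict:
    """Merge default arguments with per-node overrides (illustrative only).

    `node_meta` mirrors the dbt `meta` property shown above; values under
    meta -> cosmos -> operator_kwargs take precedence over the defaults.
    """
    overrides = node_meta.get("cosmos", {}).get("operator_kwargs", {})
    return {**default_args, **overrides}


default_args = {"pool": "default_pool"}
expensive_meta = {"cosmos": {"operator_kwargs": {"pool": "expensive-pool"}}}
other_meta = {}  # a model without any Cosmos-specific meta

print(resolve_operator_kwargs(default_args, expensive_meta))  # {'pool': 'expensive-pool'}
print(resolve_operator_kwargs(default_args, other_meta))  # {'pool': 'default_pool'}
```

Only the model that declares ``operator_kwargs`` changes pool. Every other model keeps the default.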

To learn how to customise the profile per dbt model or Cosmos task, check :ref:`profile-customise-per-node`.
2 changes: 2 additions & 0 deletions docs/reference/configs/execution-config.rst
@@ -1,3 +1,5 @@
.. _execution-config:

Execution Config
==================
