From ac1ed07e8c961053d32ae0ddabbb1054c3f14bfc Mon Sep 17 00:00:00 2001
From: jx2lee
Date: Fri, 17 Apr 2026 16:10:08 +0900
Subject: [PATCH 1/2] updated docs for DbtDocsS3KubernetesOperator

---
 docs/guides/dbt_docs/generating-docs.rst     | 32 ++++++++++++++++++--
 docs/guides/run_dbt/container/kubernetes.rst |  9 ++++--
 2 files changed, 37 insertions(+), 4 deletions(-)

diff --git a/docs/guides/dbt_docs/generating-docs.rst b/docs/guides/dbt_docs/generating-docs.rst
index d641fce82a..e246381388 100644
--- a/docs/guides/dbt_docs/generating-docs.rst
+++ b/docs/guides/dbt_docs/generating-docs.rst
@@ -9,14 +9,15 @@ After generating the dbt docs, you can host them natively within Airflow via the
 
 Alternatively, many users choose to serve these docs on a separate static website. This is a great way to share your data models with a broad array of stakeholders.
 
-Cosmos offers two pre-built ways of generating and uploading dbt docs and a fallback option to run custom code after the docs are generated:
+Cosmos offers pre-built ways of generating and uploading dbt docs, plus a fallback option to run custom code after the docs are generated:
 
 - :class:`~cosmos.operators.DbtDocsS3Operator`: generates and uploads docs to a S3 bucket.
+- :class:`~cosmos.operators.kubernetes.DbtDocsS3KubernetesOperator`: generates docs in a Kubernetes Pod and uploads them to a S3 bucket from inside that Pod.
 - :class:`~cosmos.operators.DbtDocsAzureStorageOperator`: generates and uploads docs to an Azure Blob Storage.
 - :class:`~cosmos.operators.DbtDocsGCSOperator`: generates and uploads docs to a GCS bucket.
 - :class:`~cosmos.operators.DbtDocsOperator`: generates docs and runs a custom callback.
 
-The first three operators require you to have a connection to the target storage. The last operator allows you to run custom code after the docs are generated in order to upload them to a storage of your choice.
+The first four operators require you to have a connection to the target storage. The last operator allows you to run custom code after the docs are generated in order to upload them to a storage of your choice.
 
 Examples
@@ -43,6 +44,33 @@ You can use the :class:`~cosmos.operators.DbtDocsS3Operator` to generate and upl
         bucket_name="test_bucket",
     )
 
+Upload to S3 from Kubernetes
+''''''''''''''''''''''''''''
+
+If you run dbt in :ref:`kubernetes`, use :class:`~cosmos.operators.kubernetes.DbtDocsS3KubernetesOperator`.
+Unlike the local S3 operator, this operator generates the docs and uploads them to S3 from inside the Kubernetes Pod.
+
+This is important because the dbt ``target`` directory is created inside the Pod, not on the Airflow worker that launched it.
+
+Requirements specific to Kubernetes:
+
+- The container image must include your dbt project files.
+- The container image or mounted files must include a ``profiles.yml`` file, because Kubernetes execution mode does not support :doc:`../connect_database/use-profile-mapping`.
+- The container image must have the AWS CLI available because Cosmos uploads the generated docs with ``aws s3 sync``.
+- The Pod still needs the database credentials and any other secrets required to run ``dbt docs generate``.
+
+The following example extends the Kubernetes example DAG and uploads the generated docs to S3:
+
+.. literalinclude:: ../../../dev/dags/jaffle_shop_kubernetes.py
+    :language: python
+    :start-after: [START kubernetes_docs_to_s3_example]
+    :end-before: [END kubernetes_docs_to_s3_example]
+
+The ``connection_id`` is resolved from Airflow and translated into AWS environment variables that are injected into the Pod before ``aws s3 sync`` runs.
+
+.. note::
+    This Kubernetes integration currently supports S3 only. If you need another storage backend, use one of the local operators or extend Cosmos with another Kubernetes docs operator.
+
 Upload to Azure Blob Storage
 ''''''''''''''''''''''''''''
 
diff --git a/docs/guides/run_dbt/container/kubernetes.rst b/docs/guides/run_dbt/container/kubernetes.rst
index cb35c9c3c6..c39e369fd6 100644
--- a/docs/guides/run_dbt/container/kubernetes.rst
+++ b/docs/guides/run_dbt/container/kubernetes.rst
@@ -38,6 +38,8 @@ At the moment, the user is expected to add to the Docker image both:
 - The dbt Profile, which contains the information for dbt to access the database while parsing the project from Apache Airflow nodes
 - Handle secrets
 
+If you plan to generate dbt docs and upload them to S3 from Kubernetes, the image also needs the AWS CLI because Cosmos performs the upload from inside the Pod.
+
 Additional KubernetesPodOperator parameters can be added to the ``operator_args`` parameter of the ``DbtKubernetesOperator``.
 
 For instance,
@@ -47,6 +49,9 @@ For instance,
     :start-after: [START kubernetes_tg_example]
     :end-before: [END kubernetes_tg_example]
 
+To generate dbt docs and upload them to S3 from the same Pod, use :class:`~cosmos.operators.kubernetes.DbtDocsS3KubernetesOperator`.
+See :doc:`../../dbt_docs/generating-docs` for an end-to-end example and the extra requirements for this workflow.
+
 Step-by-step instructions
 +++++++++++++++++++++++++
 
@@ -175,7 +180,7 @@ The Kubernetes execution mode has the following limitations:
 - Does not emit Airflow datasets, assets, and dataset aliases (there is an `open ticket #2329 `__ to address this)
 - Does not handle installing dbt deps for users (there is an `open ticket #679 `__ to address this)
 - Does not support `ProfileMapping `_ (there is an `open ticket #749 `__ to address this)
-- Does not support `Callbacks `_ (there is an `open ticket #1575 `__ to address this)
+- Does not support :doc:`../callbacks/callbacks` (there is an `open ticket #1575 `__ to address this)
 - Does not expose Compiled SQL as a `templated field `_
 - Does not benefit from `Cosmos caching mechanisms `_
-- Does not support `generating dbt docs & uploading to an object store `_ (there is a `PR `_ to solve this for S3)
+- Supports generating dbt docs and uploading them to S3 with :class:`~cosmos.operators.kubernetes.DbtDocsS3KubernetesOperator`; other object stores and callback-based uploads remain unsupported in Kubernetes execution mode

From 15901df4be5a55adf0c98ee11166b543fdea14d5 Mon Sep 17 00:00:00 2001
From: jx2lee
Date: Tue, 21 Apr 2026 09:02:05 +0900
Subject: [PATCH 2/2] applied review

---
 docs/guides/dbt_docs/generating-docs.rst     | 4 +++-
 docs/guides/run_dbt/container/kubernetes.rst | 4 ++--
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/docs/guides/dbt_docs/generating-docs.rst b/docs/guides/dbt_docs/generating-docs.rst
index e246381388..a90cdb7de3 100644
--- a/docs/guides/dbt_docs/generating-docs.rst
+++ b/docs/guides/dbt_docs/generating-docs.rst
@@ -12,7 +12,7 @@ Alternatively, many users choose to serve these docs on a separate static websit
 Cosmos offers pre-built ways of generating and uploading dbt docs, plus a fallback option to run custom code after the docs are generated:
 
 - :class:`~cosmos.operators.DbtDocsS3Operator`: generates and uploads docs to a S3 bucket.
-- :class:`~cosmos.operators.kubernetes.DbtDocsS3KubernetesOperator`: generates docs in a Kubernetes Pod and uploads them to a S3 bucket from inside that Pod.
+- :class:`~cosmos.operators.kubernetes.DbtDocsS3KubernetesOperator` (introduced in Cosmos 1.15.0): generates docs in a Kubernetes Pod and uploads them to an S3 bucket from inside that Pod.
 - :class:`~cosmos.operators.DbtDocsAzureStorageOperator`: generates and uploads docs to an Azure Blob Storage.
 - :class:`~cosmos.operators.DbtDocsGCSOperator`: generates and uploads docs to a GCS bucket.
 - :class:`~cosmos.operators.DbtDocsOperator`: generates docs and runs a custom callback.
@@ -47,6 +47,8 @@ You can use the :class:`~cosmos.operators.DbtDocsS3Operator` to generate and upl
 Upload to S3 from Kubernetes
 ''''''''''''''''''''''''''''
 
+.. versionadded:: 1.15.0
+
 If you run dbt in :ref:`kubernetes`, use :class:`~cosmos.operators.kubernetes.DbtDocsS3KubernetesOperator`.
 Unlike the local S3 operator, this operator generates the docs and uploads them to S3 from inside the Kubernetes Pod.
 
diff --git a/docs/guides/run_dbt/container/kubernetes.rst b/docs/guides/run_dbt/container/kubernetes.rst
index c39e369fd6..83b2002d9e 100644
--- a/docs/guides/run_dbt/container/kubernetes.rst
+++ b/docs/guides/run_dbt/container/kubernetes.rst
@@ -49,7 +49,7 @@ For instance,
     :start-after: [START kubernetes_tg_example]
     :end-before: [END kubernetes_tg_example]
 
-To generate dbt docs and upload them to S3 from the same Pod, use :class:`~cosmos.operators.kubernetes.DbtDocsS3KubernetesOperator`.
+To generate dbt docs and upload them to S3 from the same Pod, use :class:`~cosmos.operators.kubernetes.DbtDocsS3KubernetesOperator` and Cosmos 1.15.0 or higher.
 See :doc:`../../dbt_docs/generating-docs` for an end-to-end example and the extra requirements for this workflow.
 
 Step-by-step instructions
@@ -183,4 +183,4 @@ The Kubernetes execution mode has the following limitations:
 - Does not support :doc:`../callbacks/callbacks` (there is an `open ticket #1575 `__ to address this)
 - Does not expose Compiled SQL as a `templated field `_
 - Does not benefit from `Cosmos caching mechanisms `_
-- Supports generating dbt docs and uploading them to S3 with :class:`~cosmos.operators.kubernetes.DbtDocsS3KubernetesOperator`; other object stores and callback-based uploads remain unsupported in Kubernetes execution mode
+- Since 1.15.0, supports generating dbt docs and uploading them to S3 with :class:`~cosmos.operators.kubernetes.DbtDocsS3KubernetesOperator`; other object stores and callback-based uploads remain unsupported in Kubernetes execution mode