[Docs] Getting Started - Table of Contents restructure #2408

Closed
lzdanski wants to merge 11 commits into astronomer:main from lzdanski:getting-started-toc-draft

@lzdanski lzdanski commented Feb 25, 2026

Description

Restructures the existing Getting Started Section in docs to better organize the contents into expandable sections, without requiring redirects.

This PR does not address moving Execution Modes to another subdirectory, but focuses on adapting the docs currently visible in the production environment.

Related Issue(s)

N/A

Breaking Change?

Since no files are moving between directories in the backend, no broken links are anticipated.

Checklist

  • I have made corresponding changes to the documentation (if required)
  • I have added tests that prove my fix is effective or that my feature works

Cosmos Fundamentals
===================

Information about important cosmos concepts go here
Collaborator

Suggested change
- Information about important cosmos concepts go here
+ Information about important Cosmos concepts go here

Comment thread docs/getting_started/index.rst Outdated
Comment on lines +16 to +21
Run Cosmos <run-cosmos>
astro
aws-container-run-job
gcc
mwaa
open-source
Collaborator

@tatiana tatiana Feb 26, 2026

The structure makes sense, but the headings need tightening for clarity and consistency to look more professional.

Currently, this renders:

  • Run Cosmos
  • Getting Started on Astro
  • Getting Started with Astronomer Cosmos on AWS ECS
  • Getting Started on Google Cloud Composer (GCC)
  • Getting Started on MWAA
  • Getting Started on Open Source Airflow

A few issues:

  • Inconsistent phrasing (“Run Cosmos” vs “Getting Started…”)
  • Imprecise platform names (some titles use only the acronym, others use both the name and the acronym; some include the platform, others don't)
  • Mixed tone (imperative vs descriptive)

Would it make sense to bring Open Source to the top before commercial platforms, and then sort them alphabetically?
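
A minimal sketch of how the entries could look with that ordering, assuming short noun-only titles taken from the rendered list above. The titles and the caption are illustrative rather than agreed wording, and the `:maxdepth:` and `:hidden:` options are assumed to match the other toctrees in this file. The document names are the ones already present in the diff.

```rst
.. Sketch only: titles, caption, and options are assumptions.

.. toctree::
   :maxdepth: 1
   :hidden:
   :caption: Run Cosmos

   Open Source Airflow <open-source>
   Astro <astro>
   AWS ECS <aws-container-run-job>
   Google Cloud Composer <gcc>
   MWAA <mwaa>
```

An explicit `Title <document>` entry only changes the label shown in the navigation, so the page headings themselves could be tightened separately.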

Comment thread docs/getting_started/index.rst Outdated
:caption: Cosmos Fundamentals

Cosmos fundamentals <cosmos-fundamentals>
Similar dbt and Airflow <dbt-airflow-concepts>
Collaborator

It absolutely makes sense to have a section like that in your documentation, but “Similar dbt and Airflow” sounds slightly awkward and could confuse readers.

Something like one of the options below may be clearer:

  • dbt vs Airflow
  • How dbt Compares to Airflow

Comment thread docs/getting_started/index.rst Outdated
Comment on lines +6 to +8
:caption: Cosmos Fundamentals

Cosmos fundamentals <cosmos-fundamentals>
Collaborator

@tatiana tatiana Feb 26, 2026

This currently renders:

Cosmos Fundamentals

  • Cosmos Fundamentals

Could we avoid redundant naming of parents and children?
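
One possible shape, sketched under assumptions: the child label "Overview" is hypothetical, and the second label borrows one of the options floated in the earlier thread on "Similar dbt and Airflow". The document names are unchanged from the diff.

```rst
.. Sketch only: "Overview" is a hypothetical child title.

.. toctree::
   :maxdepth: 1
   :hidden:
   :caption: Cosmos Fundamentals

   Overview <cosmos-fundamentals>
   How dbt Compares to Airflow <dbt-airflow-concepts>
```

With this, the caption and its first child no longer render as the same string.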

Comment thread docs/getting_started/index.rst Outdated
.. toctree::
   :maxdepth: 1
   :hidden:
   :caption: Execution Modes
Collaborator

@tatiana tatiana Feb 26, 2026

Please rename this section from “Execution Modes.”

In several of our earlier conversations, we discussed that this term is not meaningful to users who are just getting started with running dbt in Airflow. For someone simply trying to run dbt in Airflow for the first time, “Execution Modes” feels abstract and doesn’t clearly communicate what the section is about.

Could we consider renaming this to something more user-centred and task-oriented? For example:

  • Ways to Run dbt in Airflow
  • Ways to Run dbt with Cosmos
  • Running dbt in Airflow
  • How Cosmos Executes dbt
  • Choosing How to Run dbt

The goal would be to make the section immediately understandable to someone scanning the "Getting Started" guide, rather than introducing terminology that requires additional context.

Comment on lines 28 to 37
@@ -17,11 +35,14 @@
Airflow Async Execution Mode <async-execution-mode>
Watcher Execution Mode <watcher-execution-mode>
Watcher Kubernetes Execution Mode <watcher-kubernetes-execution-mode>
Collaborator

@tatiana tatiana Feb 26, 2026

This currently renders:

Execution Modes

  • Execution Modes
  • Airflow and dbt dependencies conflicts
  • Docker Execution Mode
  • Azure Container instance execution mode
  • AWS Container Run Job Execution Mode
  • GCP Cloud Run Job Execution Mode
  • Airflow Async Execution Mode
  • Watcher Execution Mode
  • Watcher Kubernetes Execution Mode

I think this part of the table of contents can cause confusion and reduce scannability for new users. A few concerns:

  1. Redundant Parent and Child Naming
  2. Overuse of the Phrase “Execution Mode”
  3. Inconsistent Naming Conventions
  4. Concept vs Implementation Mixing
  5. Not Beginner-Friendly in a “Getting Started” Context

On (3), inconsistent naming, platforms and concepts are presented inconsistently:

  • Some are infrastructure-based (Docker, Azure Container Instance)
  • Some are cloud-job based (AWS Container Run Job, GCP Cloud Run Job)
  • Some are behaviour-based (Airflow Async, Watcher)
  • Some are dependency-related (Airflow and dbt dependency conflicts)

These are not all in the same conceptual category, yet they’re grouped under the same heading as if they were equivalent “modes.”

On (4), concept vs implementation, the list mixes:

  • Conceptual topics (dependency conflicts)
  • Execution strategies (async, watcher)
  • Infrastructure backends (Docker, Cloud Run, etc.)

This makes the mental model unclear. A user might reasonably ask:

  • Are these all mutually exclusive modes?
  • Are some of them infrastructure options?
  • Are some features layered on top of others?

Lastly, on (5), for someone just trying to run dbt in Airflow, seeing:

  • Watcher Kubernetes Execution Mode
  • AWS Container Run Job Execution Mode

…is overwhelming and abstract.

Instead of a smooth getting-started experience, it introduces internal terminology before grounding the user in the task.

Maybe we should break it down into two parts:

dbt Installation

  • Installing dbt (this could cover: the same virtualenv as Airflow via requirements.txt, a dedicated user-managed virtualenv via Dockerfile, a dedicated Cosmos-managed virtualenv using ExecutionMode.VIRTUALENV, or a container image that contains dbt)
  • Installing dbt dependencies (ProjectConfig.install_dbt_deps)

Running dbt in Airflow

  • Running dbt in the Airflow worker environment
    • Standard Execution
    • Watcher Execution
    • Async Execution
  • Running dbt in a container
    • Kubernetes
    • Kubernetes Watcher
    • Docker (highlight in this section that it is not compatible with Astro due to Docker-in-Docker issues)
    • AWS ECS
    • AWS EKS
    • Azure Container Instance
    • GCP Cloud Run
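
As a rough sketch of the "Running dbt in Airflow" part of this outline, assuming each group becomes its own captioned toctree. Only pages whose document names are visible in this diff are listed; the shortened titles are illustrative, and the remaining pages (Standard, Kubernetes, Docker, AWS, Azure, GCP) are left out because their document names are not shown here.

```rst
.. Sketch only: captions are adapted from the outline above and the
   entry titles are assumptions; pages whose document names are not
   visible in this diff are omitted.

.. toctree::
   :maxdepth: 1
   :hidden:
   :caption: Running dbt in the Airflow worker

   Watcher Execution <watcher-execution-mode>
   Async Execution <async-execution-mode>

.. toctree::
   :maxdepth: 1
   :hidden:
   :caption: Running dbt in a container

   Kubernetes Watcher <watcher-kubernetes-execution-mode>
```

Splitting by where dbt runs keeps the phrase "Execution Mode" out of the navigation while leaving the underlying files in place, which matches the PR's goal of avoiding redirects.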

Comment thread docs/getting_started/index.rst Outdated
Watcher Execution Mode <watcher-execution-mode>
Watcher Kubernetes Execution Mode <watcher-kubernetes-execution-mode>
dbt and Airflow Similar Concepts <dbt-airflow-concepts>

Collaborator

@tatiana tatiana Feb 26, 2026

Lastly, I strongly believe we need a section in the "Getting Started" table of contents highlighting two topics that are very important:

Connecting to your database

  • Using profiles.yml
  • Using Airflow Connections

Bringing your dbt Project into Airflow

  • Choosing task granularity
    • One task per dbt node # DbtDag
    • Combining dbt and non-dbt tasks # DbtTaskGroup
    • Running multiple dbt nodes in a single task # Instantiating operators
  • Choosing a parsing strategy
    • Using manifest.json
    • Using dbt ls
  • Selecting what to run
    • Using dbt selectors in Cosmos
  • Enabling Tests
    • After each model
    • At the end of the pipeline
    • Using dbt build
    • Disabling tests
  • Managing Sources
    • Selecting sources to run
    • Running source freshness checks

@lzdanski lzdanski closed this Feb 27, 2026