Update dbt and Airflow Spark URLs in profile mapping docstrings#2695

Merged
pankajastro merged 1 commit into main from docs/fix-external-profile-links
Jun 2, 2026

Conversation

@pankajastro
Contributor

Summary

  • dbt restructured the docs site: /docs/core/connect-data-platform/<X>-setup and /reference/warehouse-setups/<X>-setup now 404 or redirect to /docs/local/connect-data-platform/<X>-setup. Linkcheck flagged ~20 profile pages as [redirected permanently] and 7 as [broken] (anchors not found).
  • Updated the docstrings of every affected profile mapping class under cosmos/profiles/. The docs/reference/profiles/*.rst files are regenerated from these docstrings via docs/generate_mappings.py at build time, so the source-of-truth edit is the docstring (per CLAUDE.md).
  • Also fixed the Airflow Apache Spark provider docs URL: connections/spark.html now returns 404 because the provider docs are split into spark-connect / spark-sql / spark-submit pages. Pointed at connections/index.html instead.
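
For illustration only, the URL migration described above could be expressed as a small rewrite helper. This is a minimal sketch, not the script used in this PR: the `docs.getdbt.com` host is taken from the example URL in the test plan, while the function name, the regex, and the `keep_anchor` flag are hypothetical.

```python
import re

# The two legacy roots named in the summary. The page slug is assumed to be
# lowercase letters, digits, and hyphens ending in "-setup".
LEGACY_SETUP_URL = re.compile(
    r"^https://docs\.getdbt\.com/"
    r"(?:docs/core/connect-data-platform|reference/warehouse-setups)/"
    r"(?P<page>[a-z0-9-]+-setup)"
    r"(?P<anchor>#.*)?$"
)
NEW_ROOT = "https://docs.getdbt.com/docs/local/connect-data-platform"


def migrate_dbt_setup_url(url: str, keep_anchor: bool = False) -> str:
    """Rewrite a legacy dbt setup URL to the new root; leave other URLs alone."""
    match = LEGACY_SETUP_URL.match(url)
    if match is None:
        return url
    # Anchors are dropped by default because many were renamed in the move.
    anchor = (match.group("anchor") or "") if keep_anchor else ""
    return f"{NEW_ROOT}/{match.group('page')}{anchor}"
```

Dropping the anchor by default mirrors the conservative choice made in this PR: an old anchor is only worth keeping once it has been confirmed on the new page.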

Anchor handling

  • Trino (certificate / JWT / LDAP): the page kept the sections but renamed the anchors. Updated to #example-profilesyml-for-{certificate,jwt,ldap} (verified via WebFetch).
  • Snowflake key-pair, BigQuery oauth-via-gcloud, BigQuery service-account-file, Spark thrift: the dbt page still exists but the anchor couldn't be confirmed via WebFetch (Docusaurus body is JS-rendered). Dropped the anchor and linked to the page top — readers can scroll/Ctrl-F. Picking a wrong anchor would silently land users at the top of the page anyway; explicit is better.
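
The anchor policy above amounts to "attach an anchor only when it has been verified, otherwise link to the page top". A rough sketch of that rule follows. The Trino anchor names come from this PR; the lookup table, the function name, and the `(platform, auth_method)` keys are hypothetical and exist only for illustration.

```python
from typing import Optional

DBT_SETUP_ROOT = "https://docs.getdbt.com/docs/local/connect-data-platform"

# Only anchors confirmed in this PR are listed. Everything else falls back
# to the page top rather than guessing an anchor.
VERIFIED_ANCHORS = {
    ("trino", "certificate"): "example-profilesyml-for-certificate",
    ("trino", "jwt"): "example-profilesyml-for-jwt",
    ("trino", "ldap"): "example-profilesyml-for-ldap",
}


def profile_doc_url(platform: str, auth_method: Optional[str] = None) -> str:
    """Return the dbt setup URL, with an anchor only when one is verified."""
    url = f"{DBT_SETUP_ROOT}/{platform}-setup"
    anchor = VERIFIED_ANCHORS.get((platform, auth_method))
    return f"{url}#{anchor}" if anchor else url
```

With this rule, `profile_doc_url("snowflake", "key-pair")` resolves to the plain Snowflake setup page, which matches how the unverified anchors were handled here.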

Files changed

25 files under cosmos/profiles/ (one docstring URL each, +2 in spark/thrift.py):

athena/access_key.py, bigquery/{oauth,service_account_file,service_account_keyfile_dict}.py, clickhouse/user_pass.py, databricks/{oauth,token}.py, duckdb/user_pass.py, exasol/user_pass.py, mysql/user_pass.py, oracle/user_pass.py, postgres/user_pass.py, redshift/user_pass.py, snowflake/{user_pass,user_privatekey,user_privatekey_file,user_encrypted_privatekey_file,user_encrypted_privatekey_env_variable}.py, spark/thrift.py, starrocks/user_pass.py, teradata/user_pass.py, trino/{certificate,jwt,ldap}.py, vertica/user_pass.py

Test plan

  • hatch run docs:build passes with --fail-on-warning and regenerates docs/reference/profiles/*.rst from updated docstrings
  • sphinx-build -b linkcheck docs docs/_build/linkcheck — every previously-flagged dbt and Airflow-Spark link on profile pages is gone from the broken/redirected list
  • Reviewer spot-checks one or two of the new URLs in a browser (e.g., https://docs.getdbt.com/docs/local/connect-data-platform/snowflake-setup)
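
One way to confirm the linkcheck item without reading the whole report by eye is to filter the builder's machine-readable output. The sketch below assumes the linkcheck builder writes an `output.json` file with one JSON object per line carrying `filename`, `status`, and `uri` keys; that layout, the helper name, and the host list are assumptions to check against the installed Sphinx version.

```python
import json
from typing import Iterable, List, Tuple


def flagged_links(
    lines: Iterable[str],
    hosts: Tuple[str, ...] = ("docs.getdbt.com", "airflow.apache.org"),
) -> List[Tuple[str, str, str]]:
    """Return (filename, status, uri) for broken or redirected links on the given hosts."""
    flagged = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        entry = json.loads(line)
        uri = entry.get("uri", "")
        if entry.get("status") in {"broken", "redirected"} and any(h in uri for h in hosts):
            flagged.append((entry.get("filename", ""), entry["status"], uri))
    return flagged


# Example usage (path assumed from the linkcheck command above):
# with open("docs/_build/linkcheck/output.json") as fh:
#     for filename, status, uri in flagged_links(fh):
#         print(f"{status:10} {filename}: {uri}")
```

An empty result for the profile pages would correspond to the second test-plan item passing.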

Out of scope (for follow-up PRs)

  • VerticaUserPassword GitHub blob #L72 / #L138 anchors — always false positives in linkcheck (GitHub renders line anchors via JS).
  • AthenaAccessKey dbt-athena GitHub README #configuring-your-profile anchor — community repo README, unrelated to the dbt docs migration.
  • AthenaAccessKey registry.astronomer.io → airflow.apache.org/registry/ redirect — registry consolidation, unrelated.
  • Hand-written .rst pages that also reference the old dbt URL roots (docs/getting_started/oss-quickstart.rst, docs/getting_started/dbt-airflow-concepts.rst, docs/guides/connect_database/profile-customise-per-node.rst, docs/reference/glossary.rst) — these will be covered in the PR 3 cleanup of hand-written external links.
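
For the GitHub line-anchor false positives listed above, a possible follow-up is to tell linkcheck to skip anchors of the form `L<number>`. The fragment below is a sketch of what could go into the Sphinx `conf.py`; it is not part of this PR. `linkcheck_anchors_ignore` is a standard Sphinx option, while the helper function is hypothetical and only approximates how the patterns are applied (a regex match against the anchor text).

```python
import re

# Candidate setting for the Sphinx conf.py. "^!" is Sphinx's default entry;
# the second pattern skips GitHub-style line anchors such as #L72.
# Note: this applies to every checked URL, not only GitHub links.
linkcheck_anchors_ignore = ["^!", r"^L\d+$"]


def anchor_is_ignored(anchor: str) -> bool:
    """Approximate the check: True if any ignore pattern matches the anchor."""
    return any(re.match(pattern, anchor) for pattern in linkcheck_anchors_ignore)
```

The pattern is deliberately narrow, so named anchors such as `configuring-your-profile` would still be checked.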

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings May 18, 2026 18:28
@pankajastro pankajastro requested review from a team, corsettigyg, dwreeves and jbandoro as code owners May 18, 2026 18:28
@pankajastro pankajastro requested review from pankajkoti and tatiana May 18, 2026 18:28
Contributor

Copilot AI left a comment

Pull request overview

This PR updates profile mapping docstring links so generated profile reference docs point to current dbt and Airflow provider documentation.

Changes:

  • Replaced outdated dbt setup URLs with /docs/local/connect-data-platform/... links.
  • Updated Trino auth anchors to current section anchors.
  • Replaced the obsolete Airflow Spark connection page link with the provider connections index.

Comment thread cosmos/profiles/bigquery/oauth.py
Comment thread cosmos/profiles/bigquery/service_account_file.py
pankajastro added a commit that referenced this pull request May 19, 2026
## Summary

- Eight links in hand-written `.rst` files no longer resolve because the
upstream target was renamed, moved, removed, or the section anchor was
changed. Updated each to the current upstream URL.
- All replacement URLs verified live (WebFetch / `gh api`) before
committing — no guessed paths.

## Fixes

| File | Broken target | Replacement |
|---|---|---|
| `policy/airflow3-compatibility.rst` | OpenLineage `/guides/user.html` (404) | `/guides/structure.html` |
| `policy/compatibility-policy.rst` | Airflow `/security/end-of-life.html` (404) | `/installation/supported-versions.html` |
| `guides/run_dbt/container/aws-container-run-job.rst` | `docs.astronomer.io/cosmos/` (404) | `www.astronomer.io/docs/learn/airflow-dbt` |
| `guides/cosmos_devex/lineage.rst` | OpenLineage `processor.py#L36C1-L47C22` (file moved) | `tree/main/integration/dbt` (new module root) |
| `guides/run_dbt/container/gcp-cloud-run-job.rst` | 5 broken anchors on `/iam/docs/understanding-roles` | Per-role pages under `/iam/docs/roles-permissions/`. Owner stays on `understanding-roles` since basic roles aren't service-scoped. |
| `getting_started/dbt-airflow-concepts.rst` | dbt `introduction#dbt-optimizes-your-workflow` (anchor not found) | `#why-use-dbt` (current section name) |
| `getting_started/core-concepts.rst` | `execution-modes.html#invocation-modes` (anchor not found) | `render-config.html#how-to-run-dbt-ls-invocation-mode` (where the section lives now) |
## Test plan

- [x] `hatch run docs:build` passes with `--fail-on-warning`
- [ ] `sphinx-build -b linkcheck` — pending verification (build was
clean; linkcheck run separately should confirm zero regressions on these
eight)
- [ ] Reviewer spot-checks 2-3 replacement URLs in a browser

## Out of scope (deferred)

- The handful of permanent-redirect (not broken) dbt URLs in
hand-written docs (`oss-quickstart.rst`,
`dbt-airflow-concepts.rst:24-25`, `profile-customise-per-node.rst`,
`reference/glossary.rst`, `compatibility-policy.rst:97`) — same pattern
as PR #2695 but for hand-written content. Could be folded in here, but
left out to keep this PR focused on truly-broken links.
- The handful of permanent-redirect GCP/Astronomer URLs (different
platforms migrating off `cloud.google.com` → `docs.cloud.google.com`
etc.) — separate cleanup.
- `VerticaUserPassword` GitHub `#L\d+` anchors — always false positives
in linkcheck (GitHub renders line anchors via JS); belongs in a
`linkcheck_ignore` config PR.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The dbt docs site moved all warehouse-setup pages from two legacy
roots to a single new location:

  /docs/core/connect-data-platform/<X>-setup  -> 404 / redirect
  /reference/warehouse-setups/<X>-setup       -> 404 / redirect
  /docs/local/connect-data-platform/<X>-setup -> current home

Linkcheck flagged these as redirected-permanently for ~20 profile
pages and as broken anchors for seven (Snowflake key-pair-auth,
BigQuery oauth-via-gcloud, BigQuery service-account-file, Spark
thrift, Trino certificate/jwt/ldap).

This commit updates the docstrings of every affected profile
mapping class under cosmos/profiles/. The docs build regenerates
docs/reference/profiles/*.rst from these docstrings via
docs/generate_mappings.py, so updating the source is enough.

Anchor handling:

- Trino: dbt kept the per-auth-method sections but renamed the
  anchors; updated to #example-profilesyml-for-{certificate,jwt,ldap}.
- Snowflake key-pair, BigQuery oauth-via-gcloud, BigQuery
  service-account-file, Spark thrift: the page still exists but
  the anchor could not be verified via WebFetch (Docusaurus
  JS-rendered body), so dropped the anchor and linked to the
  page top.

Also fixes the Airflow Apache Spark provider docs URL in
cosmos/profiles/spark/thrift.py: connections/spark.html (404) is
gone since the provider now splits into spark-connect / spark-sql
/ spark-submit pages. Pointed at the connections index.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@pankajastro pankajastro force-pushed the docs/fix-external-profile-links branch from e55d697 to 1a591df Compare May 19, 2026 11:24
@codecov

codecov Bot commented May 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.04%. Comparing base (5b8ade7) to head (1a591df).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2695   +/-   ##
=======================================
  Coverage   98.04%   98.04%           
=======================================
  Files         105      105           
  Lines        7867     7867           
=======================================
  Hits         7713     7713           
  Misses        154      154           

@pankajastro pankajastro merged commit 2d9f6f0 into main Jun 2, 2026
125 checks passed
@pankajastro pankajastro deleted the docs/fix-external-profile-links branch June 2, 2026 15:13