
Add telemetry tracking for dbt docs plugin usage #2240

Merged
pankajkoti merged 8 commits into main from telemetry-dbt-docs
Dec 31, 2025

@pankajkoti pankajkoti commented Dec 30, 2025

Adds telemetry emission to track dbt docs plugin usage via Scarf, capturing how users access and configure the dbt documentation viewer.

Metrics Tracked

  • storage_type: Backend storage type (s3, gcs, azure, http, local, or not_configured)
  • dbt_docs_configured: Whether the docs directory is configured
  • uses_custom_conn: Whether a custom connection ID is used
  • has_custom_name: Whether a custom project name is set (Airflow 3 only)
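
For illustration, the four metrics above could be collected into one flat payload roughly as follows. This is a minimal sketch, not the code from this PR: `build_docs_metrics` and all of its parameters are hypothetical names, and it assumes `has_custom_name` is only included when running under Airflow 3.

```python
from typing import Optional


def build_docs_metrics(
    storage_type: str,
    docs_dir: Optional[str],
    conn_id: Optional[str],
    custom_name: Optional[str] = None,
    airflow3: bool = False,
) -> dict:
    """Hypothetical helper: gather the dbt docs usage metrics into one dict."""
    configured = bool(docs_dir)
    metrics = {
        # Report "not_configured" when no docs directory is set.
        "storage_type": storage_type if configured else "not_configured",
        "dbt_docs_configured": configured,
        "uses_custom_conn": conn_id is not None,
    }
    # Assumption: the custom project name is only meaningful on Airflow 3.
    if airflow3:
        metrics["has_custom_name"] = custom_name is not None
    return metrics
```

Only booleans and a coarse storage label are recorded in this sketch; the directory path, connection ID and project name themselves never enter the payload.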

I have tested that the events are getting emitted to Scarf for both Airflow 2 and Airflow 3 plugins

closes: #2111

- Track dbt docs access via Scarf telemetry
- Capture metrics: storage_type, is_configured, uses_custom_conn
- Add _get_storage_type helper method to detect storage backend
- Add comprehensive tests for telemetry emission
- Fixes #2111
Move telemetry emission from dbt_docs_view to dbt_docs_index endpoint
because Airflow 3 navigation menu links directly to dbt_docs_index.html,
bypassing the wrapper view. Update tests to match actual user access path.
- Adjust plugin tests to expect 404 when docs are not configured
- Add missing index.html file in local storage test
- Fix telemetry tests to check for log level presence instead of startswith
  to accommodate Airflow 3.1 logging format changes
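
The switch from a prefix check to a presence check can be sketched as below. The two sample log lines are invented for illustration and are not actual Airflow output; the helper names are hypothetical as well.

```python
def has_level_prefix(line: str, level: str) -> bool:
    """Old-style check: the line must begin with the level name."""
    return line.startswith(level)


def has_level_anywhere(line: str, level: str) -> bool:
    """Relaxed check: the level name may appear anywhere in the line."""
    return level in line


# Invented sample lines: one starts with the level, one has a leading timestamp.
level_first_line = "INFO - Telemetry is disabled"
timestamp_first_line = "2025-12-30 08:15:00 [INFO] Telemetry is disabled"

# The prefix check only matches the first layout.
prefix_results = [has_level_prefix(s, "INFO") for s in (level_first_line, timestamp_first_line)]

# The presence check matches both layouts.
presence_results = [has_level_anywhere(s, "INFO") for s in (level_first_line, timestamp_first_line)]
```

The presence check is looser, so it tolerates anything placed before the level name, at the cost of also matching a message body that happens to contain the level name.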

netlify Bot commented Dec 30, 2025

Deploy Preview for astronomer-cosmos canceled.

Name Link
🔨 Latest commit 9047edc
🔍 Latest deploy log https://app.netlify.com/projects/astronomer-cosmos/deploys/6954d9ddf48435000804ce36

codecov Bot commented Dec 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.99%. Comparing base (0fa0163) to head (9047edc).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2240   +/-   ##
=======================================
  Coverage   97.98%   97.99%           
=======================================
  Files          95       96    +1     
  Lines        6197     6220   +23     
=======================================
+ Hits         6072     6095   +23     
  Misses        125      125           

Remove duplicate `_get_storage_type` implementations from both Airflow 2 and Airflow 3 plugins and consolidate into a single `get_storage_type_from_path()` utility function in `cosmos/plugin/storage.py`. This eliminates code duplication and makes the function reusable across plugins. Updated corresponding tests to use the new utility function directly.
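
One way such a shared helper might classify a docs path is by its URI scheme. The sketch below is an assumption about the general technique, not the implementation in `cosmos/plugin/storage.py`: the function name `guess_storage_type` and the scheme table are hypothetical, chosen only to produce the storage labels listed in the PR description.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumed mapping from URI scheme to the storage labels in the PR description.
_SCHEME_TO_STORAGE = {
    "s3": "s3",
    "gs": "gcs",
    "abfs": "azure",
    "abfss": "azure",
    "wasb": "azure",
    "wasbs": "azure",
    "http": "http",
    "https": "http",
}


def guess_storage_type(path: Optional[str]) -> str:
    """Hypothetical helper: map a docs path to a coarse storage label."""
    if not path:
        return "not_configured"
    # urlparse lowercases the scheme; plain filesystem paths have none.
    scheme = urlparse(path).scheme
    return _SCHEME_TO_STORAGE.get(scheme, "local")
```

Keeping the logic in one module-level function means both the Airflow 2 and Airflow 3 plugins can call it, and the tests can exercise it without building a plugin instance.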
@pankajkoti pankajkoti marked this pull request as ready for review December 31, 2025 08:15
Copilot AI left a comment

Pull request overview

This PR adds telemetry tracking for the dbt docs plugin to understand how users access and configure the documentation viewer across both Airflow 2 and 3.

  • Adds telemetry emission when users access dbt docs, tracking storage type, configuration status, and custom settings
  • Introduces a new utility function to detect storage backend type from file paths
  • Updates test assertions to use more flexible string matching for log level verification

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file
| File | Description |
| --- | --- |
| cosmos/plugin/storage.py | New utility module for detecting storage backend type from file paths |
| cosmos/plugin/airflow2.py | Adds telemetry emission when dbt docs are accessed in Airflow 2 |
| cosmos/plugin/airflow3.py | Adds telemetry emission when dbt docs are accessed in Airflow 3 |
| tests/plugin/test_plugin_af2.py | Adds tests for telemetry emission in Airflow 2 plugin |
| tests/plugin/test_plugin_af3.py | Adds tests for telemetry emission in Airflow 3 plugin and storage type detection |
| tests/test_telemetry.py | Updates test assertions to use substring matching instead of prefix matching for log levels |

@tatiana tatiana left a comment

I liked the approach, @pankajkoti! It will be exciting to see how people are using this feature.

Could you please create a separate PR updating our privacy policy to cover both the change in this PR (#2240) and the changes introduced in #2228?

@pankajkoti pankajkoti merged commit d58e0aa into main Dec 31, 2025
90 checks passed
@pankajkoti pankajkoti deleted the telemetry-dbt-docs branch December 31, 2025 11:58
pankajkoti added a commit that referenced this pull request Jan 6, 2026
Add documentation for DAG run telemetry metrics (load mode, invocation mode, dbt deps, node converters, test/source behavior, model counts) and dbt docs plugin metrics (storage type, docs configuration, custom connections, custom project name).

These metrics were added in PR #2223 and PR #2240 but were not reflected in the privacy documentation.
pankajkoti added a commit that referenced this pull request Jan 6, 2026
Add PRIVACY NOTICE documentation for DAG run telemetry metrics (load
mode, invocation mode, dbt deps, node converters, test/source behavior,
model counts) and dbt docs plugin metrics (storage type, docs
configuration, custom connections, custom project name).

These metrics were added in PR #2223, PR #2228, and PR #2240, but were
not reflected in the privacy documentation.

closes: #2248
@tatiana tatiana added this to the Cosmos 1.13.0 milestone Jan 29, 2026
@pankajastro pankajastro mentioned this pull request Jan 29, 2026
tatiana added a commit that referenced this pull request Jan 30, 2026
Features

* Support cross-referencing models across dbt projects using dbt-loom by
@pankajkoti in #2271
* Support use of YAML selectors when using ``LoadMode.DBT_MANIFEST`` by
@YourRoyalLinus in #2261
* Introduce ``ExecutionMode.WATCHER_KUBERNETES`` to use the watcher with
``KubernetesPodOperator`` by @tatiana in #2207
* Add support for StarRocks profile mapping by @kurkim0661 in #2256
* Allow pushing URIs as XComs for Cosmos tasks by @corsettigyg in #2275
* Support defining custom callbacks alongside the ``WATCHER_KUBERNETES``
callback by @johnhoran in #2307

Enhancements

* Refactor: remove duplicate ``_construct_dest_file_path`` by @jx2lee in
#2077
* Leverage Airflow ``::group::`` to group logs associated with DAG
parsing by @tatiana in #2235
* Refactor ``DbtConsumerWatcherSensor`` for reusability by @tatiana in
#2245
* Restore plain text output when using ``ExecutionMode.WATCHER`` by
@tiovader in #2241

Bug Fixes

* Fix running empty models or ephemeral nodes in
``ExecutionMode.WATCHER`` by @tatiana in #2279
* Improve watcher producer task priority in scheduling and the UI by
@tatiana in #2237
* Fix typos and formatting issues in documentation by @pankajkoti in
#2259
* Allow watcher producer retries without erroring by @tatiana in #2283
* Fix ``TestBehavior.AFTER_ALL`` is missing project_name information
when loading project using manifest file by @tuantran0910 in #2242
* Fix duplicate log lines in watcher subprocess execution and format
timestamps by @pankajkoti in #2301

Docs

* Add Watcher Kubernetes documentation by @tatiana in #2303
* Document newly added telemetry metrics in the privacy notice by
@pankajkoti in #2249
* Add compatibility policy document by @pankajastro in #2251
* Improve watcher documentation related to dbt threads by @tatiana in
#2273
* Fix link in watcher execution mode documentation by @jedcunningham in
#2277
* Update Apache Airflow minimum compatibility policy by @tatiana in
#2285
* Clarify Cosmos runtime support until "End of Basic Support" by
@jedcunningham in #2286
* Update watcher docs by @tatiana in #2298
* Update watcher kubernetes documentation by @tatiana in #2306

Others

* Add Airflow 3 DAG versioning tests for Cosmos by @michal-mrazek in
#2177
* Add dbt Core 1.11 to the test matrix by @tatiana in #2230
* Add integration tests using InvocationMode.SUBPROCESS and validate
output by @tatiana in #2287
* Fix main branch failing tests by @tatiana in #2296
* Update pre-commit hooks to the latest versions by @jedcunningham in
#2289
* Pre-commit autoupdates by @pre-commit in #2222, #2264, #2274 and #2290
* Dependabot updates by @dependabot in #2218, #2219, #2220, #2280 and
#2284
* Add Scarf metrics to understand Cosmos feature usage patterns
  - Add telemetry tracking for dbt docs plugin usage by @pankajkoti in
    #2240
  - Add DAG run telemetry metrics for load mode, invocation, and
    render_config parameters by @pankajkoti in #2223
  - Collect profile metrics for DAG runs by @pankajastro in #2228
  - Compress telemetry metadata to reduce serialized DAG size by
    @pankajkoti in #2252
  - Skip storing telemetry metadata when emission is disabled by
    @pankajkoti in #2278
  - Hide telemetry metadata parameters from the Airflow trigger UI by
    @pankajkoti in #2247

closes:
astronomer/oss-integrations-private#317

---------

Co-authored-by: Tatiana Al-Chueyr <tatiana.alchueyr@gmail.com>

Development

Successfully merging this pull request may close these issues.

Track dbt Docs utilisation via Scarf

3 participants