Fix caching unique id error #1295
Closed
tatiana wants to merge 4 commits into
An Astronomer customer using Cosmos 1.5 with caching enabled in production is seeing failures in their
DAGs after they change those DAGs or dbt projects.
They are using the cache based on Airflow Variables.
The error they are facing is:
File /usr/local/lib/python3.11/site-packages/cosmos/dbt/graph.py, line 135, in parse_dbt_ls_output
unique_id=node_dict[unique_id]
KeyError: 'unique_id'
This change adds logs and skips nodes that may not contain the necessary fields. This will allow us to identify the root cause.
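A minimal sketch of the "log and skip" approach described above. This is not the actual Cosmos code: it assumes `dbt ls` emits one JSON object per line, and the names `parse_ls_lines` and `REQUIRED_FIELDS` are hypothetical, chosen only for illustration.

```python
import json
import logging

logger = logging.getLogger(__name__)

# Hypothetical set of keys a JSON line must carry to be treated as a dbt node.
REQUIRED_FIELDS = ("unique_id", "resource_type")


def parse_ls_lines(ls_stdout: str) -> dict[str, dict]:
    """Return nodes keyed by unique_id, skipping lines that are not dbt nodes."""
    nodes: dict[str, dict] = {}
    for line in ls_stdout.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            node_dict = json.loads(line)
        except json.JSONDecodeError:
            # Regular dbt log lines are not JSON; ignore them quietly.
            logger.debug("Skipping non-JSON line: %s", line)
            continue
        if not isinstance(node_dict, dict):
            # Valid JSON, but not an object (for example a list or a number).
            logger.info("Skipping JSON line that is not an object: %s", line)
            continue
        missing = [field for field in REQUIRED_FIELDS if field not in node_dict]
        if missing:
            # Valid JSON object, but not a dbt node (for example `{}`).
            logger.info("Skipping JSON line missing %s: %s", missing, line)
            continue
        nodes[node_dict["unique_id"]] = node_dict
    return nodes
```

Logging the skipped lines keeps the unexpected output visible, which is what makes it possible to trace the offending lines back to their source.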
We identified the problem: a macro was printing JSON in the `dbt ls` output that was not meant to be handled as dbt nodes.
This was the `dbt ls` output, before printing the actual nodes:
We have two solutions for the customer who faced this issue:
In the meantime, I'm going to close this PR, create a new PR against the Cosmos main branch, and add a test case for this.
tatiana added a commit that referenced this pull request Oct 31, 2024
…1296)

This change makes Cosmos more resilient, allowing it to be used even when JSONs do not represent dbt nodes in the `dbt ls` output.

**Context**

An Astronomer customer [raised a P1 incident](https://astronomer.zendesk.com/agent/tickets/67681), mentioning they could no longer run their Cosmos-powered DAGs.

They were using Cosmos 1.5.0, and the issue was observed whenever DAGs were deployed using `astro deploy --dags`, even if they only had whitespace as a difference.

The DAGs could no longer be parsed, raising an exception similar to:

```
File /usr/local/lib/python3.11/site-packages/cosmos/dbt/graph.py, line 135, in parse_dbt_ls_output
unique_id=node_dict[unique_id]
KeyError: 'unique_id'
```

**Explanation**

The customer recently changed their dbt project, adding print debug statements to one of their dbt macros. This caused the `dbt ls` output to contain lines that were valid JSON but were not valid dbt nodes, as observed in:

```
11:20:43 Running with dbt=1.7.6
11:20:45 Registered adapter: bigquery=1.7.2
11:20:45 Unable to do partial parsing because saved manifest not found. Starting full parse.
/***************************/
Values returned by mac_get_values:
{}
/***************************/
{"name": "some_model", "resource_type": "model", "package_name": "some_package", "original_file_path": "models/some_model.sql", "unique_id": "model.some_package.some_model", "alias": "some_model_some_package_1_8_0", "config": {"enabled": true, "alias": "some_model_some_package-1.8.0", "schema": "some_schema", "database": null, "tags": [], "meta": {}, "group": null, "materialized": "view", "incremental_strategy": null, "persist_docs": {}, "post-hook": [], "pre-hook": [], "quoting": {}, "column_types": {}, "full_refresh": null, "unique_key": null, "on_schema_change": "ignore", "on_configuration_change": "apply", "grants": {}, "packages": [], "docs": {"show": true, "node_color": null}, "contract": {"enforced": false, "alias_types": true}, "access": "protected"}, "tags": [], "depends_on": {"macros": [], "nodes": ["source.some_source"]}}
```

Cosmos didn't consider this use case. It assumed that if a line was JSON, it should be a dbt node:

https://github.com/astronomer/astronomer-cosmos/blob/42a397fb40ff537c74bb6f596b4936815b14abbb/cosmos/dbt/graph.py#L161-L185

**Workaround**

If customers updated the macro to print the information in a single line, they'd no longer observe the issue:

```
Values returned by mac_get_values: {}
```

We also released [1.5.0rc2](https://github.com/astronomer/astronomer-cosmos/releases/tag/astronomer-cosmos-v1.5.0rc2) with the change #1295, similar to the one introduced by this PR.

**Fix**

This change makes Cosmos more resilient to scenarios where `dbt ls` may output JSON lines that are not valid dbt nodes. It also logs those lines to help troubleshoot.

We added a unit test to make sure we continue supporting this use case.
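To illustrate why the multi-line macro output breaks parsing while the single-line workaround does not, here is a small reproduction sketch. It is not the Cosmos implementation: `naive_parse` is a hypothetical stand-in for the assumption "every line that parses as JSON is a dbt node", and the sample outputs are trimmed to the two fields needed here.

```python
import json

# Macro prints `{}` on its own line: that line is valid JSON but not a dbt node.
multi_line_output = """/***************************/
Values returned by mac_get_values:
{}
/***************************/
{"unique_id": "model.some_package.some_model", "resource_type": "model"}"""

# Workaround: the macro prints everything on one line, which is not valid JSON.
single_line_output = """Values returned by mac_get_values: {}
{"unique_id": "model.some_package.some_model", "resource_type": "model"}"""


def naive_parse(ls_stdout: str) -> list[str]:
    """Treat every line that parses as JSON as a dbt node (the failing assumption)."""
    unique_ids = []
    for line in ls_stdout.splitlines():
        try:
            node_dict = json.loads(line)
        except json.JSONDecodeError:
            continue  # plain log lines are ignored
        # Raises KeyError when the JSON object lacks "unique_id", as with `{}`.
        unique_ids.append(node_dict["unique_id"])
    return unique_ids
```

Calling `naive_parse(multi_line_output)` raises `KeyError: 'unique_id'` on the bare `{}` line, whereas `naive_parse(single_line_output)` succeeds because the combined line fails JSON decoding and is ignored.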
Workaround for issues faced by customers in production.
An Astronomer customer using Cosmos 1.5.0 with caching enabled in production faces failures in their DAGs once they change those DAGs or dbt projects using
`astro deploy --dags`. DAGs that were not changed continue to work. They are using the cache based on Airflow Variables.
The error they are facing is the `KeyError: 'unique_id'` raised in `parse_dbt_ls_output`.
This change adds logs and skips nodes that may not contain the necessary fields. This will allow us to identify the root cause.