
Fix tests that are not passing in #2261 #1

Closed
tatiana wants to merge 22 commits into YourRoyalLinus:2257-yaml-selector-support-with-manifest-loadmode from astronomer:2257-yaml-selector-support-with-manifest-loadmode

Conversation

@tatiana tatiana commented Jan 29, 2026

@YourRoyalLinus, I had a quick look at the tests and tried to fix them in a separate branch (#2296), which is a draft PR so the CI can trigger the job. I'll close it once the checks in your branch are passing.

(1) Some integration tests were failing with:

    ERROR tests/test_example_dags.py - AssertionError: assert not {'/home/runner/work/astronomer-cosmos/astronomer-cosmos/dev/dags/cosmos_manifest_selectors_example.py': 'Traceback (mo...he required Object Storage feature is unavailable in Airflow version 2.7.0. Please upgrade to Airflow 2.8 or later.\n'}
     +  where {'/home/runner/work/astronomer-cosmos/astronomer-cosmos/dev/dags/cosmos_manifest_selectors_example.py': 'Traceback (mo...he required Object Storage feature is unavailable in Airflow version 2.7.0. Please upgrade to Airflow 2.8 or later.\n'} = <airflow.models.dagbag.DagBag object at 0x7fa3c6e724d0>.import_errors
    ERROR tests/test_example_dags_no_connections.py - AssertionError: assert not {'/home/runner/work/astronomer-cosmos/astronomer-cosmos/dev/dags/cosmos_manifest_selectors_example.py': 'Traceback (mo...he required Object Storage feature is unavailable in Airflow version 2.7.0. Please upgrade to Airflow 2.8 or later.\n'}
     +  where {'/home/runner/work/astronomer-cosmos/astronomer-cosmos/dev/dags/cosmos_manifest_selectors_example.py': 'Traceback (mo...he required Object Storage feature is unavailable in Airflow version 2.7.0. Please upgrade to Airflow 2.8 or later.\n'} = <airflow.models.dagbag.DagBag object at 0x7fa3be1d89d0>.import_errors

The reason is "Object Storage feature is unavailable in Airflow version 2.7.0. Please upgrade to Airflow 2.8 or later". This is how you can fix it: astronomer@424a9a1
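
A rough sketch of the kind of version guard that avoids this import error. The helper names and the idea of returning a list of DAG files to ignore are hypothetical and are not the change made in astronomer@424a9a1; only the 2.8 threshold comes from the error message above.

```python
# Hypothetical helpers for illustration; not part of the Cosmos test suite.
MIN_OBJECT_STORAGE_VERSION = (2, 8)  # threshold taken from the error message


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.7.0'."""
    major, minor, *_ = version.split(".")
    return int(major), int(minor)


def supports_object_storage(airflow_version: str) -> bool:
    """Return True when the given Airflow version is 2.8 or later."""
    return parse_major_minor(airflow_version) >= MIN_OBJECT_STORAGE_VERSION


def dags_to_ignore(airflow_version: str) -> list[str]:
    """List example DAG files that should not be loaded on this Airflow version."""
    ignored = []
    if not supports_object_storage(airflow_version):
        ignored.append("cosmos_manifest_selectors_example.py")
    return ignored
```

With this sketch, `dags_to_ignore("2.7.0")` lists the manifest selectors example, while Airflow 2.8 or later returns an empty list.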

(2) dbt Fusion tests stopped working because the default selector did not select any nodes in the jaffle_shop dbt project

Some dbt Fusion tests reference the dbt project jaffle_shop, which you changed. These tests actually run the DAG and check its topology, including which nodes were rendered.

Since we set the default selector to not match any nodes, this behaviour changed:

    =========================== short test summary info ============================
    FAILED tests/test_dbtf.py::test_dbt_dag_with_dbt_fusion - assert 0 == 23
     +  where 0 = len({})
     +    where {} = <cosmos.dbt.graph.DbtGraph object at 0x7f0b68bb80d0>.filtered_nodes
     +      where <cosmos.dbt.graph.DbtGraph object at 0x7f0b68bb80d0> = <DAG: snowflake_dbt_fusion_dag>.dbt_graph

My advice is that we do not add the selector changes to dev/dags/dbt/jaffle_shop, but instead to dev/dags/dbt/altered_jaffle_shop. Another suggestion is that we do not set the default selector to not match any nodes.
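
To illustrate why a default selector that matches nothing produces `assert 0 == 23`, here is a toy model of node filtering. It is not the `DbtGraph` implementation; the node names, the tag-based filter, and `hypothetical_tag` are all assumptions made for the example.

```python
from typing import Optional

# Toy node registry; the real project has more nodes and richer metadata.
NODES = {
    "model.jaffle_shop.customers": {"tags": []},
    "model.jaffle_shop.orders": {"tags": []},
    "model.jaffle_shop.stg_orders": {"tags": []},
}


def filter_nodes(nodes: dict, required_tag: Optional[str]) -> dict:
    """Return every node when no selector applies, else only nodes carrying the tag."""
    if required_tag is None:
        return dict(nodes)
    return {name: node for name, node in nodes.items() if required_tag in node["tags"]}


# No selector: every node is rendered.
all_nodes = filter_nodes(NODES, None)

# A default selector that matches no node: nothing is rendered.
default_selected = filter_nodes(NODES, "hypothetical_tag")
```

Any test that asserts on the number of rendered nodes fails as soon as such a default selector is applied implicitly.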

(3) The DAG cosmos_manifest_selectors_example.py failed to run in a clean database because it would attempt to run the dbt model customers without having run its upstream tasks. This can be observed in https://github.com/astronomer/astronomer-cosmos/actions/runs/21476188465/job/61860709231?pr=2296
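
A small sketch of the underlying problem: a selection that contains a model but not its parents cannot run against an empty database. The dependency map below is loosely modelled on jaffle_shop and its exact edges are an assumption; the function names are hypothetical.

```python
# Illustrative dependency map; the exact edges are assumed, not read from the project.
DEPENDS_ON = {
    "customers": ["stg_customers", "stg_orders", "stg_payments"],
    "orders": ["stg_orders", "stg_payments"],
    "stg_customers": [],
    "stg_orders": [],
    "stg_payments": [],
}


def with_ancestors(selected: set[str], depends_on: dict[str, list[str]]) -> set[str]:
    """Expand a selection to include every upstream node, similar to dbt's `+model`."""
    result: set[str] = set()
    stack = list(selected)
    while stack:
        node = stack.pop()
        if node in result:
            continue
        result.add(node)
        stack.extend(depends_on.get(node, []))
    return result


def missing_upstreams(selected: set[str], depends_on: dict[str, list[str]]) -> set[str]:
    """Return upstream nodes that the selection needs but does not include."""
    return with_ancestors(selected, depends_on) - set(selected)
```

Selecting only `customers` reports the three staging models as missing; selecting `customers` together with its ancestors reports none.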

(4) The DAG example_cosmos_cleanup_dag.py is not compatible with Airflow 3. I logged a follow-up ticket for us to check this: astronomer#2300. This issue was not introduced by your PR. For now, it is skipped in AF3.

I fixed all the issues in astronomer#2296. If you get a chance, please add these fixes to this PR - otherwise we may merge your PR as it is, and then merge the fixes.

tatiana and others added 14 commits January 29, 2026 09:30
…2271)

This PR adds support for dbt-loom, enabling Cosmos to work with
multi-project dbt architectures where downstream projects reference
models from upstream projects.

When using dbt-loom, downstream projects reference upstream models via
`{{ ref('upstream_project', 'model_name') }}`. dbt-loom injects these
external model references into the downstream project's namespace by
reading the upstream project's manifest.json.

Cosmos now automatically detects and skips external nodes (those without
original_file_path) during DAG generation, while still creating tasks
for the project's own models. This works for both:
- `LoadMode.DBT_LS` - parsing via dbt ls
- `LoadMode.DBT_MANIFEST` - parsing via manifest file
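
A minimal sketch of the skip rule described above, assuming manifest nodes are plain dictionaries. The function names and the sample node IDs are hypothetical, and this is not the Cosmos implementation.

```python
def is_external(node: dict) -> bool:
    """Treat a node as external when it has no (or an empty) `original_file_path`."""
    return not node.get("original_file_path")


def local_nodes(manifest_nodes: dict) -> dict:
    """Keep only the nodes that belong to the project being rendered."""
    return {uid: node for uid, node in manifest_nodes.items() if not is_external(node)}


# Hypothetical manifest excerpt: one local model and one injected upstream model.
manifest_nodes = {
    "model.downstream_finance.fct_revenue": {"original_file_path": "models/fct_revenue.sql"},
    "model.upstream_platform.int_orders": {"original_file_path": ""},
}
```

Only the local model survives the filter, so no task would be created for the upstream reference.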

The PR adds the example projects (in `dev/dags/dbt/`):
- `dbt_loom_upstream_platform/` - staging & intermediate models with seeds
- `dbt_loom_downstream_finance/` - finance fact tables referencing
  upstream models
- `dbt_loom_dags.py` - combined DAG with chained task groups

The PR also adds a comprehensive guide for multi-project setups in
`docs/configuration/multi-project.rst`


closes: #2107

---------

Co-authored-by: Tatiana Al-Chueyr <tatiana.alchueyr@gmail.com>
```
cosmos.exceptions.CosmosDbtRunError: dbt invocation completed with errors: orders: Database Error in model orders (models/orders.sql)
  relation "public.stg_orders" does not exist
  LINE 18:     select * from "***"."public"."stg_orders"
                             ^
  compiled code at target/run/altered_jaffle_shop/models/orders.sql
Task instance in failure state
Task start:None end:2026-01-29 11:36:10.637938+00:00 duration:None
Task:<Task(DbtRunLocalOperator): local_example.orders.run> dag:<DAG: cosmos_manifest_selectors_example> dagrun:<DagRun cosmos_manifest_selectors_example @ 2026-01-29 11:36:06.661130+00:00: manual__2026-01-29T11:36:06.661130+00:00, state:running, queued_at: None. externally triggered: False>
Failure caused by dbt invocation completed with errors: orders: Database Error in model orders (models/orders.sql)
  relation "public.stg_orders" does not exist
  LINE 18:     select * from "***"."public"."stg_orders"
                             ^
  compiled code at target/run/altered_jaffle_shop/models/orders.sql
[2026-01-29 11:36:10,667] {taskinstance.py:3157} INFO - Exporting env vars: AIRFLOW_CTX_DAG_OWNER='airflow' AIRFLOW_CTX_DAG_ID='cosmos_manifest_selectors_example' AIRFLOW_CTX_TASK_ID='local_example.customers.run' AIRFLOW_CTX_EXECUTION_DATE='2026-01-29T11:36:06.661130+00:00' AIRFLOW_CTX_TRY_NUMBER='1' AIRFLOW_CTX_DAG_RUN_ID='manual__2026-01-29T11:36:06.661130+00:00'
Task instance is in running state
 Previous state of the Task instance: queued
Current task name:local_example.customers.run state:scheduled start_date:None
Dag name:cosmos_manifest_selectors_example and current dag run status:running
```
https://github.com/astronomer/astronomer-cosmos/actions/runs/21476188465/job/61860709231?pr=2296
…output (#2287)

Between Cosmos 1.11 and Cosmos 1.12, the Cosmos `ExecutionMode.WATCHER`
changed its logging behaviour to always output JSON, which is a much
worse experience for end-users.

We had a contribution from @tiovader to fix this issue (#2241), and we
will release it in Cosmos 1.13, planned for this week.

To avoid regressions in how Cosmos outputs user-friendly logs, this PR
extends our integration tests to cover this behaviour and further
improves the implementation.
Improve watcher documentation based on issues and questions observed
while supporting users.
@tatiana tatiana changed the title from "Fix tests that are not passing in https://github.com/astronomer/astronomer-cosmos/pull/2261" to "Fix tests that are not passing in #2261" on Jan 29, 2026
tatiana and others added 7 commits January 29, 2026 14:33
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Add support for the use of yaml selectors when loading dbt projects
using `LoadMode.DBT_MANIFEST`. This works by parsing the YAML selectors
found in the manifest file that have already been processed by the dbt
parser into corresponding select and exclude selections that are passed
onto the `select_nodes()` function.
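
As a rough illustration of that conversion, the sketch below maps a simplified selector definition to `select` and `exclude` lists. It is not the parser added in this PR: the function names are hypothetical, the definition shape is assumed, and it only handles a single `method`/`value` leaf, a `union` with an optional `exclude` entry, and an `intersection` of leaves.

```python
def leaf_to_selector(definition: dict) -> str:
    """Render a fully structured leaf as a `method:value` string."""
    return f"{definition['method']}:{definition['value']}"


def parse_definition(definition: dict) -> tuple[list[str], list[str]]:
    """Convert a simplified selector definition into (select, exclude) lists."""
    select: list[str] = []
    exclude: list[str] = []
    if "method" in definition:
        select.append(leaf_to_selector(definition))
    elif "union" in definition:
        for item in definition["union"]:
            if "exclude" in item:
                exclude.extend(leaf_to_selector(leaf) for leaf in item["exclude"])
            else:
                select.append(leaf_to_selector(item))
    elif "intersection" in definition:
        # An intersection is expressed as one comma-separated select entry.
        select.append(",".join(leaf_to_selector(item) for item in definition["intersection"]))
    else:
        raise ValueError(f"Unsupported selector definition: {definition}")
    return select, exclude


# Hypothetical definition with two selected groups and one exclusion.
nightly = {
    "union": [
        {"method": "tag", "value": "nightly"},
        {"method": "path", "value": "models/staging"},
        {"exclude": [{"method": "tag", "value": "deprecated"}]},
    ]
}
select, exclude = parse_definition(nightly)
```

Here `select` holds the two `method:value` entries and `exclude` holds the deprecated tag, which is the shape a select/exclude-based node filter expects.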

The performance overhead for the initial parse will depend on the user's
`selector.yaml` definition, but in testing it has been reasonable. The
parsing penalty is only paid once, as future selector to select/exclude
mapping access is done from the cache, if enabled (similar to
`dbt_ls_cache`).

#### Key Changes 
- Implemented a parser to convert full YAML selector definitions into
corresponding `select` and `exclude` selections
- Is only used for graphs with `RenderConfig.load_method = DBT_MANIFEST`
and a defined `RenderConfig.selector`
- Implemented cache behavior to store and retrieve the parsed YAML
selector definitions
- Added a new `enable_cache_yaml_selectors` cosmos setting to
enable/disable caching
  - Invalidates if the `YamlSelector` implementation changes
- Ensures users do not have to manually clear the cache if the spec is
changed
- Renamed `dbt_ls_cache` to be more general so it can be used to store
parsed selector yaml definitions
- Cache should always be mutually exclusive for the `dbt_ls` output or
parsed selector yaml
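
One possible way to get that invalidation behaviour, sketched under assumptions: the cache key combines a hash of the selector definitions with a marker for the parser implementation. `PARSER_VERSION` and `selectors_cache_key` are hypothetical names, and how this PR actually detects implementation changes is not shown here.

```python
import hashlib
import json

# Hypothetical marker; it would need to change whenever the parsing logic changes.
PARSER_VERSION = "1"


def selectors_cache_key(selectors: dict, parser_version: str = PARSER_VERSION) -> str:
    """Build a key that changes when either the selectors or the parser change."""
    # sort_keys makes the key independent of dictionary ordering.
    payload = json.dumps(selectors, sort_keys=True)
    return hashlib.sha256(f"{parser_version}:{payload}".encode("utf-8")).hexdigest()
```

Because the key changes whenever the definitions or the marker change, stale entries are simply never read again, so users would not need to clear the cache by hand.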

#### Limitations
- I did not implement parser support for the `indirect_selection` or
`default` keywords.
- It seems plausible that both of these can be implemented if desired.
Omitting them was done to limit scope.
- This is not a full YAML selector parser, akin to what dbt implements. 
- While much of the implementation is borrowed from the latest version
of `dbt core`, this expects the selector definitions to be in the state
that exists in an unmodified manifest file.
- This approach allows the Cosmos parser to handle only fully
structured `method-value` selection definitions.
- I kept many high-level parser exceptions in the Cosmos implementation
that are technically redundant with the dbt parser
- This was done to try and catch user-modified manifest files to fail
fast with helpful error messages.
- Modifying the selector definitions found in the manifest file is
considered undefined behavior.

These limitations are reflected in the documentation.

## Related Issue(s)

Closes #2257 

## Breaking Change?

- I reused the existing dbt_ls cache key and functionality,
generalizing where possible
- **This is an implementation detail and should not break the public API
nor alter any existing behavior**
- This requires that a dbt_ls cache and a yaml_selectors cache cannot
exist at the same time
- This is technically possible by setting the
`DbtGraph.cache_identifier` to be the same, but this can't be configured
by a user.

These are reflected in the documentation.
tatiana added a commit to astronomer/astronomer-cosmos that referenced this pull request Jan 29, 2026
Minimal changes on #2261 to see tests passing without changing the
original branch

I also created a PR on the @YourRoyalLinus PR:
YourRoyalLinus#1

This is a summary of the main issues:


(1) Some integration tests were failing with:
```
    ERROR tests/test_example_dags.py - AssertionError: assert not {'/home/runner/work/astronomer-cosmos/astronomer-cosmos/dev/dags/cosmos_manifest_selectors_example.py': 'Traceback (mo...he required Object Storage feature is unavailable in Airflow version 2.7.0. Please upgrade to Airflow 2.8 or later.\n'}
     +  where {'/home/runner/work/astronomer-cosmos/astronomer-cosmos/dev/dags/cosmos_manifest_selectors_example.py': 'Traceback (mo...he required Object Storage feature is unavailable in Airflow version 2.7.0. Please upgrade to Airflow 2.8 or later.\n'} = <airflow.models.dagbag.DagBag object at 0x7fa3c6e724d0>.import_errors
    ERROR tests/test_example_dags_no_connections.py - AssertionError: assert not {'/home/runner/work/astronomer-cosmos/astronomer-cosmos/dev/dags/cosmos_manifest_selectors_example.py': 'Traceback (mo...he required Object Storage feature is unavailable in Airflow version 2.7.0. Please upgrade to Airflow 2.8 or later.\n'}
     +  where {'/home/runner/work/astronomer-cosmos/astronomer-cosmos/dev/dags/cosmos_manifest_selectors_example.py': 'Traceback (mo...he required Object Storage feature is unavailable in Airflow version 2.7.0. Please upgrade to Airflow 2.8 or later.\n'} = <airflow.models.dagbag.DagBag object at 0x7fa3be1d89d0>.import_errors
```

The reason is "Object Storage feature is unavailable in Airflow version
2.7.0. Please upgrade to Airflow 2.8 or later". This is how you can fix
it:
424a9a1

(2) dbt Fusion tests stopped working because the YAML selector file
defined a default selector, which did not select any nodes in the
`jaffle_shop` dbt project.

Some dbt Fusion tests reference the dbt project `jaffle_shop`, which
was changed. These tests actually run the DAG and check the DAG
topology, including which nodes were rendered.

Since we set the default selector to not match any nodes, this behaviour
changed:
```
=========================== short test summary info ============================
FAILED tests/test_dbtf.py::test_dbt_dag_with_dbt_fusion - assert 0 == 23
 +  where 0 = len({})
 +    where {} = <cosmos.dbt.graph.DbtGraph object at 0x7f0b68bb80d0>.filtered_nodes
 +      where <cosmos.dbt.graph.DbtGraph object at 0x7f0b68bb80d0> = <DAG: snowflake_dbt_fusion_dag>.dbt_graph
```

I moved the changes originally done to `dev/dags/dbt/jaffle_shop` to
`dev/dags/dbt/altered_jaffle_shop`. I also removed the default selector
definition.

(3) The DAG `cosmos_manifest_selectors_example.py` failed to run in a
clean database because it would attempt to run the dbt model
`customers` without having run its upstream tasks. This can be observed
in
https://github.com/astronomer/astronomer-cosmos/actions/runs/21476188465/job/61860709231?pr=2296

(4) The DAG `example_cosmos_cleanup_dag.py` is not compatible with
Airflow 3. I logged a follow-up ticket for us to check this:
#2300. This issue
was not introduced by your PR. For now, it is skipped in AF3.
govambam added a commit to govambam/astronomer-cosmos that referenced this pull request Feb 10, 2026
@github-actions

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions Bot added the stale label Apr 19, 2026
@github-actions

This PR was closed because it has been inactive for 10 days since being marked as stale.

@github-actions github-actions Bot closed this Apr 29, 2026
3 participants