Skip to content

Add tests to sources, snapshots and seeds when using TestBehavior.AFTER_EACH#599

Merged
tatiana merged 4 commits into
mainfrom
issue-474
Oct 18, 2023
Merged

Add tests to sources, snapshots and seeds when using TestBehavior.AFTER_EACH#599
tatiana merged 4 commits into
mainfrom
issue-474

Conversation

@tatiana
Copy link
Copy Markdown
Collaborator

@tatiana tatiana commented Oct 13, 2023

Previously Cosmos would only create a task group when using TestBehavior.AFTER_EACH for nodes of the type DbtResourceType.MODEL. This change adds the same behavior to snapshots and seeds.

For this to work as expected with sources, we would need to create a default operator to handle DbtResourceType.SOURCE, which is outside the scope of the current ticket. Once this operator exists, sources will also lead to creating a task group.

All the test selectors were tested successfully with dbt 1.6.

This screenshot illustrates the validation of this feature, with an adapted version of Jaffle Shop:
Screenshot 2023-10-13 at 19 43 52

The modifications that were done to jaffle_shop were:

Appended the following lines to dev/dags/dbt/jaffle_shop/models/schema.yml:

seeds:
  - name: raw_customers
    description: Raw data from customers
    columns:
      - name: id
        tests:
          - unique
          - not_null

snapshots:
  - name: orders_snapshot
    description: Snapshot of orders
    columns:
      - name: orders_snapshot.order_id
        tests:
          - unique
          - not_null

And created the file dev/dags/dbt/jaffle_shop/snapshots/orders_snapshot.sql with:

{% snapshot orders_snapshot %}

{{
    config(
      target_database='postgres',
      target_schema='public',
      unique_key='order_id',

      strategy='timestamp',
      updated_at='order_date',
    )
}}

select * from {{ ref('jaffle_shop', 'orders') }}

{% endsnapshot %}

Closes: #474

@sweep-ai-deprecated
Copy link
Copy Markdown

Apply Sweep Rules to your PR?

  • Apply: Leftover TODOs in the code should be handled.
  • Apply: All new business logic should have corresponding unit tests in the tests/ directory.
  • Apply: Any clearly inefficient or repeated code should be optimized or refactored.

@netlify
Copy link
Copy Markdown

netlify Bot commented Oct 13, 2023

👷 Deploy Preview for amazing-pothos-a3bca0 processing.

Name Link
🔨 Latest commit ec4ad48
🔍 Latest deploy log https://app.netlify.com/sites/amazing-pothos-a3bca0/deploys/652efff32f4a9f0008ae4829

@tatiana tatiana temporarily deployed to internal October 13, 2023 18:51 — with GitHub Actions Inactive
@tatiana tatiana temporarily deployed to internal October 13, 2023 18:55 — with GitHub Actions Inactive
@tatiana tatiana temporarily deployed to internal October 13, 2023 18:57 — with GitHub Actions Inactive
@tatiana tatiana marked this pull request as ready for review October 13, 2023 19:09
@tatiana tatiana requested a review from a team as a code owner October 13, 2023 19:09
@tatiana tatiana requested a review from a team October 13, 2023 19:09
@tatiana tatiana added this to the 1.3.0 milestone Oct 13, 2023
@tatiana tatiana changed the title Add tests after sources, snapshots and seeds Add tests to sources, snapshots and seeds when using TestBehavior.AFTER_EACH Oct 13, 2023
@tatiana tatiana temporarily deployed to internal October 13, 2023 19:52 — with GitHub Actions Inactive
@codecov
Copy link
Copy Markdown

codecov Bot commented Oct 13, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (a433f15) 93.13% compared to head (ec4ad48) 93.15%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #599      +/-   ##
==========================================
+ Coverage   93.13%   93.15%   +0.02%     
==========================================
  Files          51       51              
  Lines        2053     2059       +6     
==========================================
+ Hits         1912     1918       +6     
  Misses        141      141              
Files Coverage Δ
cosmos/airflow/graph.py 100.00% <100.00%> (ø)
cosmos/constants.py 100.00% <100.00%> (ø)

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@tatiana tatiana temporarily deployed to internal October 13, 2023 22:26 — with GitHub Actions Inactive
@tatiana tatiana temporarily deployed to internal October 17, 2023 09:06 — with GitHub Actions Inactive
@tatiana tatiana temporarily deployed to internal October 17, 2023 10:09 — with GitHub Actions Inactive
@tatiana tatiana temporarily deployed to internal October 17, 2023 12:36 — with GitHub Actions Inactive
@tatiana tatiana temporarily deployed to internal October 17, 2023 20:51 — with GitHub Actions Inactive
@tatiana tatiana temporarily deployed to internal October 17, 2023 21:43 — with GitHub Actions Inactive
Copy link
Copy Markdown
Contributor

@harels harels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Comment thread cosmos/airflow/graph.py
if node.resource_type == DbtResourceType.MODEL:
task_args["models"] = node.name
elif node.resource_type == DbtResourceType.SOURCE:
task_args["select"] = f"source:{node.unique_id[len('source.'):]}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have a test case that's covering this part?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@harels, we have a unit that covers the path formation for all these variations:
test_create_test_task_metadata, and I manually played with dbt 1.6 to confirm all these variations work.

ATM, we don't have an integration test for DbtResourceType.SOURCE since our only example DAG that validates this type of node overrides any default behavior. For us to have an end-to-end version of the code going through this, Cosmos would need a default behavior for SOURCE nodes, something we don't currently have.

@harels @jlaneve Should we add a ticket to define a default behavior for Source nodes? What do you think that should look like? Should it check for freshness?

@tatiana tatiana merged commit e09ac6d into main Oct 18, 2023
@tatiana tatiana deleted the issue-474 branch October 18, 2023 10:32
@tatiana
Copy link
Copy Markdown
Collaborator Author

tatiana commented Oct 18, 2023

I'm merging since the PR was approved, and the open discussion should only be closed once we decide whether to have a follow-up ticket.

tatiana added a commit that referenced this pull request Oct 25, 2023
…ER_EACH (#599)

Previously Cosmos would only create a task group when using
`TestBehavior.AFTER_EACH` for nodes of the type `DbtResourceType.MODEL`.
This change adds the same behavior to snapshots and seeds.

For this to work as expected with sources, we would need to create a
default operator to handle `DbtResourceType.SOURCE`, which is outside
the scope of the current ticket. Once this operator exists, sources will
also lead to creating a task group.

All the test selectors were tested successfully with dbt 1.6.

This screenshot illustrates the validation of this feature, with an
adapted version of Jaffle Shop:
<img width="1502" alt="Screenshot 2023-10-13 at 19 43 52"
src="https://github.com/astronomer/astronomer-cosmos/assets/272048/893ac557-1846-4ed1-b5e2-913a7bd38485">

The modifications that were done to jaffle_shop were:

Appended the following lines to
`dev/dags/dbt/jaffle_shop/models/schema.yml`:
```
seeds:
  - name: raw_customers
    description: Raw data from customers
    columns:
      - name: id
        tests:
          - unique
          - not_null

snapshots:
  - name: orders_snapshot
    description: Snapshot of orders
    columns:
      - name: orders_snapshot.order_id
        tests:
          - unique
          - not_null
```

And created the file
`dev/dags/dbt/jaffle_shop/snapshots/orders_snapshot.sql` with:
```
{% snapshot orders_snapshot %}

{{
    config(
      target_database='postgres',
      target_schema='public',
      unique_key='order_id',

      strategy='timestamp',
      updated_at='order_date',
    )
}}

select * from {{ ref('jaffle_shop', 'orders') }}

{% endsnapshot %}
```

Closes: #474
(cherry picked from commit e09ac6d)
tatiana added a commit that referenced this pull request Oct 25, 2023
Bug fixes

* Resolve errors occurring when `dbt_project_path` is str and partial support `dbt_project_path=None` by @MrBones757 in #605
* Fix running dbt tests that depend on multiple models (support `--indirect-selection` buildable) by @david-mag in #613
* Add tests to sources, snapshots and seeds when using `TestBehavior.AFTER_EACH` by @tatiana in #599
* Fix custom selector when select has a subset of tags of the models' tags by @david-mag in #606

Others

* Update contributing guide docs by @raffifu in #591
* Fix running test that validates manifest-based DAGs by @tatiana in #619
* pre-commit updates in #604 and #621
tatiana added a commit that referenced this pull request Oct 25, 2023
Bug fixes

* Resolve errors occurring when `dbt_project_path` is str and partial support `dbt_project_path=None` by @MrBones757 in #605
* Fix running dbt tests that depend on multiple models (support `--indirect-selection` buildable) by @david-mag in #613
* Add tests to sources, snapshots and seeds when using `TestBehavior.AFTER_EACH` by @tatiana in #599
* Fix custom selector when select has a subset of tags of the models' tags by @david-mag in #606

Others

* Update contributing guide docs by @raffifu in #591
* Fix running test that validates manifest-based DAGs by @tatiana in #619
* pre-commit updates in #604 and #621

(cherry picked from commit 52c34a2)
tatiana added a commit that referenced this pull request Oct 25, 2023
Bug fixes

* Resolve errors occurring when `dbt_project_path` is str and partial support `dbt_project_path=None` by @MrBones757 in #605
* Fix running dbt tests that depend on multiple models (support `--indirect-selection` buildable) by @david-mag in #613
* Add tests to sources, snapshots and seeds when using `TestBehavior.AFTER_EACH` by @tatiana in #599
* Fix custom selector when select has a subset of tags of the models' tags by @david-mag in #606
* Fix `LoadMode.AUTOMATIC` behaviour to use `LoadMode.DBT_LS` when `ProfileMapping` is used by @tatiana in #625
* Fix failure if `openlineage-common` raises a jinja exception by @tatiana in #626

Others

* Update contributing guide docs by @raffifu in #591
* Remove unnecessary stack trace from Cosmos initialization by @tatiana in #624
* Fix running test that validates manifest-based DAGs by @tatiana in #619
* pre-commit updates in #604 and #621
tatiana added a commit that referenced this pull request Oct 25, 2023
Bug fixes

* Resolve errors occurring when `dbt_project_path` is str and partial support `dbt_project_path=None` by @MrBones757 in #605
* Fix running dbt tests that depend on multiple models (support `--indirect-selection` buildable) by @david-mag in #613
* Add tests to sources, snapshots and seeds when using `TestBehavior.AFTER_EACH` by @tatiana in #599
* Fix custom selector when select has a subset of tags of the models' tags by @david-mag in #606
* Fix `LoadMode.AUTOMATIC` behaviour to use `LoadMode.DBT_LS` when `ProfileMapping` is used by @tatiana in #625
* Fix failure if `openlineage-common` raises a jinja exception by @tatiana in #626

Others

* Update contributing guide docs by @raffifu in #591
* Remove unnecessary stack trace from Cosmos initialization by @tatiana in #624
* Fix running test that validates manifest-based DAGs by @tatiana in #619
* pre-commit updates in #604 and #621

(cherry picked from commit 635fb7b)
@tatiana tatiana mentioned this pull request Oct 25, 2023
tatiana added a commit that referenced this pull request Oct 25, 2023
Bug fixes

* Resolve errors occurring when `dbt_project_path` is str and partial
support `dbt_project_path=None` by @MrBones757 in #605
* Fix running dbt tests that depend on multiple models (support
`--indirect-selection` buildable) by @david-mag in #613
* Add tests to sources, snapshots and seeds when using
`TestBehavior.AFTER_EACH` by @tatiana in #599
* Fix custom selector when select has a subset of tags of the models'
tags by @david-mag in #606
* Fix `LoadMode.AUTOMATIC` behaviour to use `LoadMode.DBT_LS` when
`ProfileMapping` is used by @tatiana in #625
* Fix failure if `openlineage-common` raises a jinja exception by
@tatiana in #626

Others

* Update contributing guide docs by @raffifu in #591
* Remove unnecessary stack trace from Cosmos initialization by @tatiana
in #624
* Fix running test that validates manifest-based DAGs by @tatiana in
#619
* pre-commit updates in #604 and #621

(cherry picked from commit 635fb7b)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add tests after snapshots

2 participants