Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add model_sql helper to enable isolated unit testing #3476

Closed
wants to merge 1 commit into from

Conversation

Zatte
Copy link

@Zatte Zatte commented Jun 18, 2021

Improves Unit testing capabilities.

Enabler for: #2354
Discussed here: https://discourse.getdbt.com/t/testing-with-fixed-data-set/564/8

Related: https://discourse.getdbt.com/t/state-of-testing-in-dbt/1778

Description

It didn't seem to be a good(relatively speaking) method for testing models in isolation. The best was introducing branching conditions in the model based on global settings which at most allowed for 1 test / model.

This PR introduces a new helper-method model_sql() (the name is debatable, in previous discussions it was called cte) which returns the compiled SQL code of a model but under a different context (allows one to overrides var/ref/source ). Models can be tested without changes outside test code and multiple tests can be build for the same model.

I think think approach is a cleaner approach to (unit) testing than what has been proposed so far.

  1. Pure dbt (Jinja/SQL)
  2. No need for custom macros.
  3. No additional plugins (if this pr gets merged)
  4. “Full" control per test
  5. No changes/branching/flags required in production models (!)
  6. Supports all data sizes.

Example use case:

Model
models/addder.sql

    SELECT 
      *,
      a+b + {{ var ("base", "0") }} AS sum
    FROM {{ ref('source_data') }}

Test
tests/adder_0001.sql

{% set model_to_test = model_sql(
    'model.adder',  {
      'var': {
        'base': '1'
      },
      'ref': {
        'source_data': 'mocked_source_data'
      }})
%}

WITH
mocked_source_data AS (
  SELECT
    *
  FROM
  -- bigquery construct; same effect as union all select ....
  UNNEST([
     struct(1 as a,  2 as b,  4 as expectedSum),
     struct(2 as a,  2 as b,  5 as expectedSum),
     struct(0 as a,  0 as b,  1 as expectedSum),
     struct(1 as a, -2 as b, 0 as expectedSum)
   ])
),

final AS (
  {{ model_to_test }}
)

-- Some SQL assertion based on the specific mocked data.
SELECT * FROM final WHERE sum != expectedSum

_I haven't been writing python for some 6 years and and this is the first time interacting the the DBT core, as such the PR is not yet complete. I would like to get some feedback before investing more time into this. No point in "polishing a turd".

If the maintainer(s) think it holds value i don't mind ticking all the boxes below (or hand it over to someone else)_

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

@cla-bot
Copy link

cla-bot bot commented Jun 18, 2021

Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA.

In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above above. If you have questions about the CLA, or if you believe you've received this message in error, don't hesitate to ping @drewbanin.

CLA has not been signed by users: @Zatte

@Zatte Zatte temporarily deployed to Bigquery June 18, 2021 15:23 Inactive
@Zatte Zatte temporarily deployed to Bigquery June 18, 2021 15:23 Inactive
@Zatte Zatte temporarily deployed to Postgres June 18, 2021 15:23 Inactive
@Zatte Zatte temporarily deployed to Redshift June 18, 2021 15:23 Inactive
@Zatte Zatte temporarily deployed to Redshift June 18, 2021 15:23 Inactive
@Zatte Zatte temporarily deployed to Snowflake June 18, 2021 15:23 Inactive
@Zatte Zatte temporarily deployed to Snowflake June 18, 2021 15:23 Inactive
@Zatte
Copy link
Author

Zatte commented Aug 23, 2021

@drewbanin I did sign the CLA some time ago but the PR seems stuck in this state. Anything you can help with?

@jtcohen6
Copy link
Contributor

@cla-bot check

@cla-bot
Copy link

cla-bot bot commented Aug 23, 2021

The cla-bot has been summoned, and re-checked this pull request!

@cla-bot cla-bot bot added the cla:yes label Aug 23, 2021
@jtcohen6
Copy link
Contributor

@Zatte Apologies, the delay here is completely on me. I'm playing catch-up this week; I'm going to do my best to take a detailed look and give you feedback soon.

@Zatte
Copy link
Author

Zatte commented Sep 22, 2021

@jtcohen6 : Just a monthly ping, hoping this is not forgotten until the end of times :)

@github-actions
Copy link
Contributor

This PR has been marked as Stale because it has been open for 180 days with no activity. If you would like the PR to remain open, please remove the stale label or comment on the PR, or it will be closed in 7 days.

@github-actions github-actions bot added the stale Issues that have gone stale label Apr 12, 2022
@github-actions github-actions bot closed this Apr 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla:yes ok_to_test stale Issues that have gone stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants