
Expect Failure Mode for Tests #2982

Closed
zemekeneng opened this issue Dec 28, 2020 · 8 comments
Labels: dbt tests, enhancement

Comments

@zemekeneng

zemekeneng commented Dec 28, 2020

Describe the feature

An expect flag for tests that causes the test to pass when it would ordinarily fail, and vice versa.

version: 2
models:
  - name: my_model
    columns:
      - name: not_unique_col 
        tests:
          - my_test:
              expect: fail
              # passes only when my_test would ordinarily fail
          - my_passing_test:
              expect: pass
              # (default) runs my_passing_test in the conventional way, so this would be a non-breaking change
          - unique:
              expect: "=1"
              # passes only when a single duplicate row is found

This would accomplish 2 things:

  1. Satisfy the need for failure scenarios when unit testing custom tests in packages like dbt-utils. Currently, test coverage for the tests in dbt-utils only contains cases where the tests pass. We are assuming that these generic tests would fail given a failing dataset, but we are not verifying that through tests. With the expect: fail flag, test coverage for macros of all kinds could be dramatically improved.
  2. Effectively double the number of tests that are already written for testing models. While this is not the main benefit, it is worth noting.

Describe alternatives you've considered

One alternative is to handle test failures in the CI pipeline. Using tags like expect_fail, we could exclude these tests from the standard test suite, and run them in a separate pipeline that expects the tests to fail.
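As a rough sketch of that split, assuming the inverted tests carry a hypothetical expect_fail tag and the CI system treats a non-zero exit code from a step as failure (the step names and structure below are illustrative, not a dbt feature):

# hypothetical CI job definition
steps:
  - name: standard test suite
    run: dbt test --exclude tag:expect_fail
  - name: expected-failure suite
    # invert the exit code: this step fails if the tagged tests pass
    run: "! dbt test --select tag:expect_fail"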

Another alternative is to have some type of test_fail macro, that would consume any other test, run its sql as an ephemeral model wrapped in

select
    case when (select test_result from {{ model }}) > 0
         then 0
         else 1
    end

I am not sure there is a clean way to pass arguments between the generic test_fail and the specific test to be failed.
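For concreteness, a minimal sketch of such a wrapper, assuming the inner test's rendered SQL could somehow be handed in as a string and that it returns a single count column named test_result (both assumptions, and exactly the awkward argument-passing described above):

{% macro test_fail(model, inner_test_sql) %}
    -- purely illustrative: invert a count-style schema test by wrapping its SQL
    with inner_test as (
        {{ inner_test_sql }}
    )
    select
        case when (select test_result from inner_test) > 0
             then 0   -- the inner test found problems, so the expect-fail test passes
             else 1   -- the inner test found nothing, so the expect-fail test fails
        end as test_result
{% endmacro %}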

Finally, we could rewrite all tests as macros that produce tables and make a pass version and a fail version.

Who will this benefit?

Everyone who uses shared tooling, through improved test coverage. That is probably most dbt users.

Are you interested in contributing this feature?

Yes. If we can settle on an interface, I would be excited to work on this. @clrcrl tells me that you have plans to decouple the test dataset from the count(*) wrapper. That might be a good opportunity to add the expect flag to the test config, defaulting to expect: "=0" while also allowing expressions like "=1" or ">0", perhaps with pass and fail accepted as synonyms for "=0" and ">0".

@zemekeneng zemekeneng added enhancement New feature or request triage labels Dec 28, 2020
@jtcohen6 jtcohen6 added dbt tests Issues related to built-in dbt testing functionality and removed triage labels Jan 4, 2021
@jtcohen6
Contributor

jtcohen6 commented Jan 4, 2021

@zemekeneng This is really interesting! Thanks for the detailed writeup. We've spent years building the instincts around "no rows / 0 count means pass, yes rows / >0 count means fail," "select/count all the records you don't want." While it's now an acquired taste, and one I've come to like, I don't think it's radically more intuitive than "pass and fail when I say so."

Currently, test coverage for tests in dbt-utils only contains cases where tests pass. We are just assuming that these generic test would fail given a failing dataset, but we are not verifying this to be the case through tests.

This is a very good point. By writing tests with an eye only toward them passing—a major achievement, to have data be exactly the way we hope—we're still missing an important piece of the software / unit testing puzzle.

Future

Most directly, this issue feels kin to #2219, which proposes adapting the warn_after + error_after specifications from source freshness for use by schema + data tests, in lieu of the binary severity: warn|error.

Your explanation above has helped me realize that warn_after and error_after may still be too constraining, since they operate with the built-in assumption (true for stale source data) that bigger / later is worse.

Instead, I like your proposal for simple boolean expressions using =, <, >. Instead of specifying when the test should pass (expect), though, I'd prefer to keep the specification around when and how the test shouldn't pass.

What do you think about a spec like this:

version: 2
models:
  - name: my_model
    columns:
      - name: not_unique_col 
        tests:
          - test_one:
              warn: ">0"     # cf. `severity: warn`, `warn_after: 0`
          - test_two:
              error: "<50%"  # raise an error if majority of records *are not* returned by the test query
          - test_three:
              warn: "!=0"
              error: "!=0"       # `error` always takes precedence over `warn`

By default, tests have {warn: none, error: "!=0"}.

Current

Another alternative is to have some type of test_fail macro, that would consume any other test, run its sql as an ephemeral model wrapped in.

This is the direction I was thinking in, since it's possible today and would mostly achieve the desired behavior. I imagine your config would look something like:

models:
  - name: my_model
    columns:
      - name: not_unique_col
        tests:
          - test_fail:
              - test_macro: test_unique
                expect: "=1"

It would require some tricky business to render the test macro from the context, but it could be hacked. Much trickier would be configuring the severity based on the test result.

I'm going to tag this for consideration in v0.20.0 ("The Test Release"); I think it neatly glosses #2219.

@jtcohen6 jtcohen6 added this to the Oh-Twenty [TBD] milestone Jan 4, 2021
@MarkMacArdle

This is something I really want too, but I'd like to extend the scope to also include compile errors. If this should be raised as a separate issue, please let me know. This would be a bit different from the original issue, as you'd be checking for a compile error rather than a database error.

I wanted this feature when writing schema tests that raise compile errors to validate inputs (like this in dbt-utils). Currently I manually check that the compile errors are raised when they should be, but I don't see a way of creating a test to do that.

Something like below would be useful:

version: 2
models:
  - name: my_model
    columns:
      - name: not_unique_col 
        tests:
          - my_test:
              expect: compile_error

@jtcohen6
Contributor

jtcohen6 commented Feb 9, 2021

Hey @MarkMacArdle, what's the context in which you'd want a test to raise a compilation error? Would it be for integration-testing the my_test schema test definition in the package/project that defines it, to make sure that it raises the error given an invalid argument? If so, I think that should be accomplished with a little bit of python/bash wrapping, such as we do for dbt and dbt-utils integration tests.

This does feel a bit different to me from the requested feature, which proposes an expected/thresholded failure mode only when the test SQL succeeds and returns results. I don't see a config like expect/error/warn affecting the behavior of a dbt test that raises a compilation error or returns a database error because of malformed SQL. The latest version of run_results.json distinguishes between these scenarios by calling the former error, and the latter runtime error.

@MarkMacArdle

Thanks for the input @jtcohen6, I've moved this to its own issue now (#3060) and added some more detail on my situation.

@jtcohen6
Contributor

Closing in favor of the concrete proposal for warn_if, error_if test configs over in #3258. That's where we'll make it happen.
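The proposal there keeps the threshold-expression idea discussed above. Under it, a config might look roughly like this (thresholds arbitrary, and the exact syntax is #3258's to settle):

version: 2
models:
  - name: my_model
    columns:
      - name: not_unique_col
        tests:
          - unique:
              config:
                severity: error
                warn_if: ">0"
                error_if: ">100"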

@mike-weinberg

Hi @jtcohen6!

I'd like to reopen this.

In traditional software engineering, one might have unit tests that specifically expect failure. This ensures that known-good code throws specific errors on known-bad data: for example, if your code defines input conditions for which it throws value errors, a unit test might pass only if that value error is thrown for an appropriately "bad" input.

prior art:

In DBT, we can only test via dbt tests, and so for me the most obvious way to have something akin to unit tests is if we can use something like a seed file with known-bad data, test on it, and, conditioned on the target/environment, expect a failure. I imagine this running in CI/CD.

I'm not sure what this needs to look like, since in a way we'd want to be able to associate specific targets/environments with specific failures. So there might be a need to support context awareness via, e.g., Jinja, and the expect subproperty might take some sort of JSON describing when a test should pass or fail. Perhaps this should live separately, so that tests can be defined in the positive sense and a regression-test spec makes the association between specific tests, specific targets/environments, and specific expectations of pass/failure.
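Purely as a strawman of that separate spec (nothing below exists in dbt; every key and name is hypothetical):

# hypothetical regression-test spec, kept apart from the positively-defined tests
regression_expectations:
  - test: unique_my_model_not_unique_col
    targets:
      ci_bad_seed: fail   # target seeded with known-bad data, so the test should fail
      prod: pass          # everywhere else, the test should pass as usual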

@jtcohen6
Contributor

jtcohen6 commented Dec 8, 2023

In DBT, we can only test via dbt tests, and so for me the most obvious way to have something akin to unit tests is if we can use something like a seed file with known-bad data, test on it, and, conditioned on the target/environment, expect a failure. I imagine this running in CI/CD.

@mike-weinberg Have you had a chance to read one of the coolest GitHub discussions from the past few months? Better yet: this work is in flight! We're planning to have it in beta a few months from now.

@mike-weinberg

AMAZE. That will work for us, thanks!
