
Expect Failure Mode for Tests #2982

Closed
zemekeneng opened this issue Dec 28, 2020 · 8 comments
Labels: dbt tests, enhancement

Comments

@zemekeneng

zemekeneng commented Dec 28, 2020

Describe the feature

An expect flag for tests that causes the test to pass when it would ordinarily fail, and vice versa.

version: 2
models:
  - name: my_model
    columns:
      - name: not_unique_col 
        tests:
          - my_test:
              expect: fail
              # passes only when my_test would ordinarily fail
          - my_passing_test:
              expect: pass
              # (default) runs my_passing_test in the conventional way, so this would be a non-breaking change
          - unique:
              expect: "=1"
              # passes only when a single duplicate row is found

This would accomplish 2 things:

  1. Satisfy the need for failure scenarios when unit testing custom tests in packages like dbt-utils. Currently, test coverage for the tests in dbt-utils only contains cases where the tests pass. We are assuming that these generic tests would fail given a failing dataset, but we are not verifying that through tests. With the expect: fail flag, test coverage for macros of all kinds could be dramatically improved.
  2. Effectively double the number of tests that are already written for testing models. While this is not the main benefit, it is worth noting.

Describe alternatives you've considered

One alternative is to handle test failures in the CI pipeline. Using tags like expect_fail, we could exclude these tests from the standard test suite, and run them in a separate pipeline that expects the tests to fail.
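As a rough sketch of that split, assuming the inverted tests carry a hypothetical expect_fail tag and the CI system treats a non-zero exit code from a step as failure (the step names and structure below are illustrative, not a dbt feature):

# hypothetical CI job definition
steps:
  - name: standard test suite
    run: dbt test --exclude tag:expect_fail
  - name: expected-failure suite
    # invert the exit code: this step fails if the tagged tests pass
    run: "! dbt test --select tag:expect_fail"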

Another alternative is to have some type of test_fail macro, that would consume any other test, run its sql as an ephemeral model wrapped in

select
    case when (select test_result from {{ model }}) > 0
         then 0
         else 1
    end

I am not sure there is a clean way to pass arguments between the generic test_fail and the specific test to be failed.
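For concreteness, a minimal sketch of such a wrapper, assuming the inner test's rendered SQL could somehow be handed in as a string and that it returns a single count column named test_result (both assumptions, and exactly the awkward argument-passing described above):

{% macro test_fail(model, inner_test_sql) %}
    -- purely illustrative: invert a count-style schema test by wrapping its SQL
    with inner_test as (
        {{ inner_test_sql }}
    )
    select
        case when (select test_result from inner_test) > 0
             then 0   -- the inner test found problems, so the expect-fail test passes
             else 1   -- the inner test found nothing, so the expect-fail test fails
        end as test_result
{% endmacro %}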

Finally, we could rewrite all tests as macros that produce tables and make a pass version and a fail version.

Who will this benefit?

Everyone who uses shared tooling, through improved test coverage. That is probably most dbt users.

Are you interested in contributing this feature?

Yes. If we can settle on an interface, I would be excited to work on this. @clrcrl tells me that you have plans to decouple the test dataset from the count(*) wrapper. That might be a good opportunity to add the expect flag to the test config, defaulting to expect: "=0" while also allowing expressions like "=1" or ">0", perhaps with pass and fail accepted as synonyms for "=0" and ">0".

@zemekeneng zemekeneng added enhancement New feature or request triage labels Dec 28, 2020
@jtcohen6 jtcohen6 added dbt tests Issues related to built-in dbt testing functionality and removed triage labels Jan 4, 2021
@jtcohen6
Contributor

jtcohen6 commented Jan 4, 2021

@zemekeneng This is really interesting! Thanks for the detailed writeup. We've spent years building the instincts around "no rows / 0 count means pass, yes rows / >0 count means fail," "select/count all the records you don't want." While it's now an acquired taste, and one I've come to like, I don't think it's radically more intuitive than "pass and fail when I say so."

Currently, test coverage for tests in dbt-utils only contains cases where tests pass. We are just assuming that these generic test would fail given a failing dataset, but we are not verifying this to be the case through tests.

This is a very good point. By writing tests with an eye only toward them passing—a major achievement, to have data be exactly the way we hope—we're still missing an important piece of the software / unit testing puzzle.

Future

Most directly, this issue feels kin to #2219, which proposes adapting the warn_after + error_after specifications from source freshness for use by schema + data tests, in lieu of the binary severity: warn|error.

Your explanation above has helped me realize that warn_after and error_after may still be too constraining, since they operate with the built-in assumption (true for stale source data) that bigger / later is worse.

Instead, I like your proposal for simple boolean expressions using =, <, >. Instead of specifying when the test should pass (expect), though, I'd prefer to keep the specification around when and how the test shouldn't pass.

What do you think about a spec like this:

version: 2
models:
  - name: my_model
    columns:
      - name: not_unique_col 
        tests:
          - test_one:
              warn: ">0"     # cf. `severity: warn`, `warn_after: 0`
          - test_two:
              error: "<50%"  # raise an error if majority of records *are not* returned by the test query
          - test_three:
              warn: "!=0"
              error: "!=0"       # `error` always takes precedence over `warn`

By default, tests have {warn: none, error: "!=0"}.

Current

Another alternative is to have some type of test_fail macro, that would consume any other test, run its sql as an ephemeral model wrapped in.

This is the direction I was thinking in, since it's possible today and would mostly achieve the desired behavior. I imagine your config would look something like:

models:
  - name: my_model
    columns:
      - name: not_unique_col
        tests:
          - test_fail:
              - test_macro: test_unique
                expect: "=1"

It would require some tricky business to render the test macro from the context, but it could be hacked. Much trickier would be configuring the severity based on the test result.

I'm going to tag this for consideration in v0.20.0 ("The Test Release"); I think it neatly glosses #2219.

@jtcohen6 jtcohen6 added this to the Oh-Twenty [TBD] milestone Jan 4, 2021
@MarkMacArdle

This is something I really want too, but I'd like to extend the scope to also include compile errors. If this should be raised as a separate issue, please let me know. This would be a bit different from the original issue, as you'd be checking for a compile error rather than a database error.

I wanted this feature when writing schema tests that raise compile errors to validate inputs (like this in dbt-utils). Currently I manually check that the compile errors are raised when they should be, but I don't see a way of creating a test to do that.

Something like below would be useful:

version: 2
models:
  - name: my_model
    columns:
      - name: not_unique_col 
        tests:
          - my_test:
              expect: compile_error

@jtcohen6
Contributor

jtcohen6 commented Feb 9, 2021

Hey @MarkMacArdle, what's the context in which you'd want a test to raise a compilation error? Would it be for integration-testing the my_test schema test definition in the package/project that defines it, to make sure that it raises the error given an invalid argument? If so, I think that should be accomplished with a little bit of python/bash wrapping, such as we do for dbt and dbt-utils integration tests.

This does feel a bit different to me from the requested feature, which proposes an expected/thresholded failure mode only when the test SQL succeeds and returns results. I don't see a config like expect/error/warn affecting the behavior of a dbt test that raises a compilation error or returns a database error because of malformed SQL. The latest version of run_results.json distinguishes between these scenarios by calling the former error, and the latter runtime error.

@MarkMacArdle

Thanks for the input @jtcohen6, I've moved this to its own issue now (#3060) and added some more detail on my situation.

@jtcohen6
Contributor

Closing in favor of the concrete proposal for warn_if, error_if test configs over in #3258. That's where we'll make it happen.
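The proposal there keeps the threshold-expression idea discussed above. Under it, a config might look roughly like this (thresholds arbitrary, and the exact syntax is #3258's to settle):

version: 2
models:
  - name: my_model
    columns:
      - name: not_unique_col
        tests:
          - unique:
              config:
                severity: error
                warn_if: ">0"
                error_if: ">100"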

@mike-weinberg

Hi @jtcohen6!

I'd like to reopen this.

In traditional software engineering, one might have unit tests that specifically expect failure. This ensures that known-good code throws specific errors on known-bad data: for example, if your code defines input conditions for which it throws value errors, a unit test might pass only if that value error is thrown for an appropriately "bad" input.

prior art:

In DBT, we can only test via dbt tests, and so for me the most obvious way to have something akin to unit tests is if we can use something like a seed file with known-bad data, test on it, and, conditioned on the target/environment, expect a failure. I imagine this running in CI/CD.

I'm not sure what this needs to look like, since in a way we'd want to be able to associate specific targets/environments with specific failures. So there might be a need to support context awareness via, e.g., Jinja, and the expect subproperty might take some sort of JSON describing when a test should pass or fail. Perhaps this should live separately, so that tests can be defined in the positive sense and a regression-test spec makes the association between specific tests, specific targets/environments, and specific expectations of pass/failure.
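Purely as a strawman of that separate spec (nothing below exists in dbt; every key and name is hypothetical):

# hypothetical regression-test spec, kept apart from the positively-defined tests
regression_expectations:
  - test: unique_my_model_not_unique_col
    targets:
      ci_bad_seed: fail   # target seeded with known-bad data, so the test should fail
      prod: pass          # everywhere else, the test should pass as usual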

@jtcohen6
Contributor

jtcohen6 commented Dec 8, 2023

In DBT, we can only test via dbt tests, and so for me the most obvious way to have something akin to unit tests is if we can use something like a seed file with known-bad data, test on it, and, conditioned on the target/environment, expect a failure. I imagine this running in CI/CD.

@mike-weinberg Have you had a chance to read one of the coolest GitHub discussions from the past few months? Better yet: this work is in flight! We're planning to have it in beta a few months from now.

@mike-weinberg

AMAZE. That will work for us, thanks!
