Add test for 'a has fewer rows than b' #221

dmarts · 2020-05-13T14:16:25Z

Description & motivation

Add a fewer_rows_than test.

This lets an analyst confirm that our model (A) has fewer rows than some reference model (B). It's useful for confirming that the number of rows has actually reduced after applying some filter or inner join in model A.

Checklist

I have verified that these changes work locally
I have updated the README.md (if applicable)
I have added tests & descriptions to my models (and macros if applicable)

clrcrl · 2020-05-18T14:05:20Z

Hi dan! Thanks for the PR! I'm a little behind on a few other things, so just to set expectations, I probably won't look at this until late this week or potentially later than that 😅

dmarts · 2020-05-18T14:35:08Z

You do your thing 👍

clrcrl

One question about the design of this test, then I'll review fully! (Trying to get these done before Christmas haha!)

clrcrl · 2020-12-23T16:17:42Z

macros/schema_tests/fewer_rows_than.sql

+    select 
+        case
+            when count_model > count_comparison then count_model - count_comparison + 1
+            when count_model = count_comparison then count_model - count_comparison


I've spent more minutes looking at this line than I'd care to admit.

In this case, the two tables have the same number of rows, right? Do we expect that to be a passing, or failing case?

dmarts · 2021-01-04

Who wrote this? Someone in the dim and distant past. Yikes. Anyway, we're trying to find the number of errors, so line 24+ is supposed to find the number of excess rows:

If count_model = count_comparison then we have 1 too many rows, so return 1

If count_model > count_comparison then we have count_model - count_comparison + 1 too many rows, so return that number

If count_model < count_comparison then we're good, so return 0

So I think L27 should be:
when count_model = count_comparison then 1

That's one bug squashed. Is it clearer now? If it's failed the very first review by an experienced user then it probably needs some more work 👍
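For readers following the three cases above, here is a minimal Python sketch of the same rule. It is only a model of the logic described in the comment, not the macro itself; the function name `excess_rows` is hypothetical, while `count_model` and `count_comparison` mirror the column names in the diff.

```python
def excess_rows(count_model: int, count_comparison: int) -> int:
    """Return how many rows too many the model has (0 means the test passes).

    Hypothetical helper: it mirrors the case logic described in the thread,
    where the model must have strictly fewer rows than the comparison.
    """
    if count_model < count_comparison:
        # Strictly fewer rows: nothing to report.
        return 0
    # Equal counts are already one row too many; each extra row adds one more.
    return count_model - count_comparison + 1


print(excess_rows(5, 10))   # 0 -> pass
print(excess_rows(10, 10))  # 1 -> fail
print(excess_rows(12, 10))  # 3 -> fail
```

Folding the "equal" and "greater" cases into one expression works because `count_model - count_comparison + 1` evaluates to 1 when the counts match, which is the fix proposed above.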

clrcrl · 2021-01-04

Ah got it! (Welcome back from the Christmas break 😉 )

I think one thing that is throwing me is "count model" and "count comparison": these terms are generic, which makes the logic a little challenging to follow. I think we can make these more explicit, and maybe adjust the case logic to be a little more verbose.

What do you think of something like this?

    select
        (count_model_with_more_rows - count_model_with_fewer_rows) as row_count_delta,
        case
            -- pass the test if the delta is positive (i.e. return the number 0)
            when row_count_delta > 0 then 0
            -- fail the test if they are the same number
            when row_count_delta = 0 then 1
            -- fail the test for negative numbers
            when row_count_delta < 0 then abs(row_count_delta)
        end as excess_rows
    from counts
    )
    select excess_rows from final

clrcrl · 2021-01-04

I just realized that this may not work on postgres. Most cloud warehouses support lateral column aliasing (i.e. referencing a calculated column's alias within the same select), but I suspect postgres does not, so we might need to move the "case" statement into another CTE! 😬

dmarts · 2021-01-05

Your suggestion looks good! I didn't know that about postgres but the very verbose CTE-heavy version I've just committed might solve it...
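As an illustration of the portable shape discussed above, the sketch below computes the delta in one CTE and applies the `case` in the next, so no alias is reused inside the select that defines it. It is not the committed macro: the table names `model_with_fewer_rows` and `model_with_more_rows`, the `deltas` CTE, and the `run_test` helper are assumptions made for the example, and SQLite (via Python's standard `sqlite3` module) stands in for the warehouse.

```python
import sqlite3

# Hypothetical table names standing in for the two dbt models under test.
QUERY = """
with counts as (
    select
        (select count(*) from model_with_fewer_rows) as count_model_with_fewer_rows,
        (select count(*) from model_with_more_rows) as count_model_with_more_rows
),
deltas as (
    -- the delta gets its own CTE so the next step can refer to it by name
    select
        count_model_with_more_rows - count_model_with_fewer_rows as row_count_delta
    from counts
),
final as (
    select
        case
            when row_count_delta > 0 then 0                     -- pass
            when row_count_delta = 0 then 1                     -- same size: fail
            when row_count_delta < 0 then abs(row_count_delta)  -- too many rows: fail
        end as excess_rows
    from deltas
)
select excess_rows from final
"""


def run_test(fewer: int, more: int) -> int:
    """Build two throwaway tables with the given row counts and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table model_with_fewer_rows (id integer)")
    conn.execute("create table model_with_more_rows (id integer)")
    conn.executemany(
        "insert into model_with_fewer_rows values (?)", [(i,) for i in range(fewer)]
    )
    conn.executemany(
        "insert into model_with_more_rows values (?)", [(i,) for i in range(more)]
    )
    (excess,) = conn.execute(QUERY).fetchone()
    conn.close()
    return excess


print(run_test(3, 5))  # 0 -> pass
print(run_test(4, 4))  # 1 -> fail
print(run_test(6, 4))  # 2 -> fail
```

This follows the `abs(row_count_delta)` wording of the suggestion, so the failing number differs by one from the earlier "+ 1" formulation when the model is larger; either way a non-zero result fails the test.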

clrcrl · 2021-03-10T15:46:54Z

Reopened as #343 (to integrate upstream changes)

Macro and test case

6a29323

dmarts marked this pull request as ready for review May 13, 2020 14:16

dmarts requested a review from clrcrl as a code owner May 13, 2020 14:17

dmarts changed the title ~~Macro and test case~~ feature/fewer_rows_than May 13, 2020

dmarts changed the title ~~feature/fewer_rows_than~~ Add test for 'a has fewer rows than b' May 13, 2020

clrcrl changed the base branch from master to dev/0.7.0 December 23, 2020 16:02

clrcrl reviewed Dec 23, 2020

Update fewer_rows_than.sql

033dce9

clrcrl force-pushed the dev/0.7.0 branch 5 times, most recently from bbba960 to 60a3b3c Compare January 11, 2021 15:52

clrcrl approved these changes Mar 10, 2021

clrcrl mentioned this pull request Mar 10, 2021

Feature/fewer rows than #343

Merged

3 tasks

clrcrl closed this Mar 10, 2021
