Add upper_bound_column to order by #660
Merged
resolves #659
This is a:
All pull requests from community contributors should target the main branch (default).
Description & motivation
The order by clause(s) used in the window functions in the mutually_exclusive_ranges test give non-deterministic results when more than one range within a partition has the same lower_bound. If zero-length ranges are not allowed, the test will FAIL (as it should) when this occurs. But if zero-length ranges are allowed and gaps are not required, it is reasonable to expect more than one range within a partition with the same lower_bound.
This PR changes the order by clause(s) to use {{ lower_bound_column }}, {{ upper_bound_column }} as the ordering criteria. This ensures that when multiple ranges share a lower_bound, they are sorted by their upper_bound. All zero-length ranges are then placed together, and any non-zero-length range with the same lower_bound appears as the last record in the group, so that its upper_bound is compared to the next distinct lower_bound.
Since the test only considers the lower and upper bound columns, the lack of a guaranteed order among records with identical lower and upper bounds shouldn't be a problem: such records are effectively interchangeable as far as the test is concerned. So this should be enough to guarantee deterministic results (i.e., the test will always PASS or always FAIL, unless changes are made to the dataset or to the test configuration).
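The effect of the tie-break can be illustrated outside the warehouse. The following is a minimal Python sketch, not the dbt macro itself: the helper name ranges_are_exclusive and the sample rows are hypothetical, and the check is reduced to "each upper bound is <= the next row's lower bound" for a single partition with zero-length ranges allowed. Trying every input order stands in for the arbitrary order in which a database may return tied rows.

```python
from itertools import permutations


def ranges_are_exclusive(ranges, sort_key):
    """Return True when every upper bound is <= the next row's lower bound.

    Hypothetical stand-in for the window-function comparison: rows are
    ordered by sort_key, then each row is compared with its successor.
    """
    # sorted() is stable, so rows that tie on sort_key keep their input order.
    ordered = sorted(ranges, key=sort_key)
    return all(
        upper <= next_lower
        for (_, upper), (next_lower, _) in zip(ordered, ordered[1:])
    )


# One partition: a zero-length range and a non-zero-length range share lower_bound 1.
rows = [(1, 1), (1, 3), (3, 5)]

# Ordering by lower bound only: the outcome depends on how the tie is broken.
by_lower = {ranges_are_exclusive(p, lambda r: r[0]) for p in permutations(rows)}

# Ordering by (lower bound, upper bound): the tie is resolved the same way every time.
by_both = {ranges_are_exclusive(p, lambda r: (r[0], r[1])) for p in permutations(rows)}

print(sorted(by_lower))  # [False, True]
print(sorted(by_both))   # [True]
```

With the single-column key, the same data passes or fails depending on whether (1, 1) or (1, 3) happens to come first. With the two-column key, (1, 3) is always last among the rows sharing lower bound 1, so its upper bound is always compared with the next distinct lower bound.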
Checklist
star() source)
limit_zero() macro in place of the literal string: limit 0
dbt.type_* macros instead of explicit datatypes (e.g. dbt.type_timestamp() instead of TIMESTAMP