Skip to content

[native] Handle Date type in toFilter function#19593

Merged
aditi-pandit merged 1 commit intoprestodb:masterfrom
isadikov:fix_date_multirange_error
Jun 12, 2023
Merged

[native] Handle Date type in toFilter function#19593
aditi-pandit merged 1 commit intoprestodb:masterfrom
isadikov:fix_date_multirange_error

Conversation

@isadikov
Copy link
Contributor

@isadikov isadikov commented May 9, 2023

This PR fixes TPC-DS Q83 error:

VeloxUserError: Filter(MultiRange, deterministic, null not allowed): testInt64() is not supported

In query 83, the date filter is constructed as a generic MultiRange filter instead of BigintMultiRange as dates are converted into int64_t. MultiRange filter does not support testInt64 method and crashes. Instead, we should handle dates the same way we handle integers.

This is complimentary to facebookincubator/velox#4794.

I added a test that verifies this change.

== NO RELEASE NOTE ==

Fixes #19522.

@isadikov
Copy link
Contributor Author

isadikov commented May 9, 2023

cc @majetideepak @aditi-pandit to review. Thanks!

Copy link
Collaborator

@majetideepak majetideepak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@isadikov can we add a test?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use the if block above that handles integers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we can, however, there would still be a if-else statement because we need to use dateRangeToFilter instead of bigintRangeToFilter.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed the dateRangeToFilter difference. This is fine. We can remove this when Date becomes a logical type.

@isadikov
Copy link
Contributor Author

The test part is mentioned in the description. I could not reproduce this issue in a test so I decided to include the fix only. We will add TPC-DS queries later so this will be covered. For now, I can only manually verify the patch.

@majetideepak
Copy link
Collaborator

majetideepak commented May 16, 2023

I could not reproduce this issue in a test

Why can we not reproduce this in a test? Why only in TPC-DS Q83? Is this dependent on Decimal Types?

@isadikov isadikov force-pushed the fix_date_multirange_error branch 2 times, most recently from 99e5cef to 036a2f7 Compare May 17, 2023 20:28
@isadikov
Copy link
Contributor Author

I added a test to verify the fix, the test fails without the fix and passes with the fix.

@isadikov isadikov force-pushed the fix_date_multirange_error branch from 036a2f7 to 5b6dee4 Compare May 17, 2023 20:45
@majetideepak
Copy link
Collaborator

@isadikov can you remove the "Advance Velox" bits?

@isadikov isadikov force-pushed the fix_date_multirange_error branch 2 times, most recently from 1697a5b to 1e789e4 Compare May 18, 2023 06:05
@isadikov
Copy link
Contributor Author

Oh, sorry about that, it was unintentional. I updated the PR.

@isadikov isadikov requested a review from majetideepak May 18, 2023 06:06
Copy link
Contributor

@aditi-pandit aditi-pandit May 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we require to do this integer range combination logic for date filters ? Ideally, date range is from 01-01-0001 to 12-31-9999, and the complex logic in combineIntegerRanges might be overkill for dates.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true, having a complicated combination logic is not required for date filters because, as you mentioned, it is a much smaller range. I suppose I just wanted to keep this similar to integer ranges and avoid implementing combineDateRanges function since combineIntegerRanges is a superset.

Would you like to have a separate combineDateRanges function for date ranges?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the date ranges just be retained as originally specified without a combine function ? It should be fine to keep the original ranges just so I think.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code block merges multiple ranges in a domain into a single one. I think we have to combine here.

Copy link
Contributor

@aditi-pandit aditi-pandit May 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

combineIntegerRanges is a misnomer in the sense its not really combining range filters when there is overlap. It is doing :
i) Creating a values filter if all the filters are checking a single value
ii) Creating NegatedBigIntRange filters if there is only a small big int interval is that is omitted by the range filters.
iii) Else its returning a MultiRange Filter with the individual ranges.

For dates specifically the negated filters logic will not get triggered. For the most part only MultiRangeFilter will be returned. So this function is not really useful for Date filters.

In general, I feel its better to return a BigIntMultiRange filter here and then add optimizations for Date later as needed.

Copy link
Contributor Author

@isadikov isadikov May 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can follow up on streamlining the logic for Date type in a separate PR once this PR is merged if that works.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aditi-pandit thanks for the clarification.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update, thanks!

@isadikov isadikov force-pushed the fix_date_multirange_error branch 3 times, most recently from 8a5f08a to 251b58a Compare May 21, 2023 22:42
@isadikov isadikov requested a review from aditi-pandit May 21, 2023 22:42
@isadikov isadikov force-pushed the fix_date_multirange_error branch 3 times, most recently from e13b512 to 1152367 Compare May 22, 2023 06:30
@isadikov
Copy link
Contributor Author

@aditi-pandit @majetideepak Could you review this PR also? I had to make a few changes including CI build to make the test pass - we need Parquet support for it. Thanks!

@majetideepak majetideepak marked this pull request as ready for review May 24, 2023 11:45
@majetideepak majetideepak requested review from a team as code owners May 24, 2023 11:45
@majetideepak majetideepak requested a review from ajaygeorge May 24, 2023 11:45
@isadikov isadikov force-pushed the fix_date_multirange_error branch 2 times, most recently from d29cfa7 to f08871b Compare May 25, 2023 08:13
@isadikov isadikov force-pushed the fix_date_multirange_error branch 2 times, most recently from e54eacd to 6c7a6dc Compare June 4, 2023 21:55
@isadikov
Copy link
Contributor Author

isadikov commented Jun 4, 2023

Strange, my first rebase removed the tests from the patch. I had to apply the changes manually and recreate the commit to have all of the original changes in the PR. All should be good now.

@isadikov isadikov force-pushed the fix_date_multirange_error branch from 6c7a6dc to c08f489 Compare June 9, 2023 08:06
@isadikov
Copy link
Contributor Author

@aditi-pandit Could you review this PR? Thanks.

@isadikov isadikov requested review from aditi-pandit and removed request for aditi-pandit June 11, 2023 19:37
Copy link
Contributor

@aditi-pandit aditi-pandit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @isadikov. This looks good.

@aditi-pandit aditi-pandit merged commit 37a0517 into prestodb:master Jun 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TPC-DS Q83: Filter(MultiRange, deterministic, null not allowed): testInt64() is not supported

3 participants