Skip to content

Conversation

@findepi
Copy link
Member

@findepi findepi commented Jan 17, 2023

When predicate is not of boolean type (e.g. null literal being of
unknown type), the Analysis holds a coercion that should be applied
to it. Before the change, the coercion was not getting applied though.

@findepi
Copy link
Member Author

findepi commented Jan 17, 2023

cc @assaf2 @martint

@findepi findepi force-pushed the findepi/test-predicates-never-of-unknown-type-7d70de branch 2 times, most recently from d53d8a3 to 605703b Compare January 18, 2023 09:02
@findepi findepi changed the title Test predicates never of unknown type Apply coercions when creating FilterNode Jan 18, 2023
@findepi findepi marked this pull request as ready for review January 18, 2023 09:02
@findepi findepi added the no-release-notes This pull request does not require release notes entry label Jan 18, 2023
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do something more generic here? Like checking if predicate is of type Literal but not a BooleanLiteral?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make it more complicated, yes.
However we cannot analyze expression type fully, as we don't have symbol types here, so we fail short of being really generic. I draw the line here, we can draw the line somewhere else.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not add the check for NullLiteral. We don't do this in other PlanNodes which require boolean expressions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can all agree that FilterNode.predicate should be of boolean type.
Yet, the code base required updates in quite a few places to bring reality in line with the design.
The only known to me way to effectively find those places was to add the check here.
Now, we can hope that codebase is now correct and we can remove the check, but how will we ensure the problems do not come back with some new code?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see that the PlanNode constructor seems to be the best place for the check. If we do the check immediately after the initial planning, we might miss wrong filters created in the Optimizer. If we wait until execution, we might miss / break optimizations.

This check is not satisfying though, as it only detects NullLiteral. If you want to keep it, then maybe add a TODO explaining what's missing. That TODO will be easy to address in the future, when we have the new IR with types included.

Also, consider adding a similar check in other PlanNodes, at least those covered by this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is not satisfying though, as it only detects NullLiteral.

Correct, the check is a required check, but doesn't guarantee well-formedness.

consider adding a similar check in other PlanNodes

I would prefer to defer that, if i were to choose

If you want to keep it, then maybe add a TODO explaining what's missing.

Good idea, will add a comment!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the connection of this PR and #15558, can you also remove the handling of this case from RemoveTrivialFilters and then whoever gets the conflict will solve it, or do you have other plans?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also remove the handling of this case from RemoveTrivialFilters

Before my commit, there are cases where RemoveTrivialFilters can remove null filter -- it's when it's written as NULL in SQL and coercions are not applied. So changing RemoveTrivialFilters within this PR would be removing existing optimization. However, applying the coercions also removes the existing optimization, as the optimization no longer kicks in (becomes dead code)

Huh, I think your change needs to go first.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't add the check in FilterNode constructor, like I suggested, we could retain this test case, and let it fail because of checkArgument in RemoveTrivialFilters. However, I think that the Optimizer rule should not validate the node, but instead just ignore NullLiteral.

I think that for now we can skip the validation altogether. That would be consistent with how we handle other sites where boolean type is required. If we want to validate the expression types, we should introduce a new mechanism that would have access to full type information (including symbols), and apply it to each site, that is: filter, joins, aggregation filter, merge etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I think that the Optimizer rule should not validate the node, but instead just ignore NullLiteral.

Yes, a rule doesn't need to validate a node, but it also is free to fail when a node doesn't adhere to a design. For example, a ProjectNode cannot have multiple sources, so a rule can simply fail if one does.

I think that for now we can skip the validation altogether. That would be consistent with how we handle other sites where boolean type is required

answered at the other comment: 020b366#r1085170555

@sopel39 sopel39 requested review from kasiafi and removed request for sopel39 January 18, 2023 10:02
@findepi findepi force-pushed the findepi/test-predicates-never-of-unknown-type-7d70de branch 2 times, most recently from 605703b to 1afd5e3 Compare January 18, 2023 12:46
@findepi
Copy link
Member Author

findepi commented Jan 23, 2023

CI #15809

@findepi findepi force-pushed the findepi/test-predicates-never-of-unknown-type-7d70de branch from 1afd5e3 to da8ef8d Compare January 23, 2023 10:20
@findepi

This comment was marked as outdated.

@findepi findepi requested a review from sopel39 January 23, 2023 10:26
@findepi findepi force-pushed the findepi/test-predicates-never-of-unknown-type-7d70de branch from da8ef8d to 5389bc0 Compare January 24, 2023 08:53
@findepi

This comment was marked as outdated.

@findepi findepi force-pushed the findepi/test-predicates-never-of-unknown-type-7d70de branch 2 times, most recently from 4914b03 to 020b366 Compare January 24, 2023 09:37
@findepi findepi requested a review from assaf2 January 24, 2023 09:39
@findepi findepi force-pushed the findepi/test-predicates-never-of-unknown-type-7d70de branch from 039b2fa to 09c97f2 Compare January 24, 2023 15:44
@findepi
Copy link
Member Author

findepi commented Jan 24, 2023

Thank you for your review! AC, PTAL

@findepi findepi requested a review from kasiafi January 24, 2023 15:45
@findepi findepi force-pushed the findepi/test-predicates-never-of-unknown-type-7d70de branch from 09c97f2 to d7d7e95 Compare January 24, 2023 16:01
This is for code consistency reasons. Most of the places where
`ExpressionMatcher` is constructed already goes through
`PlanMatchPattern.expression` factory method.
`UnwrapCastInComparison` works on any nested comparison expressions.
@findepi findepi force-pushed the findepi/test-predicates-never-of-unknown-type-7d70de branch from d7d7e95 to 76c3905 Compare January 25, 2023 16:30
@findepi
Copy link
Member Author

findepi commented Jan 25, 2023

Thank you for your review! AC, PTAL

@findepi findepi force-pushed the findepi/test-predicates-never-of-unknown-type-7d70de branch from 76c3905 to f59e623 Compare January 26, 2023 10:00
@findepi
Copy link
Member Author

findepi commented Jan 26, 2023

CI #15793

findepi and others added 3 commits January 26, 2023 20:51
`StatementAnalyzer.Visitor#analyzeRowFilter` captures necessary
coercions for filters that are not of boolean type (e.g. `null` literal
being of `unknown` type). Before the change, the coercions where not
getting applied though.
When predicate is not of boolean type (e.g. `null` literal being of
`unknown` type), the `Analysis` holds a coercion that should be applied
to it.  Before the change, the coercion was not getting applied though.

Co-authored-by: Assaf Bern <[email protected]>
@findepi findepi force-pushed the findepi/test-predicates-never-of-unknown-type-7d70de branch from f59e623 to 402fc7b Compare January 26, 2023 19:52
@sopel39
Copy link
Member

sopel39 commented Jan 26, 2023

Is there a potential perf impact? Does it need benchmarks?

@findepi
Copy link
Member Author

findepi commented Jan 27, 2023

Is there a potential perf impact? Does it need benchmarks?

@sopel39 good question, thanks for thinking about this!

There is some possibility that certain filters now become pruned out where they didn't before.
However, this is for edge cases only. The TPC-H/DS plans didn't change, so benchmarks results wouldn't either.

@findepi
Copy link
Member Author

findepi commented Jan 27, 2023

CI #15793, #15770

@findepi findepi merged commit dc40577 into trinodb:master Jan 27, 2023
@findepi findepi deleted the findepi/test-predicates-never-of-unknown-type-7d70de branch January 27, 2023 08:52
@findepi findepi removed the no-release-notes This pull request does not require release notes entry label Jan 27, 2023
@findepi findepi mentioned this pull request Jan 27, 2023
@github-actions github-actions bot added this to the 407 milestone Jan 27, 2023
@sopel39
Copy link
Member

sopel39 commented Jan 27, 2023

However, this is for edge cases only. The TPC-H/DS plans didn't change, so benchmarks results wouldn't either.

I was thinking that maybe execution now could take longer?

@findepi
Copy link
Member Author

findepi commented Jan 27, 2023

you mean that the generated code is different? this i don't know, unfortunately.
@dain would be the best person to answer this.

@sopel39
Copy link
Member

sopel39 commented Jan 27, 2023

@findepi would you like to run benchmarks?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

5 participants