Add IGNORE NULLS clause to various Window functions#10568
Add IGNORE NULLS clause to various Window functions#10568rongrong merged 1 commit intoprestodb:masterfrom
Conversation
|
@electrum @martint @maciejgrzybek I ended up having to recreate this pull request after inadvertently wiping out all of my changes in the original PR. Please have a look at this. |
e5fcc25 to
5790346
Compare
|
Hey @ptkool - excited to see all the work you've done on this, I have a use case & was looking for it, so just stopping by to say thanks! |
|
I was just looking for the functionality. would give a +1 to the feature. |
|
Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please sign up at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need the corporate CLA signed. If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks! |
|
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks! |
1 similar comment
|
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks! |
|
This will be a really nice to have feature! Will avoid all the workarounds currently needed. Hope this PR is very close to acceptance! |
|
Anything new about this? |
|
@electrum ? This is very important feature!! |
|
any updates here? this is a must-have feature!! |
|
@ptkool Could you please squash the commits into logical features? If there's only one you can just squash all into one commit. Thanks! |
|
workaround for now: |
|
This pull request has been automatically marked as stale because it has not had recent activity. If you'd still like this PR merged, please comment on the task, make sure you've addressed reviewer comments, and rebase on the latest master. Thank you for your contributions! |
|
@ptkool bumping this PR - this would be a huge help!! |
83c8050 to
40338d3
Compare
|
@rongrong Can I get this reviewed again? |
57bcb64 to
1105314
Compare
199a254 to
e0bf716
Compare
|
Also bumping, would love this added. For those stuck I was able to use this solution on stackoverflow to come up with a stop-gap in Presto. |
rongrong
left a comment
There was a problem hiding this comment.
Thanks for working on this. Sorry about the late review. I somehow missed it earlier. A few high level comments:
- We structure commits logically. So when we fix things, instead of creating a commit as "fix..." we just modify the commit directly. For this PR, I think you can merge all commits into 1. You can force push to the branch afterwards.
- Since
RESPECT NULLSandIGNORE NULLSis a common functionality to allValueWindowFunction, you might want to restructure the code so this is handled in the parent class to avoid duplicate logic.
presto-parser/src/main/java/com/facebook/presto/sql/parser/AstBuilder.java
Outdated
Show resolved
Hide resolved
presto-parser/src/main/java/com/facebook/presto/sql/ExpressionFormatter.java
Outdated
Show resolved
Hide resolved
b86cbcb to
a94efcc
Compare
There was a problem hiding this comment.
I don't think you need to add ignoreNulls to Aggregation. RESPECT NULLS | IGNORE NULLS is only relevant for ValueWindowFunction. It's probably better to add a semantic check in ExpressionAnalyzer to make sure this is only specified for related window functions, and throw an explicit SemanticException otherwise.
There was a problem hiding this comment.
I have removed ignoreNulls from Aggregation.
There was a problem hiding this comment.
As far as adding a semantic check in ExpressionAnalyzer, there doesn't appear to be a nice way of determining whether a function is a value window function (other than checking the function name against a list of names).
There was a problem hiding this comment.
I think ignoreNulls should be a variable in ValueWindowFunction. You can have a separate setIgnoreNulls in ValueWindowFunction, that way you don't need to mess with the constructors. ignoreNulls is not really part of the function inputs anyways. If you keep a separate currentNonNullPosition in ValueWindowFunction, I believe the null handling can be done in ValueWindowFunction.processRow.
There was a problem hiding this comment.
I have made the suggested change with ignoreNulls.
There was a problem hiding this comment.
I don't believe moving the null handling to ValueWindowFunction.processRow will simplify the code - the logic isn't duplicated in any of the window functions.
There was a problem hiding this comment.
You probably still need special handling on each function, but how many non-null values there are since frameStart can be tracked. The current implementation is really inefficient as it tries to iterate from the beginning of the frame on every position to count whether it got enough non-null values. This is O(n^2) while you could have tracked this in O(n).
19d5c09 to
94ee06f
Compare
Add parser null treatment clause test Fix compilation error Clean up Remove ignoreNulls attribute from Aggregation class Fix documentation
94ee06f to
4f1ee84
Compare
See #4554.
This PR adds the IGNORE/RESPECT NULLS clause to window functions LAG, LEAD, FIRST_VALUE, LAST_VALUE, and NTH_VALUE.