-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert nth_value
to UDAF
#11287
Convert nth_value
to UDAF
#11287
Conversation
Pending methods are: - `accumulator` - `state_fields` - `reverse_expr`
This error message is manually formatted to remain consistent with existing error statements. It is not formatted by running: ``` cargo test -p datafusion-sqllogictest --test sqllogictests errors -- --complete ```
This reverts commit 1913efda49e585816286b54b371d4166ac894d1f.
This reverts commit 4bb204f6ec8028c4e3313db5af3fabfcdaf7fea8.
This reverts commit fa61ce62dcae6eae6f8e9c9900ebf8cff5023bc0.
self.nullable, // This should be the same as field() | ||
format_state_name(self.name(), "nth_value"), | ||
Field::new("item", args.input_type.clone(), true), | ||
false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think given the existing nth function, we should let nullable configurable. And, the nullability is actually for the list element. We should add nullable in StateFieldArgs
.
let mut fields = vec![Field::new_list(
format_state_name(self.name(), "nth_value"),
Field::new("item", args.input_type.clone(), self.nullable),
false)]
@eejbyfeldt is working on it in #11063
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Marked it as a TODO in comments so that it can be completed once it is added to StateFieldArgs
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
based on #11299, I think we will have non-null for nth_value too (similar to first value).
self.nullable, // This should be the same as field() | ||
format_state_name(self.name(), "nth_value"), | ||
Field::new("item", args.input_type.clone(), true), | ||
false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think given the existing nth function, we should let nullable configurable. And, the nullability is actually for the list element. We should add nullable in StateFieldArgs
.
let mut fields = vec![Field::new_list(
format_state_name(self.name(), "nth_value"),
Field::new("item", args.input_type.clone(), self.nullable),
false)]
@eejbyfeldt is working on it in #11063
Summary:
@jayzhan211 Thank you for the code review and providing context. It was very helpful. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me too -- thank you @jcsherin and @jayzhan211
@@ -39,8 +39,6 @@ pub enum AggregateFunction { | |||
Max, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this list is quite close to empty 🤞
ordering_req: LexOrdering, | ||
signature: Signature, | ||
/// Determines whether `N` is relative to the beginning or the end | ||
/// of the aggregation. When set to `true`, then `N` is from the end. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
I merged this PR up from main to ensure all the CI still passes and once they do I plan to merge it |
🚀 |
Thanks again @jayzhan211 and @jcsherin |
* Copies `NthValueAccumulator` to `functions-aggregate` * Partial implementation of `AggregateUDFImpl` Pending methods are: - `accumulator` - `state_fields` - `reverse_expr` * Implements `accumulator` method * Retains existing comments verbatim * Removes unnecessary path prefix * Implements `reverse_expr` method * Adds `nullable` field to `NthValue` * Revert to existing name * Implements `state_fields` method * Removes `nth_value` from `physical-expr` * Adds default * Exports `nth_value` * Fixes build error in physical plan roundtrip test * Minor: formatting * Parses `N` from input expression * Fixes build error by using `nth_value_udaf` * Fixes `reverse_expr` by passing correct `N` * Update plan with lowercase UDF name * Updates error message for incorrect no. of arguments This error message is manually formatted to remain consistent with existing error statements. It is not formatted by running: ``` cargo test -p datafusion-sqllogictest --test sqllogictests errors -- --complete ``` * Fixes nullable "item" in `state_fields` * Minor: fix formatting after resolving conflicts * Updates multiple existing plans with lowercase name * Implements `retract_batch` for window aggregations * Fixes: regex mismatch for error message in CI * Revert "Updates multiple existing plans with lowercase name" This reverts commit 1913efda49e585816286b54b371d4166ac894d1f. * Revert "Implements `retract_batch` for window aggregations" This reverts commit 4bb204f6ec8028c4e3313db5af3fabfcdaf7fea8. * Fixes: use builtin window function instead of udaf * Revert "Updates error message for incorrect no. of arguments" This reverts commit fa61ce62dcae6eae6f8e9c9900ebf8cff5023bc0. * Refactor: renames field and method * Removes hack for nullability * Minor: refactors `reverse_expr` * Minor: removes unncessary path prefix * Minor: cleanup arguments for creating aggregate expr * Refactor: extracts `merge_ordered_arrays` to `physical-expr-common` * Minor: adds todo for configuring nullability * Retrigger CI --------- Co-authored-by: Andrew Lamb <[email protected]>
* Copies `NthValueAccumulator` to `functions-aggregate` * Partial implementation of `AggregateUDFImpl` Pending methods are: - `accumulator` - `state_fields` - `reverse_expr` * Implements `accumulator` method * Retains existing comments verbatim * Removes unnecessary path prefix * Implements `reverse_expr` method * Adds `nullable` field to `NthValue` * Revert to existing name * Implements `state_fields` method * Removes `nth_value` from `physical-expr` * Adds default * Exports `nth_value` * Fixes build error in physical plan roundtrip test * Minor: formatting * Parses `N` from input expression * Fixes build error by using `nth_value_udaf` * Fixes `reverse_expr` by passing correct `N` * Update plan with lowercase UDF name * Updates error message for incorrect no. of arguments This error message is manually formatted to remain consistent with existing error statements. It is not formatted by running: ``` cargo test -p datafusion-sqllogictest --test sqllogictests errors -- --complete ``` * Fixes nullable "item" in `state_fields` * Minor: fix formatting after resolving conflicts * Updates multiple existing plans with lowercase name * Implements `retract_batch` for window aggregations * Fixes: regex mismatch for error message in CI * Revert "Updates multiple existing plans with lowercase name" This reverts commit 1913efda49e585816286b54b371d4166ac894d1f. * Revert "Implements `retract_batch` for window aggregations" This reverts commit 4bb204f6ec8028c4e3313db5af3fabfcdaf7fea8. * Fixes: use builtin window function instead of udaf * Revert "Updates error message for incorrect no. of arguments" This reverts commit fa61ce62dcae6eae6f8e9c9900ebf8cff5023bc0. * Refactor: renames field and method * Removes hack for nullability * Minor: refactors `reverse_expr` * Minor: removes unncessary path prefix * Minor: cleanup arguments for creating aggregate expr * Refactor: extracts `merge_ordered_arrays` to `physical-expr-common` * Minor: adds todo for configuring nullability * Retrigger CI --------- Co-authored-by: Andrew Lamb <[email protected]>
Which issue does this PR close?
Closes #11284.
Rationale for this change
What changes are included in this PR?
nth_value
to UDAFN
will be from the end instead of the startnth_value
is used in window aggregations the existing builtin function is used instead of this UDAF. An explicit bypass is done where the logical plan nodes are created. This is similar to how it works forfirst_value
andlast_value
.Are these changes tested?
roundtrip_physical_plan
is updatedAre there any user-facing changes?