[SPARK-52905][PYTHON] Arrow UDF for window #51593
LGTM! Would you fix the lint (too long line)?
27b9243 to fe36f03
Thanks, merged to master.
    child: SparkPlan): ArrowWindowPythonExec = {
  val evalTypes = windowExpression.map(w => WindowFunctionType.pythonEvalType(w).get)
  assert(evalTypes.distinct.size == 1,
    "All window functions must have the same eval type in WindowInPandasExec")
WindowInPandasExec? I guess you meant ArrowWindowPythonExec?
yes, will fix it in #51648
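For readers following the thread, the invariant that the assertion guards can be sketched in Python as below. This is only an illustration of the check, not the Scala code under review: the function name `single_eval_type` and the integer eval types in the example are hypothetical, and the error message uses the operator name the reviewer suggested.

```python
from typing import Hashable, Sequence


def single_eval_type(eval_types: Sequence[Hashable], operator: str = "ArrowWindowPythonExec") -> Hashable:
    """Return the one eval type shared by all window functions.

    Hypothetical helper: mirrors the idea of asserting that
    `evalTypes.distinct.size == 1` before building the operator.
    """
    distinct = set(eval_types)
    if len(distinct) != 1:
        # Name the operator that is actually being constructed in the message.
        raise ValueError(
            f"All window functions must have the same eval type in {operator}, "
            f"got {sorted(map(str, distinct))}"
        )
    return next(iter(distinct))
```

Calling `single_eval_type([7, 7, 7])` returns the shared value, while a mixed list such as `[7, 8]` raises `ValueError`.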
def wrap_window_agg_arrow_udf(f, args_offsets, kwargs_offsets, return_type, runner_conf, udf_index):
    window_bound_types_str = runner_conf.get("pandas_window_bound_types")
pandas_window_bound_types? Maybe we should have arrow_window_bound_types?
The Arrow UDF reuses the same conf key: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowWindowPythonEvaluatorFactory.scala#L71
Maybe we can give it a better name.
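To make the naming question concrete, here is a minimal sketch of how a worker-side wrapper could pick the bound type for one UDF from the shared conf entry. It assumes the value is a comma-separated string with one `bounded` or `unbounded` entry per UDF; that format, the helper name `select_window_bound_type`, and the `key` parameter are assumptions for illustration, not the code in this PR.

```python
from typing import Mapping

# Assumed set of valid entries; not taken from the PR itself.
_VALID_BOUND_TYPES = ("bounded", "unbounded")


def select_window_bound_type(
    runner_conf: Mapping[str, str],
    udf_index: int,
    key: str = "pandas_window_bound_types",
) -> str:
    """Return the window bound type for the UDF at `udf_index`.

    Hypothetical helper. Making `key` a parameter means a rename
    (for example to an Arrow-specific key) touches only one place.
    """
    raw = runner_conf.get(key)
    if raw is None:
        raise KeyError(f"runner conf is missing {key!r}")

    # Normalize entries such as " Bounded" to "bounded".
    bound_types = [t.strip().lower() for t in raw.split(",")]
    bound_type = bound_types[udf_index]

    if bound_type not in _VALID_BOUND_TYPES:
        raise ValueError(f"unknown window bound type: {bound_type!r}")
    return bound_type
```

With a conf of `{"pandas_window_bound_types": "unbounded, bounded"}`, index 0 selects `unbounded` and index 1 selects `bounded`.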
What changes were proposed in this pull request?
Arrow UDF for window
Why are the changes needed?
To make Arrow UDFs support window operations.
Does this PR introduce any user-facing change?
Not yet. Arrow UDF will be made public soon.
How was this patch tested?
New tests
Was this patch authored or co-authored using generative AI tooling?
No