[SPARK-44918][SQL][PYTHON] Support named arguments in scalar Python/Pandas UDFs #42617
dtenedor left a comment
Generally LGTM. I left one suggestion for a testing idea. Thanks a lot for adding this feature!!
sql/core/src/main/scala/org/apache/spark/sql/execution/python/PythonUDFRunner.scala
/*
 * Check if the named arguments:
 * - don't have duplicated names
 * - don't contain positional arguments
- * - don't contain positional arguments
+ * - don't contain positional arguments after named arguments
+ * - all map to valid argument names from the function declaration
The third item is not right for Python UDF/UDTF. Currently it relies on Python.
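A minimal Python sketch of the checks discussed in this thread, not the code from this PR. It assumes a hypothetical representation where each argument is a `(name, value)` pair and `name` is `None` for a positional argument; `check_named_args`, `call_udf`, and `add_scaled` are illustrative names. Duplicated names and positional-after-named arguments are rejected up front, while unknown names are left to Python, which raises `TypeError` at call time.

```python
from typing import Any, Callable, List, Optional, Tuple

# Hypothetical representation: (name, value); name is None for a positional argument.
Arg = Tuple[Optional[str], Any]


def check_named_args(args: List[Arg]) -> None:
    """Reject duplicated names and positional arguments after named ones."""
    seen = set()
    for name, _ in args:
        if name is None:
            if seen:
                raise ValueError("positional argument follows named argument")
        elif name in seen:
            raise ValueError(f"duplicated argument name: {name}")
        else:
            seen.add(name)


def call_udf(func: Callable[..., Any], args: List[Arg]) -> Any:
    """Validate the arguments, then call func with positional and keyword parts."""
    check_named_args(args)
    positional = [value for name, value in args if name is None]
    named = {name: value for name, value in args if name is not None}
    # Names that do not exist in func's signature are not checked here:
    # Python itself raises TypeError for an unexpected keyword argument.
    return func(*positional, **named)


def add_scaled(a, b):
    return a + 10 * b
```

With this sketch, `call_udf(add_scaled, [("b", 2), ("a", 1)])` binds by name regardless of order, and passing an unknown name such as `"c"` fails inside the Python call rather than in the up-front check.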
inputRDD.mapPartitions { iter =>
  val context = TaskContext.get()
  val contextAwareIterator = new ContextAwareIterator(context, iter)
Good catch!
Thanks! merging to master.
What changes were proposed in this pull request?
Supports named arguments in scalar Python/Pandas UDFs.
For example:
or:
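A minimal sketch of what the two call styles might look like, assuming a Spark build that includes this change and an existing `SparkSession`. The names `add_scaled`, `test_udf`, and `demo` are illustrative, and `demo` is only defined here, not run.

```python
def add_scaled(a, b):
    # Plain Python body of the UDF; named arguments bind to the names a and b.
    return a + 10 * b


def demo(spark):
    """Show both call styles against an existing SparkSession (assumed)."""
    from pyspark.sql.functions import col, udf

    test_udf = udf(add_scaled, "int")
    spark.udf.register("test_udf", test_udf)

    # DataFrame API: keyword arguments, in any order.
    spark.range(2).select(test_udf(b=col("id") * 10, a=col("id"))).show()

    # SQL: the `name => value` syntax for named arguments.
    spark.sql("SELECT test_udf(b => id * 10, a => id) FROM range(2)").show()
```

In both calls the arguments are given in the order `b`, `a`, yet each value is bound to the parameter with the matching name.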
Why are the changes needed?
Now that named-argument support has been added (#41796, #42020), scalar Python/Pandas UDFs can support it as well.
Does this PR introduce any user-facing change?
Yes, named arguments will be available for scalar Python/Pandas UDFs.
How was this patch tested?
Added related tests.
Was this patch authored or co-authored using generative AI tooling?
No.