Spark: Support truncate in FunctionCatalog #5431
…ll input values in invoke
… on null on the special accesssors of certain row types that might not tolerate it
…ove separate function for validation
…r fewer than 2 and more than 2 arguments
…ccepted in storage partitioned joins impl
@aokolnychyi @rdblue I addressed all of your feedback. Please take a look when you get a chance. Also, on the subject of nullability for the Knowing that Spark isn't the best at keeping track of nullability, I think this is better and adheres to the contract laid out in But take a look at the commit that added nullability-checking on the width field if you'd like: ad5343f
Looks good to me. @aokolnychyi, do you want to take another look?
@rdblue, let me take a quick look now.
aokolnychyi
left a comment
Looks great!
Thanks, @kbendick! Could you cherry-pick this to 3.2?
Here's the PR for bucket. All the feedback from this PR was more or less applied there as well: #5513
Sure thing!
(cherry picked from commit 6a5051b)
This is an offshoot of #5305 and partially closes #5349.
Adds a `system.truncate` function that can be used in Spark SQL and can also be used as a `FunctionCatalog` function that can be turned into a transform for storage partitioned joins.

This also breaks the definition of `Truncate` out into utility functions inside of `TruncateUtil`. Because different usages validate input at different times, none of the functions in `TruncateUtil` validate their input; they assume that the calling code has already validated it. This allows the `Truncate` transforms to validate their width once (on instantiation) and the Spark `truncate` function to skip input validation for faster generated code.
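
For readers skimming the description, here is a minimal sketch of the "validate once in the caller, never in the utility" split. It is not the code from this PR: the class names `TruncateSketch` and `IntTruncator` are hypothetical, and the arithmetic is the integer truncate formula from the Iceberg table spec, `v - (((v % W) + W) % W)`.

```java
// Hypothetical stand-in for the utility/caller split described above.
final class TruncateSketch {
  private TruncateSketch() {}

  // Utility methods: no validation. Callers must guarantee width > 0.
  static int truncateInt(int width, int value) {
    // The extra "+ width) % width" keeps the remainder non-negative,
    // so negative values round toward negative infinity.
    return value - (((value % width) + width) % width);
  }

  static long truncateLong(int width, long value) {
    return value - (((value % width) + width) % width);
  }

  // A caller that validates its width once, at construction time,
  // and then calls the unchecked utility on every value.
  static final class IntTruncator {
    private final int width;

    IntTruncator(int width) {
      if (width <= 0) {
        throw new IllegalArgumentException(
            "Invalid truncate width: " + width + " (must be > 0)");
      }
      this.width = width;
    }

    int apply(int value) {
      return truncateInt(width, value);
    }
  }

  public static void main(String[] args) {
    IntTruncator byTen = new IntTruncator(10);
    System.out.println(byTen.apply(15)); // prints 10
    System.out.println(byTen.apply(-1)); // prints -10
  }
}
```

A caller on a hot path (such as generated code) could call `truncateInt` directly and skip the per-call check, which is the trade-off the description is pointing at.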