-
Notifications
You must be signed in to change notification settings - Fork 3k
Spark: Fix UnresolvedException for some filters in rewrite_data_files procedure #3757
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
cc: @RussellSpitzer , @jackye1995 |
.../v3.2/spark/src/main/java/org/apache/iceberg/spark/procedures/RewriteDataFilesProcedure.java
Outdated
Show resolved
Hide resolved
.../v3.2/spark/src/main/java/org/apache/iceberg/spark/procedures/RewriteDataFilesProcedure.java
Outdated
Show resolved
Hide resolved
|
Thanks. I think this is short enough to fix, could you directly do it across all versions affected? |
61d3333 to
53c02fe
Compare
@jackye1995 : I will do that once PR is approved. Else I might have to rework on multiple files if I get comments. |
53c02fe to
9f6693b
Compare
.../v3.2/spark/src/main/java/org/apache/iceberg/spark/procedures/RewriteDataFilesProcedure.java
Outdated
Show resolved
Hide resolved
...ark/src/main/scala/org/apache/spark/sql/execution/datasources/SparkExpressionConverter.scala
Show resolved
Hide resolved
...ark/src/main/scala/org/apache/spark/sql/execution/datasources/SparkExpressionConverter.scala
Show resolved
Hide resolved
...ark/src/main/scala/org/apache/spark/sql/execution/datasources/SparkExpressionConverter.scala
Show resolved
Hide resolved
9f6693b to
3df1684
Compare
|
@jackye1995 , @RussellSpitzer @huaxingao : I have addressed the comments. Please have a look at it again |
...ensions/src/test/java/org/apache/iceberg/spark/extensions/TestRewriteDataFilesProcedure.java
Show resolved
Hide resolved
...ensions/src/test/java/org/apache/iceberg/spark/extensions/TestRewriteDataFilesProcedure.java
Show resolved
Hide resolved
c8e2656 to
b19f0ba
Compare
|
@RussellSpitzer , @jackye1995 , @huaxingao : PR is ready. Please take a look at it again. |
...ensions/src/test/java/org/apache/iceberg/spark/extensions/TestRewriteDataFilesProcedure.java
Outdated
Show resolved
Hide resolved
...ensions/src/test/java/org/apache/iceberg/spark/extensions/TestRewriteDataFilesProcedure.java
Show resolved
Hide resolved
...ark/src/main/scala/org/apache/spark/sql/execution/datasources/SparkExpressionConverter.scala
Outdated
Show resolved
Hide resolved
b19f0ba to
31ba8db
Compare
31ba8db to
dd64c9a
Compare
jackye1995
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me, thanks for working on this! @ajantha-bhat please let me know if you would like to have also add fixes in the other version in this PR, or have separated PRs, up to you.
| private void insertData(int filesCount) { | ||
| ThreeColumnRecord record1 = new ThreeColumnRecord(1, "foo", "detail1"); | ||
| ThreeColumnRecord record2 = new ThreeColumnRecord(2, "bar", "detail2"); | ||
| ThreeColumnRecord record1 = new ThreeColumnRecord(1, "foo", null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed why was this changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while adding a new test cases for all possible filters, I wanted some null data so that my NOT NULL filter will not execute compaction (doesn't select data). C3 was never used in the testcases. So reused it with null data.
RussellSpitzer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One question about the null's in the test insert, but i'm good for merge as well.
I think separated is better as we are following for spark PRs. |
|
merged, thanks @ajantha-bhat for the fix, and thanks @RussellSpitzer and @rdblue for reviews! |
Some filters fails with exception mentioned in #3756.
Hence use the spark resolved expressions in rewrite_data_files procedure
Fixes #3756