Spark 3.1: Remove module #8661
LGTM.
A few follow-ups may still be needed:
- We can also remove the Spark 3.1 mentions from the docs (spark-writes.md, nessie.md).
- Some table properties that were only used for Spark 3.1 can be removed:
iceberg/core/src/main/java/org/apache/iceberg/TableProperties.java
Lines 356 to 367 in 1e52f2e
/**
 * @deprecated will be removed once Spark 3.1 support is dropped, the cardinality check is always
 *     performed starting from 0.13.0.
 */
@Deprecated
public static final String MERGE_CARDINALITY_CHECK_ENABLED =
    "write.merge.cardinality-check.enabled";

/**
 * @deprecated will be removed once Spark 3.1 support is dropped, the cardinality check is always
 *     performed starting from 0.13.0.
 */
@Deprecated
public static final boolean MERGE_CARDINALITY_CHECK_ENABLED_DEFAULT = true;
I can raise the follow-up PR if you are busy or would rather not handle this in the current PR.
Fokko left a comment
I agree that we should remove 3.1, but why not do it after 1.4? Am I missing any downsides besides having to upload a few more artifacts?
Some folks went through the trouble of backporting features to 3.1 (e.g. #8217). Also, as @ajantha-bhat mentioned, we need to clean up the docs as well: #7390 (review)
@Fokko, you are right that we did cherry-pick some changes to 3.1, but it is heavily behind in capabilities. There are some known limitations that may affect performance (e.g. the way we do the cardinality check and distribute data on write). Someone may test the brand new Iceberg jar with Spark 3.1 and think that's the best we offer. That said, it is minor. I am debating this too and have no problem doing 1.4 with this module in.
@ajantha-bhat, I'd appreciate it if you could help with the changes in core and docs in a follow-up PR.
@danielcweeks @RussellSpitzer @rdblue, any thoughts on removing or keeping Spark 3.1 before 1.4?
+1 on removing it now. I think any work on 3.1 is really just sunk cost. We don't gain that much, and we would basically be saying that we will continue supporting 3.1 for any 1.4.x releases that may need to occur.
I'm +1 on removing it as well (agree with @RussellSpitzer's points).
Agreed, let's remove now.
@ajantha-bhat, I've submitted a PR to quickly drop the table props so we can cut an RC ASAP. Could you look into other places that I missed?
Sorry for the delay. I'm in the India time zone.
This PR removes support for Spark 3.1 as discussed on the dev list here.