Skip to content

Conversation

@aokolnychyi
Copy link
Contributor

This PR removes support for Spark 3.1 as discussed on the dev list here.

@aokolnychyi
Copy link
Contributor Author

aokolnychyi commented Sep 27, 2023

Copy link
Member

@ajantha-bhat ajantha-bhat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Some more followups may be needed.

  • We can remove spark 3.1 mention from documents also. (spark-writes.md, nessie.md)
  • Some table properties can be removed which was only used in spark-3.1
    /**
    * @deprecated will be removed once Spark 3.1 support is dropped, the cardinality check is always
    * performed starting from 0.13.0.
    */
    @Deprecated
    public static final String MERGE_CARDINALITY_CHECK_ENABLED =
    "write.merge.cardinality-check.enabled";
    /**
    * @deprecated will be removed once Spark 3.1 support is dropped, the cardinality check is always
    * performed starting from 0.13.0.
    */
    @Deprecated public static final boolean MERGE_CARDINALITY_CHECK_ENABLED_DEFAULT = true;

I can raise the followup PR if you are busy (If you can't handle in this PR maybe)

Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should remove 3.1, but why not do it after 1.4? Am I missing any downsides besides having to upload a few more artifacts?

Some folks went through the trouble of backporting features to 3.1 (eg. #8217). Also, as @ajantha-bhat mentioned, we need to clean up the docs as well: #7390 (review)

nk1506 added a commit to nk1506/iceberg that referenced this pull request Sep 27, 2023
@aokolnychyi
Copy link
Contributor Author

aokolnychyi commented Sep 27, 2023

@Fokko, you are right that we did cherry-pick some changes to 3.1 but it is heavily behind in capabilities. There are some known limitations that may affect performance (e.g. the way we do cardinality check and distribute data on write, for example). Someone may test the brand new Iceberg jar with Spark 3.1 and think that's the best we offer.

That said, it is minor. I am debating this too and have no problem doing 1.4 with this module in.

@aokolnychyi
Copy link
Contributor Author

@ajantha-bhat, I'd appreciate if you could help with the changes in core and docs in a follow-up PR.

@aokolnychyi
Copy link
Contributor Author

@danielcweeks @RussellSpitzer @rdblue, any thoughts on removing or keeping Spark 3.1 before 1.4?

@RussellSpitzer
Copy link
Member

+1 On removing it now, I think any work on 3.1 is really just sunk costs. We don't gain that much and we are basically saying we will continue supporting 3.1 for any 1.4.x releases that may need to occur.

@danielcweeks
Copy link
Contributor

danielcweeks commented Sep 27, 2023

I'm +1 on removing it as well (agree with @RussellSpitzer points).

@aokolnychyi
Copy link
Contributor Author

Agreed, let's remove now.

@aokolnychyi aokolnychyi merged commit 3e37106 into apache:master Sep 27, 2023
@aokolnychyi
Copy link
Contributor Author

@ajantha-bhat, I've submitted a PR to quickly drop the table props to cut an RC asap. Could you look into other places that I missed?

@ajantha-bhat
Copy link
Member

ajantha-bhat commented Sep 28, 2023

@ajantha-bhat, I've submitted a PR to quickly drop the table props to cut an RC asap. Could you look into other places that I missed?

Sorry for the delay. I live in India timezone.
Here is the PR #8671
We can merge this and use it if there is another RC for some reason.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants