@panbingkun
Contributor

@panbingkun panbingkun commented May 18, 2023

What changes were proposed in this pull request?

This PR removes the workaround for HADOOP-16255.
https://issues.apache.org/jira/browse/HADOOP-16255

Why are the changes needed?

  • HADOOP-16255 was fixed in Hadoop releases after 3.1.2, and Spark supports only Hadoop >= 3.2.2 or >= 3.3.1, so the workaround is no longer needed.
  • Removing it makes the code cleaner.
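
The version reasoning above can be sketched as a small check. This is only an illustration, not code from the patch: the helper names (`parse_version`, `needs_workaround`) are hypothetical, the version numbers are taken from this description, and it assumes, as a simplification, that every release at or below 3.1.2 is affected and every later one is fixed.

```python
from typing import Tuple

# Hypothetical helpers for illustration; the version numbers come from the PR description.


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '3.3.1' into a comparable tuple (3, 3, 1)."""
    return tuple(int(part) for part in version.split("."))


# Assumption: HADOOP-16255 is fixed in every release after this one.
LAST_AFFECTED = parse_version("3.1.2")

# Minimum Hadoop versions that the description lists as supported by Spark.
SUPPORTED_MINIMUMS = ["3.2.2", "3.3.1"]


def needs_workaround(hadoop_version: str) -> bool:
    """Return True if this Hadoop version could still contain the bug."""
    return parse_version(hadoop_version) <= LAST_AFFECTED


# The workaround is only worth keeping if some supported version is affected.
workaround_still_needed = any(needs_workaround(v) for v in SUPPORTED_MINIMUMS)
```

Under these assumptions `workaround_still_needed` is false, because both supported minimums compare greater than 3.1.2.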

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass GitHub Actions (GA).

@LuciferYang
Contributor

cc @attilapiros @viirya @sunchao @pan3793 FYI

Member

@sunchao sunchao left a comment

LGTM - pending CI

@HyukjinKwon
Member

Hmm .. does that mean Hadoop 3.2.0 won't work with this?

@HyukjinKwon
Member

Hm, we currently build Spark w/ Hadoop 3.3.0 by default, so it might be fine, but I would also ask for some more looks, e.g., @srowen @mridulm @tgravescs @dongjoon-hyun

@sunchao
Member

sunchao commented May 18, 2023

Spark only works with Hadoop 3.3.1+ at the moment, as we discovered in #40847. We can potentially make it work with Hadoop 3.2.2+ if there's a workaround for https://issues.apache.org/jira/browse/SPARK-40039, which uses a Hadoop API that only exists in Hadoop 3.3.1+. It definitely won't work with Hadoop 3.2.0 due to some shaded-client-related issues.

@HyukjinKwon
Member

Thanks for the clarification. LGTM2

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM.

snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Jun 20, 2023
Closes apache#41209 from panbingkun/SPARK-43548.

Authored-by: panbingkun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>