@vinodkc vinodkc commented Nov 4, 2025

Backport of #52767 to the Spark 4.0 branch.
We upgraded Guava from 14.0.1 to 30+ in Spark 4.0. Guava 33.4.0, the version used in Spark 4, consists of two main packages:

  • com.google.common
  • com.google.thirdparty

Prior to this PR, only the com.google.common package was shaded in the spark-network-common jar; classes under com.google.thirdparty remained in that jar unshaded. This partial shading causes classloading conflicts and runtime errors when a downstream project depends on both Spark and its own version of Guava.

For example, calls to the Guava class com.google.common.net.InternetDomainName fail with the following error:

Caused by: java.lang.NoSuchFieldError: EXACT
        at com.google.common.net.InternetDomainName.findSuffixOfType(InternetDomainName.java:226)
        at com.google.common.net.InternetDomainName.publicSuffixIndex(InternetDomainName.java:185)
        at com.google.common.net.InternetDomainName.hasPublicSuffix(InternetDomainName.java:400)
        at com.eadx.Domain$.printDomainInfo(Domain.scala:16)
        at com.eadx.TestApp$.main(TestApp.scala:16)

Root Cause:
com.google.common.net.InternetDomainName uses classes from com.google.thirdparty.publicsuffix.
The classloader resolves com.google.common.net.InternetDomainName from the downstream project's Guava jar, while com.google.thirdparty.publicsuffix.PublicSuffixPatterns is loaded from the Guava classes bundled in Spark 4.x. The two classes come from different builds of Guava, which leads to binary incompatibility.

Example diagnostic:

InternetDomainName → guava-32.0.0-jre.jar
(target/.../guava-32.0.0-jre.jar)

PublicSuffixPatterns → spark-network-common_2.13-4.0.0.jar
(target/.../spark-network-common_2.13-4.0.0.jar)
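A diagnostic like the one above can be produced from inside the downstream application. The following is a minimal sketch, not code from this PR: the class name `ClassOrigin` and the method `originOf` are hypothetical, and it relies only on standard JDK calls (`Class.forName` and `ProtectionDomain.getCodeSource`). It reports the location each class was loaded from, or a note if the class is not on the classpath.

```java
import java.security.CodeSource;

public class ClassOrigin {
    /** Returns the location the named class was loaded from, or a short note if unknown. */
    public static String originOf(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class<?> cls = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Core JDK classes typically have no code source.
            return source == null
                ? "(no code source)"
                : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable: " + e + ")";
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "com.google.common.net.InternetDomainName",
            "com.google.thirdparty.publicsuffix.PublicSuffixPatterns"
        };
        for (String name : names) {
            System.out.println(name + " -> " + originOf(name));
        }
    }
}
```

If the two class names resolve to different jars, as in the diagnostic above, the application is mixing classes from two Guava builds.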

What changes were proposed in this pull request?

This PR ensures that the com.google.thirdparty package is also shaded and isolated under the sparkproject namespace in Spark, preventing downstream class conflicts and runtime errors.
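To illustrate what relocating a second package prefix changes, here is a small sketch of prefix-based class renaming. It is not the shading tool's implementation and is not taken from Spark's build: the class `RelocationSketch`, the method `relocate`, and both target prefixes (`org.sparkproject.guava` and `org.sparkproject.guava.thirdparty`) are assumptions chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RelocationSketch {
    /** Rewrites className with the first matching package-prefix rule; otherwise returns it unchanged. */
    public static String relocate(String className, Map<String, String> rules) {
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            String from = rule.getKey();
            // Match whole package segments only, so "com.google.commonx" is not relocated.
            if (className.equals(from) || className.startsWith(from + ".")) {
                return rule.getValue() + className.substring(from.length());
            }
        }
        return className;
    }

    public static void main(String[] args) {
        // Hypothetical rules: before the fix only com.google.common is relocated.
        Map<String, String> before = new LinkedHashMap<>();
        before.put("com.google.common", "org.sparkproject.guava");

        // After the fix, com.google.thirdparty is relocated as well (target prefix assumed).
        Map<String, String> after = new LinkedHashMap<>(before);
        after.put("com.google.thirdparty", "org.sparkproject.guava.thirdparty");

        String cls = "com.google.thirdparty.publicsuffix.PublicSuffixPatterns";
        System.out.println("before: " + relocate(cls, before));
        System.out.println("after:  " + relocate(cls, after));
    }
}
```

With only the first rule, the thirdparty class keeps its original name and can collide with a downstream Guava jar; with both rules, it moves out of the com.google namespace.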

Why are the changes needed?

These changes are necessary to prevent runtime errors and class conflicts for downstream projects that depend on both Spark and Guava, by restoring proper isolation of the shaded Guava classes in Spark.

Does this PR introduce any user-facing change?

No

How was this patch tested?

No new test cases were added; the existing unit tests (UT) and integration tests (IT) were used.
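Separately from the tests mentioned above, a reader could check a built jar for leftover unshaded classes. The sketch below is not part of this patch: the class `UnshadedEntryScan` and the method `findEntries` are hypothetical, and the jar path is supplied by the caller. It uses the standard `java.util.jar.JarFile` API to list class entries under a given path prefix.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class UnshadedEntryScan {
    /** Lists .class entries in the jar whose path starts with the given prefix. */
    public static List<String> findEntries(String jarPath, String pathPrefix) throws IOException {
        List<String> hits = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.startsWith(pathPrefix) && name.endsWith(".class")) {
                    hits.add(name);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: UnshadedEntryScan <path-to-jar>");
            return;
        }
        // Jar entries use '/' separators, so the package prefix is written as a path.
        List<String> hits = findEntries(args[0], "com/google/thirdparty/");
        System.out.println(hits.isEmpty()
            ? "no unshaded com.google.thirdparty classes found"
            : "unshaded classes: " + hits);
    }
}
```

Run against a spark-network-common jar, an empty result would suggest that no classes remain under the original com.google.thirdparty path.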

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the BUILD label Nov 4, 2025
@dongjoon-hyun dongjoon-hyun changed the title from "[SPARK-54049][BUILD] Backport to Spark 4.0. Shade com.google.thirdparty package to fix Guava class conflicts in spark 4.0" to "[SPARK-54049][BUILD][4.0] Shade com.google.thirdparty package to fix Guava class conflicts in spark 4.0" Nov 4, 2025

vinodkc commented Nov 4, 2025

retest this please

@HyukjinKwon
Member

Merged to branch-4.0.

HyukjinKwon pushed a commit that referenced this pull request Nov 4, 2025
…Guava class conflicts in spark 4.0

Closes #52869 from vinodkc/br_shade_guava_thirdparty_4.0.

Authored-by: vinodkc <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
@HyukjinKwon HyukjinKwon closed this Nov 4, 2025