-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-34762][BUILD] Fix the build failure with Scala 2.13 which is related to commons-* with better solution #31880
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pom.xml
Outdated
| <commons.collections.version>3.2.2</commons.collections.version> | ||
| <scala.version>2.12.10</scala.version> | ||
| <scala-2.12.version>2.12.10</scala-2.12.version> | ||
| <scala-2.13.version>2.13.5</scala-2.13.version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reviewers:
I confirmed this solution works even for branch-3.1 but branch-3.1 uses 2.13.4 rather than 2.13.5 so I'll open a backport PR after this PR merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, why do we need both versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to have both versions to avoid issues like SPARK-34774.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite see why it is necessary for this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the advantage or need - surely the build has one scala version? otherwise you have to flip even which property is referenced in the whole build?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first time I open this PR, <scala.version>2.12.10</scala.version> is absent from the profile <id>scala-2.12</id> so, if we change the version with change-scala-version.sh 2.13, there will be no more 2.12.10 because <scala.version>2.12.10</scala.version> is overwritten with <scala.version>2.13.5</scala.version>.
Then, change-scala-version.sh 2.12 will fail to modify pom.xml.
If we have both versions in properties (<scala-2.12.version> and <scala-2.13-version>) which will not be overwritten by change-scala-version.sh, it's easy to choose scala.version to change.
But, as I mentioned above, now we don't need such properties.
|
cc: @HyukjinKwon |
|
After I confirm that GA and Jenkins pass, I'll merge this. |
|
Test build #136209 has finished for PR 31880 at commit
|
|
retest this please. |
|
Test build #136218 has finished for PR 31880 at commit
|
| # This is a workaround for SPARK-34762. | ||
| ESCAPED_TO_VERSION=$(echo $TO_VERSION | sed -n "s/\./\\\\./gp") | ||
| SCALA_VERSION=$(sed -n "/<id>scala-$ESCAPED_TO_VERSION<\/id>/,/<\/profile>/ \ | ||
| s;^.*<scala\.version>\(.*\)</scala\.version>.*$;\1;p" pom.xml) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine now(?) we're avoiding maven help that caused side effect. It makes the script more robust anyway ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean we can go with this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't see why this part is necessary. Can we fix this without breaking other parts of the build script that depend on scala.version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sbt (or courier?) seems to fail to resolve dependency if the pom file for a dependency is in ~/.m2 but jar file is not. I don't know the reason.
For master branch, before #31862, change-scala-version.sh run mvn help:evaluate and it downloads commons-cli-1.2.pom but doesn't commons-cli-1.2.jar.
$ ls -R ~/.m2/repository/commons-cli/commons-cli/
/home/kou/.m2/repository/commons-cli/commons-cli/:
1.2
/home/kou/.m2/repository/commons-cli/commons-cli/1.2:
_remote.repositories commons-cli-1.2.pom commons-cli-1.2.pom.sha1
You can also confirm with mvn -X help:evaluate.
So, I resolved by getting commons-cli-1.2.jarusing mvn dependency:get in #31862.
For branch-3.1, mvn help:evaluate also downloads commons-cli-1.2.jar but it's resolved this part by #31862.
But mvn dependency:get downloads commons-io-2.6.pom though it doesn't download commons-io-2.6.jar.
ls -R ~/.m2/repository/commons-io/commons-io/
/home/kou/.m2/repository/commons-io/commons-io/:
2.4 2.5 2.6
/home/kou/.m2/repository/commons-io/commons-io/2.4:
_remote.repositories commons-io-2.4.pom commons-io-2.4.pom.sha1
/home/kou/.m2/repository/commons-io/commons-io/2.5:
_remote.repositories commons-io-2.5.jar commons-io-2.5.jar.sha1 commons-io-2.5.pom commons-io-2.5.pom.sha1
/home/kou/.m2/repository/commons-io/commons-io/2.6:
_remote.repositories commons-io-2.6.pom commons-io-2.6.pom.sha1
I understand branch-3.1 depends on commons-io-2.5 but, in fact, if we manually delete commons-io/comons-io-2.6 before sbt, build successfully finishes.
It's also true for master that commons-io-2.6.pom is present but commons-io-2.6.jar is absent.
But there is one difference between master and branch-3.1.
master depends on commons-io-2.8 which is newer version than commons-io-2.6 while branch-3.1 depends on commons-io-2.5 which is older than commons-io-2.6.
So I guess this affects build failure for branch-3.1 while it succeeds for master.
Anyway, if we don't use maven plugins in change-scala-version.sh, this problem can be resolved easily.
Or, do you have a better solution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we solve this by updating commons-io in older branches? that would be fine too IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought what you think too and it can resolve this issue for the time being.
But I'm afraid that this build failure happens again in the future.
In this case, only commons-cli and comons-io matters but, actually, help and dependency downloads not only them.
I confirmed that help downloads pom files but not jar files for 300+ dependencies.
If we use newer maven or upgraded plugins and Spark and those plugins have a comondependency but plugins use newer version, this problem can happen again.
My worry might be unnecessary or you think we just just fix this problem when it happens again, I'll close this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear can we fix this by using newer Maven versions or newer plugins, or newer versions of the dependencies? I think that's fine, even if it means it pulls in a lot of stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can fix this by using newer Maven or newer plugins.
Spark needs to use newer version of dependencies than what plugins use.
This problem can happen if all the following condition is true.
- Spark and a plugin have a common direct/indirect dependcy.
- A plugin uses newer or the same version of the dependency.
- The plugin downloads
pombut notjarfor the dependency. - Build with
sbt(or may be the casecourieris used) under the condition that the pom is present but the jar is absent..
One example is dependency-plugin and commons-io. Both Spark and dependency-plugin depends on commons-io (dependency-plugin seems to depend on it indirectly).
And branch-3.1 depends on commons-io:2.4, while dependency-plugin depends on newer commons-io:2.6.
When mvn dependency:get runs, pom is downloaded but doesn't jar for commons-io:2.6.
Under this condition, if we build with sbt, sbt or courier doesn't download the dependent jar, leading this issue.
Newer Maven and newer plugins can depends on newer version of the common dependency than what Spark depends on. So I don't think we can't fix this issue using newer Maven or newer plugins.
|
Hi, All. To isolate the release branch (branch-3.1) from Scala 2.13, I removed Scala-2.13 Build GitHub Action job from branch-3.1 completely. From now, we are able to focus on |
|
Now GA for |
What changes were proposed in this pull request?
This PR fixes the issue that build fails with Scala 2.13 and sbt with better solution.
The issue was resolved in #31862 for
masterbranch but it's still inbranch-3.1.The reason for
branch-3.1ismvn helpinchange-scala-version.shdownloads the POM file ofcommons-ioand the JAR file is not downloaded.For
masterbranch, it's notcommons-iobutcommons-cli.According to the result, the affected library seems to be subject to various factors but one factor is the maven help plugin.
So, I modified
change-scala-version.shto change the way to fetchscala.versionto simply usesedrather thanmvn.Why are the changes needed?
To fix the issue with more proper way.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
I confirmed build succeed with the following procedure.