[SPARK-44246][CONNECT][FOLLOW-UP] Miscellaneous cleanups for Spark Connect Jar/Classfile Isolation #41789
Conversation
```diff
  // Empty environment variables
  envVars = Maps.newHashMap(),
- pythonIncludes = pythonIncludes,
+ pythonIncludes = sessionHolder.artifactManager.getSparkConnectPythonIncludes.asJava,
```
@HyukjinKwon With directly fetching the Python Includes, I've removed the private lazy val pythonIncludes and SessionHolder#withSessionBasedPythonPaths since we don't need to propagate them via confs anymore
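The pattern here, reading per-session state straight from a session-scoped manager rather than round-tripping it through the Spark conf, can be sketched in plain Scala. The class and member names below are illustrative assumptions, not Spark's actual API:

```scala
import scala.jdk.CollectionConverters._

// Hypothetical stand-ins for session-scoped artifact state; names are
// assumptions for illustration only.
final class SessionArtifacts(val pythonIncludes: Seq[String])

final class SessionHolder(val artifactManager: SessionArtifacts) {
  // Callers fetch the includes directly from the session; nothing needs to
  // be serialized into a global configuration and read back later.
  def pythonIncludesJava: java.util.List[String] =
    artifactManager.pythonIncludes.asJava
}

val holder = new SessionHolder(new SessionArtifacts(Seq("deps.zip", "helpers.egg")))
```

Because the includes never leave the session object, there is no conf key to keep in sync and nothing to clean up when the session ends.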
Gotcha. This is exactly the change I planned to make this week :-). Thank you for doing this here.

Merged to master.
```scala
import org.apache.spark.{JobArtifactSet, LocalSparkContext, SparkConf, SparkContext, SparkFunSuite}
import org.apache.spark.util.Utils

class ClassLoaderIsolationSuite extends SparkFunSuite with LocalSparkContext {
```
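The idea the suite exercises, that an isolated classloader cannot see classes from the application classpath, can be demonstrated with a plain-JDK sketch. This illustrates the concept only and is not the suite's actual code:

```scala
import java.net.URLClassLoader

// A loader with a null parent delegates only to the bootstrap classloader,
// so application-classpath classes (such as the Scala library) are
// invisible to it.
val isolated = new URLClassLoader(Array.empty[java.net.URL], null)

// Visible through the default application classloader.
val visibleToApp = Class.forName("scala.Option") != null

// Invisible through the isolated loader.
val visibleToIsolated =
  try { isolated.loadClass("scala.Option"); true }
  catch { case _: ClassNotFoundException => false }
```

This is the same mechanism behind the `NoClassDefFoundError` discussed below: a class compiled against one Scala library fails to resolve when the loader it runs under cannot supply the expected classes.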
@vicennial The new test failed with Scala 2.13:
Perhaps we should upload jars for both TestHelloV2 and TestHelloV3 compiled with Scala 2.13 at the same time?

On the other hand, I think we should provide the source code of `TestHelloV2` and `TestHelloV3` (such as in comments?). Otherwise, when upgrading to a new major version of Scala (for example, Scala 3.x), we will not know how to regenerate the jar packages for testing.

Sorry, I found the source code in comments.
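For reference, a class in such a test jar can be as small as the following sketch. The class and method names here are assumptions for illustration; the authoritative source is the one kept in comments in the Spark repository:

```scala
// Hypothetical sketch of a versioned test-fixture class. Compiling this with
// each supported Scala version and packaging it as TestHelloV2_<version>.jar
// would regenerate the fixture.
class Hello extends Serializable {
  def msg: String = "Hello, this is V2 of the test class"
}
```

Keeping the source next to the binary means a future Scala upgrade only requires recompiling and re-jarring, not reverse-engineering the class file.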
Attempted fix: #41852
Thanks @LuciferYang
… with Scala 2.13

### What changes were proposed in this pull request?
The main changes of this pr are as follows:
1. Rename `TestHelloV2.jar` and `TestHelloV3.jar` added in #41789 to `TestHelloV2_2.12.jar` and `TestHelloV3_2.12.jar`
2. Add corresponding `TestHelloV2_2.13.jar` and `TestHelloV3_2.13.jar`, compiled with Scala 2.13
3. Make `ClassLoaderIsolationSuite` use the correct jar in testing

### Why are the changes needed?
Make the `ClassLoaderIsolationSuite` test pass with Scala 2.13. The Scala 2.13 daily test failed after #41789:
- https://github.com/apache/spark/actions/runs/5447771717/jobs/9910185372

```
[info] - Executor classloader isolation with JobArtifactSet *** FAILED *** (83 milliseconds)
[info]   org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (localhost executor driver): java.lang.NoClassDefFoundError: scala/Serializable
[info]   at java.lang.ClassLoader.defineClass1(Native Method)
[info]   at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
[info]   at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
[info]   at java.net.URLClassLoader.defineClass(URLClassLoader.java:473)
[info]   at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
[info]   at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
[info]   at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
[info]   at java.security.AccessController.doPrivileged(Native Method)
[info]   at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
[info]   at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
[info]   at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
[info]   at java.lang.Class.forName0(Native Method)
[info]   at java.lang.Class.forName(Class.java:348)
[info]   at org.apache.spark.util.SparkClassUtils.classForName(SparkClassUtils.scala:35)
[info]   at org.apache.spark.util.SparkClassUtils.classForName$(SparkClassUtils.scala:30)
[info]   at org.apache.spark.util.Utils$.classForName(Utils.scala:94)
[info]   at org.apache.spark.executor.ClassLoaderIsolationSuite.$anonfun$new$3(ClassLoaderIsolationSuite.scala:53)
[info]   at scala.runtime.java8.JFunction1$mcVI$sp.apply(JFunction1$mcVI$sp.scala:18)
[info]   at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:576)
[info]   at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:574)
[info]   at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
[info]   at org.apache.spark.rdd.RDD.$anonfun$foreach$2(RDD.scala:1028)
[info]   at org.apache.spark.rdd.RDD.$anonfun$foreach$2$adapted(RDD.scala:1028)
[info]   at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2406)
[info]   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
[info]   at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
[info]   at org.apache.spark.scheduler.Task.run(Task.scala:141)
[info]   at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:593)
[info]   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1478)
[info]   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:596)
[info]   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[info]   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[info]   at java.lang.Thread.run(Thread.java:750)
[info] Caused by: java.lang.ClassNotFoundException: scala.Serializable
[info]   at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
[info]   at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
[info]   at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
[info]   ... 33 more
```

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions
- Manual check:

**Scala 2.12**
```
build/sbt "core/testOnly *ClassLoaderIsolationSuite"
```
```
[info] ClassLoaderIsolationSuite:
[info] - Executor classloader isolation with JobArtifactSet (1 second, 394 milliseconds)
[info] Run completed in 2 seconds, 437 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
```

**Scala 2.13**
```
dev/change-scala-version.sh 2.13
build/sbt "core/testOnly *ClassLoaderIsolationSuite" -Pscala-2.13
```
```
[info] ClassLoaderIsolationSuite:
[info] - Executor classloader isolation with JobArtifactSet (1 second, 355 milliseconds)
[info] Run completed in 2 seconds, 264 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
```

Closes #41852 from LuciferYang/SPARK-44297.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: yangjie01 <[email protected]>
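The renaming scheme above implies the suite must pick the jar matching the build's Scala binary version. A minimal sketch of that selection, assuming a hypothetical helper rather than the suite's actual code:

```scala
// Derive the Scala binary version (e.g. "2.13") from the running standard
// library and build the matching test-jar file name.
val scalaBinaryVersion: String =
  scala.util.Properties.versionNumberString.split('.').take(2).mkString(".")

def testJarName(base: String): String = s"${base}_$scalaBinaryVersion.jar"
```

With this, `testJarName("TestHelloV2")` resolves to `TestHelloV2_2.12.jar` or `TestHelloV2_2.13.jar` depending on which Scala version the build is running.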
…ry files

### What changes were proposed in this pull request?
The purpose of this pr is to clean up the binary files used to assist with Scala 2.12 testing. They include:
- `core/src/test/resources/TestHelloV3_2.12.jar` and `core/src/test/resources/TestHelloV2_2.12.jar` added by SPARK-44246 (#41789)
- `connector/connect/client/jvm/src/test/resources/udf2.12` and `connector/connect/client/jvm/src/test/resources/udf2.12.jar` added by SPARK-43744 (#42069)
- `connector/connect/client/jvm/src/test/resources/TestHelloV2_2.12.jar` added by SPARK-44293 (#41844)
- `sql/hive/src/test/resources/regression-test-SPARK-8489/test-2.12.jar` added by SPARK-25304 (#22308)

### Why are the changes needed?
Spark 4.0 no longer supports Scala 2.12.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #43106 from LuciferYang/SPARK-45321.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: yangjie01 <[email protected]>
### What changes were proposed in this pull request?
This PR is a follow-up of #41701 and addresses the comments mentioned there. The summary is:
- `pythonIncludes` are fetched directly from the `ArtifactManager` via `SessionHolder` instead of being propagated through the Spark conf.
- `SessionHolder#withContext` is renamed to `SessionHolder#withContextClassLoader` to decrease ambiguity.
- Tests are added to `ArtifactManagerSuite` and a new suite, `ClassLoaderIsolationSuite`.

### Why are the changes needed?
General follow-ups from here.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
New test suite and unit tests.
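The `withContextClassLoader` rename mentioned above suggests a loan-pattern helper. The following is a minimal self-contained sketch of that shape, an illustration rather than Spark's actual implementation:

```scala
// Run `body` with `cl` installed as the thread's context classloader,
// restoring the previous loader afterwards even if `body` throws.
def withContextClassLoader[T](cl: ClassLoader)(body: => T): T = {
  val thread = Thread.currentThread()
  val previous = thread.getContextClassLoader
  thread.setContextClassLoader(cl)
  try body
  finally thread.setContextClassLoader(previous)
}

// Usage: any class lookup inside the block sees `marker` as the context
// classloader; outside the block, the original loader is back in place.
val marker = new java.net.URLClassLoader(Array.empty[java.net.URL], null)
val seenInside = withContextClassLoader(marker) {
  Thread.currentThread().getContextClassLoader
}
```

The explicit name makes clear that the helper scopes a classloader swap, which is easy to misread from a generic `withContext`.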