
Conversation

@hvanhovell
Contributor

What changes were proposed in this pull request?

This PR moves SparkBuildInfo and the code that generates its properties to common/utils.

Why are the changes needed?

We need SparkBuildInfo in the Connect Scala client, and we are removing Connect's dependency on core.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing tests.
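
For context, SparkBuildInfo is essentially a small object that loads a build-time-generated spark-version-info.properties resource from the classpath and exposes its values. A minimal sketch of that pattern in Scala (the object name, property keys, and error handling here are illustrative assumptions, not the exact code being moved):

import java.util.Properties

// Illustrative sketch only: load build metadata from a resource that the build
// step writes into the jar. The real SparkBuildInfo may differ in details.
object BuildInfoSketch {
  private val props: Properties = {
    val in = Option(
      Thread.currentThread().getContextClassLoader
        .getResourceAsStream("spark-version-info.properties"))
      .getOrElse(throw new IllegalStateException(
        "Could not find spark-version-info.properties"))
    try {
      val p = new Properties()
      p.load(in)
      p
    } finally {
      in.close()
    }
  }

  val sparkVersion: String = props.getProperty("version", "<unknown>")
  val sparkBranch: String = props.getProperty("branch", "<unknown>")
  val sparkRevision: String = props.getProperty("revision", "<unknown>")
}

Because the properties file is looked up on the classpath at class-initialization time, it has to end up inside whichever jar now hosts the class; this is exactly what the packaging problem reported below is about.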

@hvanhovell
Contributor Author

cc @amaliujia

@amaliujia
Contributor

The CI failures don't look relevant; the PySpark jobs seem to be having unrelated issues.

@LuciferYang
Contributor

There may be an issue.

Before this PR, packaging with Maven produced a file named spark-version-info.properties inside spark-core_2.12-4.0.0-SNAPSHOT.jar.

After this PR, I could not find it in either spark-core_2.12-4.0.0-SNAPSHOT.jar or spark-common-utils_2.12-4.0.0-SNAPSHOT.jar.
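
For reference, one quick way to reproduce this check from a Scala REPL (the jar path below is only an example of where the packaged jar might sit in a distribution):

import java.util.jar.JarFile

// Hypothetical path; point it at the jar you want to inspect.
val jarPath = "jars/spark-common-utils_2.12-4.0.0-SNAPSHOT.jar"
val jar = new JarFile(jarPath)
// getEntry returns null when the entry is not present in the jar.
val found = jar.getEntry("spark-version-info.properties") != null
jar.close()
println(s"spark-version-info.properties present: $found")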

@LuciferYang
Contributor

I built a new distribution with this PR (dev/make-distribution.sh --tgz), then ran bin/spark-shell --master local:

bin/spark-shell --master local
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
org.apache.spark.SparkException: Could not find spark-version-info.properties
  at org.apache.spark.SparkBuildInfo$.<init>(SparkBuildInfo.scala:35)
  at org.apache.spark.SparkBuildInfo$.<clinit>(SparkBuildInfo.scala)
  at org.apache.spark.package$.<init>(package.scala:46)
  at org.apache.spark.package$.<clinit>(package.scala)
  at org.apache.spark.SparkContext.$anonfun$new$1(SparkContext.scala:197)
  at org.apache.spark.internal.Logging.logInfo(Logging.scala:60)
  at org.apache.spark.internal.Logging.logInfo$(Logging.scala:59)
  at org.apache.spark.SparkContext.logInfo(SparkContext.scala:86)
  at org.apache.spark.SparkContext.<init>(SparkContext.scala:197)
  at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2888)
  at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:1099)
  at scala.Option.getOrElse(Option.scala:189)
  at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:1093)
  at org.apache.spark.repl.Main$.createSparkSession(Main.scala:112)
  ... 55 elided
<console>:14: error: not found: value spark
       import spark.implicits._
              ^
<console>:14: error: not found: value spark
       import spark.sql
              ^
Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.package$
	at org.apache.spark.repl.SparkILoop.printWelcome(SparkILoop.scala:100)
	at org.apache.spark.repl.SparkILoop.$anonfun$process$10(SparkILoop.scala:222)
	at org.apache.spark.repl.SparkILoop.withSuppressedSettings$1(SparkILoop.scala:189)
	at org.apache.spark.repl.SparkILoop.startup$1(SparkILoop.scala:201)
	at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:236)
	at org.apache.spark.repl.Main$.doMain(Main.scala:78)
	at org.apache.spark.repl.Main$.main(Main.scala:58)
	at org.apache.spark.repl.Main.main(Main.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1029)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1120)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1129)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

<!-- Execute the shell script to generate the spark build information. -->
<target>
<exec executable="${shell}">
<arg value="${project.basedir}/../build/${spark-build-info-script}"/>
Suggested change:
- <arg value="${project.basedir}/../build/${spark-build-info-script}"/>
+ <arg value="${project.basedir}/../../build/${spark-build-info-script}"/>
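
Presumably this is needed because the pom that invokes the script now lives one directory deeper (common/utils/ rather than core/), so reaching the top-level build/ directory from ${project.basedir} requires one extra ../.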


@hvanhovell with this change, bin/spark-shell --master local can run successfully.

@hvanhovell
Contributor Author

Merging this. Failures are unrelated.

hvanhovell added a commit that referenced this pull request Jul 26, 2023
### What changes were proposed in this pull request?
This PR moves `SparkBuildInfo` and the code that generates its properties to `common/utils`.

### Why are the changes needed?
We need `SparkBuildInfo` in the Connect Scala client, and we are removing Connect's dependency on `core`.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Existing tests.

Closes #42133 from hvanhovell/SPARK-44530.

Authored-by: Herman van Hovell <[email protected]>
Signed-off-by: Herman van Hovell <[email protected]>
(cherry picked from commit 3518799)
Signed-off-by: Herman van Hovell <[email protected]>