@junoha junoha commented Oct 5, 2023

Issue #, if available:

A NoClassDefFoundError occurs when viewing a streaming job with the latest Docker image. The Spark UI container appears to use Java 21 rather than Java 8.

2023-10-04 15:14:18 WARN server.HttpChannel: /history/spark-application-1696388583270/jobs/
org.sparkproject.guava.util.concurrent.ExecutionError: java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.catalyst.util.DateTimeUtils$
	at org.sparkproject.guava.cache.LocalCache$Segment.get(LocalCache.java:2261)
	at org.sparkproject.guava.cache.LocalCache.get(LocalCache.java:4000)
	at org.sparkproject.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
	at org.sparkproject.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
	at org.apache.spark.deploy.history.ApplicationCache.get(ApplicationCache.scala:89)
	at org.apache.spark.deploy.history.ApplicationCache.withSparkUI(ApplicationCache.scala:101)
	at org.apache.spark.deploy.history.HistoryServer.org$apache$spark$deploy$history$HistoryServer$$loadAppUi(HistoryServer.scala:256)
	at org.apache.spark.deploy.history.HistoryServer$$anon$1.doGet(HistoryServer.scala:104)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:503)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:590)
	at org.sparkproject.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
	at org.sparkproject.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1631)
	at org.apache.spark.ui.HttpSecurityFilter.doFilter(HttpSecurityFilter.scala:95)
	at org.sparkproject.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
	at org.sparkproject.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
	at org.sparkproject.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
	at org.sparkproject.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
	at org.sparkproject.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440)
	at org.sparkproject.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
	at org.sparkproject.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
	at org.sparkproject.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
	at org.sparkproject.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1355)
	at org.sparkproject.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
	at org.sparkproject.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:763)
	at org.sparkproject.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:234)
	at org.sparkproject.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
	at org.sparkproject.jetty.server.Server.handle(Server.java:516)
	at org.sparkproject.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:487)
	at org.sparkproject.jetty.server.HttpChannel.dispatch(HttpChannel.java:732)
	at org.sparkproject.jetty.server.HttpChannel.handle(HttpChannel.java:479)
	at org.sparkproject.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
	at org.sparkproject.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
	at org.sparkproject.jetty.io.FillInterest.fillable(FillInterest.java:105)
	at org.sparkproject.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
	at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
	at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
	at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
	at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
	at org.sparkproject.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409)
	at org.sparkproject.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
	at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
	at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.catalyst.util.DateTimeUtils$
	at org.apache.spark.sql.streaming.ui.UIUtils$$anon$1.initialValue(UIUtils.scala:68)
	at org.apache.spark.sql.streaming.ui.UIUtils$$anon$1.initialValue(UIUtils.scala:65)
	at java.base/java.lang.ThreadLocal.setInitialValue(ThreadLocal.java:225)
	at java.base/java.lang.ThreadLocal.get(ThreadLocal.java:194)
	at java.base/java.lang.ThreadLocal.get(ThreadLocal.java:172)
	at org.apache.spark.sql.streaming.ui.UIUtils$.parseProgressTimestamp(UIUtils.scala:74)
	at org.apache.spark.sql.streaming.ui.StreamingQueryStatusListener.onQueryStarted(StreamingQueryStatusListener.scala:74)
	at org.apache.spark.sql.execution.streaming.StreamingQueryListenerBus.doPostEvent(StreamingQueryListenerBus.scala:131)
	at org.apache.spark.sql.execution.streaming.StreamingQueryListenerBus.doPostEvent(StreamingQueryListenerBus.scala:43)
	at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:117)
	at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:101)
	at org.apache.spark.sql.execution.streaming.StreamingQueryListenerBus.postToAll(StreamingQueryListenerBus.scala:88)
	at org.apache.spark.sql.execution.streaming.StreamingQueryListenerBus.onOtherEvent(StreamingQueryListenerBus.scala:108)
	at org.apache.spark.scheduler.SparkListenerBus.doPostEvent(SparkListenerBus.scala:100)
	at org.apache.spark.scheduler.SparkListenerBus.doPostEvent$(SparkListenerBus.scala:28)
	at org.apache.spark.scheduler.ReplayListenerBus.doPostEvent(ReplayListenerBus.scala:35)
	at org.apache.spark.scheduler.ReplayListenerBus.doPostEvent(ReplayListenerBus.scala:35)
	at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:117)
	at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:101)
	at org.apache.spark.scheduler.ReplayListenerBus.postToAll(ReplayListenerBus.scala:35)
	at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:89)
	at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:60)
	at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$parseAppEventLogs$3(FsHistoryProvider.scala:1145)
	at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$parseAppEventLogs$3$adapted(FsHistoryProvider.scala:1143)
	at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2764)
	at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$parseAppEventLogs$1(FsHistoryProvider.scala:1143)
	at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$parseAppEventLogs$1$adapted(FsHistoryProvider.scala:1141)
	at scala.collection.immutable.List.foreach(List.scala:431)
	at org.apache.spark.deploy.history.FsHistoryProvider.parseAppEventLogs(FsHistoryProvider.scala:1141)
	at org.apache.spark.deploy.history.FsHistoryProvider.rebuildAppStore(FsHistoryProvider.scala:1122)
	at org.apache.spark.deploy.history.FsHistoryProvider.createInMemoryStore(FsHistoryProvider.scala:1360)
	at org.apache.spark.deploy.history.FsHistoryProvider.getAppUI(FsHistoryProvider.scala:378)
	at org.apache.spark.deploy.history.HistoryServer.getAppUI(HistoryServer.scala:199)
	at org.apache.spark.deploy.history.ApplicationCache.$anonfun$loadApplicationEntry$2(ApplicationCache.scala:164)
	at org.apache.spark.deploy.history.ApplicationCache.time(ApplicationCache.scala:135)
	at org.apache.spark.deploy.history.ApplicationCache.org$apache$spark$deploy$history$ApplicationCache$$loadApplicationEntry(ApplicationCache.scala:162)
	at org.apache.spark.deploy.history.ApplicationCache$$anon$1.load(ApplicationCache.scala:56)
	at org.apache.spark.deploy.history.ApplicationCache$$anon$1.load(ApplicationCache.scala:52)
	at org.sparkproject.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
	at org.sparkproject.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
	at org.sparkproject.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
	at org.sparkproject.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
	... 41 more
Caused by: java.lang.ExceptionInInitializerError: Exception java.lang.ExceptionInInitializerError [in thread "HistoryServerUI-58"]
	at org.apache.spark.unsafe.types.UTF8String.fromBytes(UTF8String.java:109)
	at org.apache.spark.unsafe.types.UTF8String.fromString(UTF8String.java:139)
	at org.apache.spark.unsafe.types.UTF8String.<clinit>(UTF8String.java:99)
	at org.apache.spark.sql.catalyst.util.DateTimeUtils$.<init>(DateTimeUtils.scala:243)
	at org.apache.spark.sql.catalyst.util.DateTimeUtils$.<clinit>(DateTimeUtils.scala)
	... 83 more

Description of changes:

Add the JAVA_HOME environment variable to the Dockerfile so that the Spark UI container uses the Java 8 runtime.
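A minimal sketch of the change (assumed form only; the Corretto 8 path matches the container listing below, but the surrounding Dockerfile content is omitted):

```dockerfile
# Pin the Spark UI container to the Java 8 (Corretto) runtime so that
# spark-class resolves ${JAVA_HOME}/bin/java instead of the default Java 21.
ENV JAVA_HOME=/usr/lib/jvm/java-1.8.0-amazon-corretto.x86_64
ENV PATH="${JAVA_HOME}/bin:${PATH}"
```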

# Inside of Spark UI container

## Check java_sdk alternatives
bash-5.2# ls -l /etc/alternatives/ | grep "java_sdk"
lrwxrwxrwx 1 root root 46 Sep 22 07:36 java_sdk -> /usr/lib/jvm/java-1.8.0-amazon-corretto.x86_64
lrwxrwxrwx 1 root root 46 Sep 22 07:36 java_sdk_1.8.0 -> /usr/lib/jvm/java-1.8.0-amazon-corretto.x86_64
lrwxrwxrwx 1 root root 46 Sep 22 07:36 java_sdk_1.8.0_openjdk -> /usr/lib/jvm/java-1.8.0-amazon-corretto.x86_64

## java_sdk_1.8.0_openjdk points to Amazon Corretto 8
bash-5.2# ls -l /usr/lib/jvm/
total 8
lrwxrwxrwx 1 root root   26 Sep 22 07:36 java -> /etc/alternatives/java_sdk
lrwxrwxrwx 1 root root   32 Sep 22 07:36 java-1.8.0 -> /etc/alternatives/java_sdk_1.8.0
drwxr-xr-x 9 root root 4096 Sep 22 07:36 java-1.8.0-amazon-corretto.x86_64
lrwxrwxrwx 1 root root   40 Sep 22 07:36 java-1.8.0-openjdk -> /etc/alternatives/java_sdk_1.8.0_openjdk
lrwxrwxrwx 1 root root   41 Sep 22 07:36 java-21-amazon-corretto -> /etc/alternatives/java-21-amazon-corretto
drwxr-xr-x 7 root root 4096 Sep 22 07:36 java-21-amazon-corretto.x86_64
lrwxrwxrwx 1 root root   21 Sep 22 07:36 jre -> /etc/alternatives/jre
lrwxrwxrwx 1 root root   27 Sep 22 07:36 jre-1.8.0 -> /etc/alternatives/jre_1.8.0
lrwxrwxrwx 1 root root   35 Sep 22 07:36 jre-1.8.0-openjdk -> /etc/alternatives/jre_1.8.0_openjdk
lrwxrwxrwx 1 root root   24 Sep 22 07:36 jre-21 -> /etc/alternatives/jre_21
lrwxrwxrwx 1 root root   32 Sep 22 07:36 jre-21-openjdk -> /etc/alternatives/jre_21_openjdk
lrwxrwxrwx 1 root root   29 Sep 22 07:36 jre-openjdk -> /etc/alternatives/jre_openjdk
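As a sanity check, the symlink chain above can be resolved programmatically. A hedged sketch (the helper name is made up; the path in the usage comment is the container's, per the listing above):

```shell
# Hypothetical helper: canonicalize a JAVA_HOME candidate through the
# /etc/alternatives-style symlink chain and test whether it lands on a
# Java 8 install (matched loosely by the "1.8.0" path component).
is_java8_home() {
  resolved=$(readlink -f "$1") || return 1
  case "${resolved}" in
    *1.8.0*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Inside the container, /usr/lib/jvm/java resolves via /etc/alternatives
# to java-1.8.0-amazon-corretto.x86_64, so this would succeed there:
# is_java8_home /usr/lib/jvm/java && echo "Java 8"
```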

## spark-class checks JAVA_HOME for Spark runtime
bash-5.2# grep -E "(JAVA_HOME|RUNNER)" /opt/spark/bin/spark-class
if [ -n "${JAVA_HOME}" ]; then
  RUNNER="${JAVA_HOME}/bin/java"
    RUNNER="java"
    echo "JAVA_HOME is not set" >&2
  "$RUNNER" -Xmx128m $SPARK_LAUNCHER_OPTS -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main "$@"

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@junoha junoha marked this pull request as ready for review October 5, 2023 03:29
@moomindani moomindani merged commit c5f2588 into aws-samples:master Oct 5, 2023
@moomindani
Thank you for your contribution!

@junoha junoha changed the title Add JAVAHOME to Spark UI docker image Add JAVA_HOME to Spark UI docker image Oct 5, 2023