[SPARK-32091][CORE] Ignore timeout error when remove blocks on the lost executor #28924

Ngone51 · 2020-06-24T15:19:14Z

What changes were proposed in this pull request?

This PR adds a check to see whether the executor is lost (by asking the CoarseGrainedSchedulerBackend) after a timeout error is raised in BlockManagerMasterEndpoint while removing blocks (e.g. RDD, broadcast, shuffle). If the executor is lost, we ignore the error. Otherwise, we throw the error.
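The check described above can be sketched in plain Scala as a partial function handed to `Future.recover`. This is only a minimal sketch under stated assumptions, not the code in this PR: `LostExecutorRecovery`, `ignoreTimeoutIfLost`, `removeBlocks` and the `isExecutorAlive` callback are hypothetical names, and the callback stands in for the question sent to CoarseGrainedSchedulerBackend.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{ExecutionContext, Future}

object LostExecutorRecovery {
  // Hypothetical stand-in for asking the scheduler backend whether an executor is alive.
  type LivenessCheck = String => Boolean

  // On a timeout, fall back to `defaultValue` if the executor is lost; otherwise rethrow.
  def ignoreTimeoutIfLost[T](
      executorId: String,
      isExecutorAlive: LivenessCheck,
      defaultValue: T): PartialFunction[Throwable, T] = {
    case t: TimeoutException =>
      if (isExecutorAlive(executorId)) throw t else defaultValue
  }

  // Simulates a remove-blocks RPC that always times out, then applies the recovery.
  def removeBlocks(executorId: String, isExecutorAlive: LivenessCheck)(
      implicit ec: ExecutionContext): Future[Boolean] = {
    val rpc: Future[Boolean] =
      Future.failed(new TimeoutException(s"ask to $executorId timed out"))
    // false as the default means nothing was removed on that executor
    rpc.recover(ignoreTimeoutIfLost(executorId, isExecutorAlive, false))
  }
}
```

With a liveness check that reports the executor as lost, the returned future completes with the default value; with one that reports it as alive, the future still fails with the original timeout.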

Why are the changes needed?

When removing blocks (e.g. RDD, broadcast, shuffle), BlockManagerMasterEndpoint makes RPC calls to each known BlockManagerSlaveEndpoint to remove the specific blocks. The RPC call can sometimes end in a timeout when the executor has been lost but the BlockManagerMasterEndpoint is only notified of the loss after the removal call has already been made. The timeout error could therefore fail the whole job.

In this case, we could actually just ignore the error, since the blocks on the lost executor can be considered removed already.

Does this PR introduce any user-facing change?

Yes. Users who hit this issue will have the job execute successfully instead of throwing the exception.

How was this patch tested?

Added unit tests.

SparkQA · 2020-06-24T15:28:19Z

Test build #124490 has finished for PR 28924 at commit a09990f.

This patch fails build dependency tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
case class IsExecutorAlive(executorId: String) extends CoarseGrainedClusterMessage

dongjoon-hyun · 2020-06-28T19:34:49Z

Retest this please.

SparkQA · 2020-06-28T19:37:52Z

Test build #124606 has finished for PR 28924 at commit a09990f.

This patch fails build dependency tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
case class IsExecutorAlive(executorId: String) extends CoarseGrainedClusterMessage

Ngone51 · 2020-06-29T02:16:24Z

retest this please.

SparkQA · 2020-06-29T05:26:22Z

Test build #124620 has finished for PR 28924 at commit a09990f.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
case class IsExecutorAlive(executorId: String) extends CoarseGrainedClusterMessage

holdenk

Thanks for working on this, I left some minor concerns and questions when you've got a chance.

core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala

core/src/main/scala/org/apache/spark/util/RpcUtils.scala

core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala

holdenk · 2020-06-29T22:58:23Z

core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala

      .set(MEMORY_STORAGE_FRACTION, 0.999)
      .set(Kryo.KRYO_SERIALIZER_BUFFER_SIZE.key, "1m")
      .set(STORAGE_UNROLL_MEMORY_THRESHOLD, 512L)
+      .set(Network.RPC_ASK_TIMEOUT, "5s")


Any particular reason why 5?

In the newly added tests, we need to simulate the timeout error from BlockManager. At the same time, we don't want the tests to run too long, since the default timeout value is 120s. Therefore, we chose a fairly short timeout for the tests. On the other hand, we didn't set it to an even smaller value, e.g. 1s, since that might make the tests flaky.

Note that the best way would be to set the timeout value locally for the newly added tests instead of setting it globally. However, given the limitations of the current test framework on the Core side, it's hard to set it locally, since that would require more changes.
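For illustration only, a per-test override could follow the loan pattern sketched below. This is not how BlockManagerSuite works today: `ScopedConf`, `withConf` and the mutable map are hypothetical stand-ins for a shared test configuration, and the key `spark.rpc.askTimeout` with its `120s` starting value is used here as an assumption matching the discussion above.

```scala
import scala.collection.mutable

object ScopedConf {
  // Hypothetical stand-in for a suite-wide configuration; this is not Spark's SparkConf.
  val conf: mutable.Map[String, String] = mutable.Map("spark.rpc.askTimeout" -> "120s")

  // Loan pattern: override `key` only while `body` runs, then restore the previous state.
  def withConf[T](key: String, value: String)(body: => T): T = {
    val previous = conf.get(key)
    conf(key) = value
    try body
    finally {
      previous match {
        case Some(old) => conf(key) = old
        case None =>
          conf.remove(key)
          ()
      }
    }
  }
}
```

A test would wrap only its own body, e.g. `ScopedConf.withConf("spark.rpc.askTimeout", "5s") { ... }`, so other tests keep the longer default.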

holdenk · 2020-06-29T23:00:21Z

Also, for the user-facing change, maybe say "fewer failures": that's a good thing, and we should call it out here so we can mention it in the release notes and encourage folks to upgrade.

Ngone51 · 2020-06-30T10:33:47Z

@holdenk Thanks for the review. I've also updated the PR description for the user-facing part.

Ngone51 · 2020-06-30T10:34:11Z

ping @tgravescs @jiangxb1987 Could you also take a look? thanks!

tgravescs

overall looks good, few nits

core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedClusterMessage.scala

core/src/main/scala/org/apache/spark/util/RpcUtils.scala

SparkQA · 2020-06-30T22:11:26Z

Test build #124660 has finished for PR 28924 at commit 5a02ef2.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

prakharjain09 · 2020-07-01T08:05:24Z

core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala

-        bm.slaveEndpoint.ask[Boolean](removeMsg)
+        bm.slaveEndpoint.ask[Boolean](removeMsg).recover {
+          // use false as default value means no shuffle data were removed
+          handleFailure("shuffle", shuffleId.toString, bm.blockManagerId, false)


Previously we were not swallowing IOException here when removing shuffle blocks, but the handleFailure method will start swallowing it. Is this intentional?

I think we may have missed it previously. A failure to remove blocks should be considered non-fatal. For example, it's not worth failing the whole application if we fail to remove some blocks just after some heavy computation.
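The non-fatal treatment argued for here can be sketched as follows. It is a simplified illustration rather than the PR's `handleFailure`: `NonFatalRemoval`, `swallowIOException`, `removeShuffle` and the in-memory `warnings` buffer are hypothetical, and `rpc` stands in for the ask to a block manager endpoint.

```scala
import java.io.IOException
import scala.collection.mutable.ListBuffer
import scala.concurrent.{ExecutionContext, Future}

object NonFatalRemoval {
  // Collects warnings in memory to keep the sketch self-contained; real code would log them.
  val warnings: ListBuffer[String] = ListBuffer.empty[String]

  // Treats an IOException from a removal call as non-fatal: record it, return the default.
  def swallowIOException[T](
      blockType: String,
      blockId: String,
      defaultValue: T): PartialFunction[Throwable, T] = {
    case e: IOException =>
      warnings += s"Error trying to remove $blockType $blockId: ${e.getMessage}"
      defaultValue
  }

  // Hypothetical removal call; `rpc` is the pending reply from one block manager.
  def removeShuffle(shuffleId: Int, rpc: Future[Boolean])(
      implicit ec: ExecutionContext): Future[Boolean] =
    // false as the default means no shuffle data was removed
    rpc.recover(swallowIOException("shuffle", shuffleId.toString, false))
}
```

Any other exception type is left untouched by the partial function, so it still fails the returned future.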

core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala

Ngone51 · 2020-07-02T12:38:00Z

thanks for the review. I've updated the PR. Please take another look:)

SparkQA · 2020-07-02T16:06:42Z

Test build #124903 has finished for PR 28924 at commit a04a74e.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

tgravescs · 2020-07-06T16:41:20Z

test this please

SparkQA · 2020-07-06T21:07:39Z

Test build #125103 has finished for PR 28924 at commit a04a74e.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

Ngone51 · 2020-07-07T03:08:04Z

retest this please

Ngone51 · 2020-07-07T03:08:24Z

cc @cloud-fan Could you also take a look? Thanks.

SparkQA · 2020-07-07T06:30:23Z

Test build #125169 has finished for PR 28924 at commit a04a74e.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

Ngone51 · 2020-07-07T07:13:55Z

retest this please

SparkQA · 2020-07-07T11:23:19Z

Test build #125186 has finished for PR 28924 at commit a04a74e.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedClusterMessage.scala

core/src/main/scala/org/apache/spark/storage/BlockManagerMaster.scala

core/src/main/scala/org/apache/spark/util/RpcUtils.scala

cloud-fan · 2020-07-09T11:59:03Z

core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala

+      bm1Id.executorId, null, bm1Id.host, 1, Map.empty, Map.empty,
+      Map.empty, 0))
+
+    if (!withLost) {


What does withLost mean?

We always set up three roles (driver, executor-1, executor-2) before these two tests. withLost here indicates whether to initialize executor-2 as a lost executor from the driver's point of view.

core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala

SparkQA · 2020-07-10T11:54:57Z

Test build #125568 has finished for PR 28924 at commit 828aa2b.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-07-10T12:13:53Z

Test build #125578 has finished for PR 28924 at commit 319d925.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2020-07-10T13:36:23Z

thanks, merging to master!

Ngone51 · 2020-07-10T15:01:49Z

thanks all!!

….timeout` config

### What changes were proposed in this pull request?

SparkContext stop gets stuck on ContextCleaner

```
25/11/05 18:12:29 ERROR [shutdown-hook-0] ThreadUtils: 14 Driver BLOCKED Blocked by Thread 60 Lock(org.apache.spark.ContextCleaner1726738661})
org.apache.spark.ContextCleaner.stop(ContextCleaner.scala:145)
org.apache.spark.SparkContext.$anonfun$stop$9(SparkContext.scala:2094)
org.apache.spark.SparkContext.$anonfun$stop$9$adapted(SparkContext.scala:2094)
org.apache.spark.SparkContext$$Lambda$5309/807013918.apply(Unknown Source)
scala.Option.foreach(Option.scala:407)
org.apache.spark.SparkContext.$anonfun$stop$8(SparkContext.scala:2094)
org.apache.spark.SparkContext$$Lambda$5308/1445921225.apply$mcV$sp(Unknown Source)
org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1512)
org.apache.spark.SparkContext.stop(SparkContext.scala:2094)
org.apache.spark.SparkContext.stop(SparkContext.scala:2050)
org.apache.spark.sql.SparkSession.stop(SparkSession.scala:718)
com.shopee.data.content.ods.live_performance.Main$.main(Main.scala:62)
com.shopee.data.content.ods.live_performance.Main.main(Main.scala)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:751)
```

ContextCleaner stop() waits for the lock

```
def stop(): Unit = {
  stopped = true
  // Interrupt the cleaning thread, but wait until the current task has finished before
  // doing so. This guards against the race condition where a cleaning thread may
  // potentially clean similarly named variables created by a different SparkContext,
  // resulting in otherwise inexplicable block-not-found exceptions (SPARK-6132).
  synchronized {
    cleaningThread.interrupt()
  }
  cleaningThread.join()
  periodicGCService.shutdown()
}
```

but a call in keepCleaning() holds the lock

```
25/11/05 18:12:29 ERROR [shutdown-hook-0] ThreadUtils: 60 Spark Context Cleaner TIMED_WAITING Monitor(org.apache.spark.ContextCleaner1726738661})
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:248)
scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:258)
scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:263)
org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:294)
org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
org.apache.spark.storage.BlockManagerMaster.removeBroadcast(BlockManagerMaster.scala:194)
org.apache.spark.broadcast.TorrentBroadcast$.unpersist(TorrentBroadcast.scala:351)
org.apache.spark.broadcast.TorrentBroadcastFactory.unbroadcast(TorrentBroadcastFactory.scala:45)
org.apache.spark.broadcast.BroadcastManager.unbroadcast(BroadcastManager.scala:78)
org.apache.spark.ContextCleaner.doCleanupBroadcast(ContextCleaner.scala:254)
org.apache.spark.ContextCleaner.$anonfun$keepCleaning$3(ContextCleaner.scala:204)
org.apache.spark.ContextCleaner.$anonfun$keepCleaning$3$adapted(ContextCleaner.scala:195)
org.apache.spark.ContextCleaner$$Lambda$1178/1994584033.apply(Unknown Source)
scala.Option.foreach(Option.scala:407)
org.apache.spark.ContextCleaner.$anonfun$keepCleaning$1(ContextCleaner.scala:195) => holding Monitor(org.apache.spark.ContextCleaner1726738661})
org.apache.spark.ContextCleaner$$Lambda$1109/1496842179.apply$mcV$sp(Unknown Source)
org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1474)
org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:189)
org.apache.spark.ContextCleaner$$anon$1.run(ContextCleaner.scala:79)
```

BlockManager is stuck in removeBroadcast at RpcUtils.INFINITE_TIMEOUT.awaitResult(future) (PR #28924 changed this)

```
def removeBroadcast(broadcastId: Long, removeFromMaster: Boolean, blocking: Boolean): Unit = {
  val future = driverEndpoint.askSync[Future[Seq[Int]]](
    RemoveBroadcast(broadcastId, removeFromMaster))
  future.failed.foreach(e =>
    logWarning(s"Failed to remove broadcast $broadcastId" +
      s" with removeFromMaster = $removeFromMaster - ${e.getMessage}", e)
  )(ThreadUtils.sameThread)
  if (blocking) {
    // the underlying Futures will timeout anyway, so it's safe to use infinite timeout here
    RpcUtils.INFINITE_TIMEOUT.awaitResult(future)
  }
}
```

In such a case, the only reason should be that the RPC was never handled, for example because of a driver OOM or a thread leak in the YARN NodeManager that prevents creating new threads to handle RPCs.

```
25/11/05 08:16:22 ERROR [metrics-paimon-push-gateway-reporter-2-thread-1] ScheduledReporter: Exception thrown from PushGatewayReporter#report. Exception was suppressed.
java.lang.OutOfMemoryError: unable to create new native thread
	at java.lang.Thread.start0(Native Method)
	at java.lang.Thread.start(Thread.java:717)
	at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1115)
	at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1388)
	at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1416)
	at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1400)
	at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
	at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
	at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:167)
	at io.prometheus.client.exporter.PushGateway.doRequest(PushGateway.java:243)
	at io.prometheus.client.exporter.PushGateway.push(PushGateway.java:134)
	at org.apache.paimon.metrics.reporter.PushGatewayReporter.report(PushGatewayReporter.java:84)
	at com.codahale.metrics.ScheduledReporter.report(ScheduledReporter.java:253)
	at com.codahale.metrics.ScheduledReporter.lambda$start$0(ScheduledReporter.java:182)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
```

For such a case, we can add a customized wait timeout here so that the wait, and with it the whole process, does not get stuck forever.

### Why are the changes needed?

Avoid the app getting stuck.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #52919 from AngersZhuuuu/SPARK-54219.

Authored-by: Angerszhuuuu <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
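The idea of replacing an unbounded wait with a bounded one can be sketched in plain Scala as below. This is an illustration of the general technique only, not the change made in the referenced commit: `BoundedAwait` and `awaitOrGiveUp` are hypothetical names, and the sketch uses the standard library's `Await.result` rather than Spark's `RpcTimeout`.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future}
import scala.concurrent.duration.FiniteDuration

object BoundedAwait {
  // Waits at most `timeout` for `future`; returns None on timeout instead of blocking forever.
  // Failures of the future itself are still rethrown to the caller.
  def awaitOrGiveUp[T](future: Future[T], timeout: FiniteDuration): Option[T] =
    try Some(Await.result(future, timeout))
    catch { case _: TimeoutException => None }
}
```

A caller holding a lock while waiting, as the cleaner thread does in the trace above, would then release it after at most `timeout` even if the reply never arrives.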

impr

a09990f

probot-autolabeler bot added the CORE label Jun 24, 2020

holdenk reviewed Jun 29, 2020

Ngone51 added 4 commits June 30, 2020 18:09

use RpcUtils.makeDriverRef

8cb221e

add comments

6973d17

update comment for infiniteTimeout

3053b22

check -> checks

5a02ef2

tgravescs reviewed Jun 30, 2020

core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedClusterMessage.scala

core/src/main/scala/org/apache/spark/util/RpcUtils.scala

prakharjain09 reviewed Jul 1, 2020

core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala

Ngone51 added 4 commits July 2, 2020 19:48

update comment

e5dcfcf

update comment of infinite

56ccc63

rename method

763a32c

add comment

a04a74e