[SPARK-26288][CORE] Restore RegisteredExecutors information for External shuffle service in Standalone/Kubernetes backend when the service is restarted #23393
Conversation
cc @dongjoon-hyun @gatorsmile Kindly review
Let us cc @sameeragarwal @tejasapatil Do you have any opinion about this new feature?
I am currently on a vacation. Will get back to you on this in 2 weeks.
Thanks,
Tejas
From a pure feature perspective, I think it makes sense to support this for non-yarn modes as well. While Facebook uses a custom scheduler, we rely on this leveldb state to be robust against external shuffle service failures/restarts (for certain deployment types). Although admittedly, I lack full context as to why this was only done for Yarn to begin with. cc @dafrista @squito who might have more details there.
ok to test
Test build #100608 has finished for PR 23393 at commit
squito left a comment:
It was originally done only for yarn just because I wasn't very familiar with other modes -- but it certainly should be possible to do it. There are some corner cases to think about -- the one which comes to mind is, what happens if an application is stopped while the external shuffle service is down? In yarn, we rely on being told the application was stopped even after the NM comes back. I don't think the same is true in standalone mode, the master won't tell the worker after it comes back? So then you'll leave an entry in the DB forever. Maybe this is rare enough and low-impact enough that you'd never expect that list to get large, but at least worth thinking through and documenting.
protected def initRegisteredExecutorsDB(dbName: String): File = {
  val localDirs = sparkConf.get("spark.local.dir", "").split(",")
  if (localDirs.length >= 1 && !"".equals(localDirs(0))) {
    createDirectory(localDirs(0), dbName)
any reason to only use localDirs(0) instead of checking all localDirs? I worry that you might get another local dir prepended or something during a configuration change + restart
also, it seems you're creating a path like "[local-dir]/registeredExecutors/registeredExecutors.ldb" -- any reason for the extra level?
Hi @squito, as you agreed, it certainly should be possible to do.
> the one which comes to mind is, what happens if an application is stopped while the external shuffle service is down? In yarn, we rely on being told the application was stopped even after the NM comes back.
Right now, a case like the above can leave an entry in the DB forever. As you said, maybe this is rare enough and low-impact enough, but it is at least worth thinking through and documenting. So I think we can add some code to remove the entry with WorkDirCleanup when spark.worker.cleanup.enabled = true is set in standalone mode. Do you have a better idea?
This commit uses localDirs(0) instead of checking all localDirs so that the DB always uses the same path and initRegisteredExecutorsDB keeps working; localDirs(0) is used only for the DB rather than requiring an additional setting.
Creating a path like "[local-dir]/registeredExecutors/registeredExecutors.ldb" is just to make the layout clearer.
yes, I think WorkDirCleanup may be just what we need to ensure things get cleaned up, good idea.
I understand wanting to use a consistent directory, but like I said I'm worried about restarts after configuration changes (maybe not a concern in standalone mode? does it always require a total restart?). You could do something like what was done in the original patch for yarn: check all the dirs, but fall back to dir[0] (that code has since changed to take advantage of other yarn features for recovery):
spark/network/yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java
Lines 192 to 200 in 708036c
private File findRegisteredExecutorFile(String[] localDirs) {
  for (String dir: localDirs) {
    File f = new File(dir, "registeredExecutors.ldb");
    if (f.exists()) {
      return f;
    }
  }
  return new File(localDirs[0], "registeredExecutors.ldb");
}
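The same fallback pattern could carry over to standalone mode. Below is a minimal Scala sketch, assuming the service reads local dirs from spark.local.dir; the method name and config handling are assumptions for illustration, not the merged patch:

```scala
import java.io.File
import org.apache.spark.SparkConf

// Sketch only: prefer a dir that already holds the DB, so a reordered or
// extended spark.local.dir after a restart still finds the old state.
def findRegisteredExecutorsDBFile(sparkConf: SparkConf, dbName: String): File = {
  val localDirs = sparkConf.getOption("spark.local.dir")
    .map(_.split(",")).getOrElse(Array.empty[String])
  require(localDirs.nonEmpty, "spark.local.dir must be set to locate the recovery DB")
  localDirs.map(new File(_, dbName)).find(_.exists())
    .getOrElse(new File(localDirs(0), dbName))
}
```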
cc @squito @gatorsmile @dongjoon-hyun PTAL.
Test build #101616 has finished for PR 23393 at commit
Jenkins, retest this please
// if an application is stopped while the external shuffle service is down?
// So then it'll leave an entry in the DB and the entry should be removed.
if (conf.get(config.SHUFFLE_SERVICE_DB_ENABLED) &&
  conf.get(config.SHUFFLE_SERVICE_ENABLED) && !isAppStillRunning) {
nit: double indent this line, the continuation of the condition
Hi @squito, I have fixed it, thanks a lot!
Test build #101645 has finished for PR 23393 at commit
the approach looks good to me. I would need to take a closer look at the full flow of everything in Worker before being ready to merge myself -- if nobody else has a chance I'll try to find time to come back to review that part more thoroughly
ok
Test build #101659 has finished for PR 23393 at commit
Hi @squito, how about this patch? Can it be merged to master?
squito left a comment:
sorry for the delayed review, I was hoping somebody that knew standalone better would chime in. But I took a closer look and I think this is OK. My comments are just style stuff
// if an application is stopped while the external shuffle service is down?
// So then it'll leave an entry in the DB and the entry should be removed.
if (conf.get(config.SHUFFLE_SERVICE_DB_ENABLED) &&
  conf.get(config.SHUFFLE_SERVICE_ENABLED)) {
nit: double indent this line (4 spaces)
Sorry, I don't understand why this is necessary.
it's a continuation of the condition of the if. That helps separate it from the body, which is only indented 2 spaces, e.g. like this:
spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/AnalysisHelper.scala
Lines 133 to 137 in 190a3a4
protected def assertNotAnalysisRule(): Unit = {
  if (Utils.isTesting &&
      AnalysisHelper.inAnalyzer.get > 0 &&
      AnalysisHelper.resolveOperatorDepth.get == 0) {
    throw new RuntimeException("This method should not be called in the analyzer")
private val shuffleServiceSource = new ExternalShuffleServiceSource
protected def initRegisteredExecutorsDB(dbName: String): File = {
I think renaming to findRegisteredExecutorsDBFile would be better, nothing is really getting initialized here
ok, done
}.foreach { dir =>
  logInfo(s"Removing directory: ${dir.getPath}")
  Utils.deleteRecursively(dir)
unrelated to your change -- can you add 2-space indentation on line 466, !Utils.doesDirectoryContainAnyNewFiles(dir, APP_DATA_RETENTION_SECONDS), as it's a continuation?
externalShuffleService = new ExternalShuffleService(sparkConf, securityManager)
// externalShuffleService restart
externalShuffleService.start()
bockHandler = externalShuffleService.getBlockHandler
typo, missing an 'l' in blockHandler
done
new ExternalShuffleBlockResolver(conf, registeredExecutorFile));
}
/** ForTesting */
can you use the same style for labelling the "for testing" methods as already in these files? I think for all of these java files it's @VisibleForTesting
ok, done
// pass
blockResolver.closeForTest()
// externalShuffleService stop
externalShuffleService.stop()
I don't think those comments add anything.
should the close() / stop() be in the afterAll (or a finally)?
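For reference, a minimal sketch of the afterAll approach, assuming the suite already holds `blockResolver` and `externalShuffleService` fields (the suite name and fields here are illustrative, not the actual patch):

```scala
import org.scalatest.BeforeAndAfterAll

class ExternalShuffleServiceDbSuite extends SparkFunSuite with BeforeAndAfterAll {
  override def afterAll(): Unit = {
    try {
      blockResolver.close()          // instead of closing inside each test
      externalShuffleService.stop()  // runs even if a test's assertions fail
    } finally {
      super.afterAll()
    }
  }
}
```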
done
done
// This test getBlockData will be passed when the external shuffle service is restarted.
test("restart External Shuffle Service With InitRegisteredExecutorsDB") {
I'd reword the test name & comment here a bit. Maybe for test-name, "Recover shuffle data with spark.shuffle.service.db.enabled=true after shuffle service restart"
and the comment should say something more like "The beforeAll ensures the shuffle data was already written, and then the shuffle service was stopped. Here we restart the shuffle service and make sure we can read the shuffle data"
Done, thank you.
assert(error.contains("not registered"))
blockResolver.closeForTest()
// externalShuffleService stop
externalShuffleService.stop()
same here
done
// This test getBlockData will't be passed when the external shuffle service is restarted.
test("restart External Shuffle Service Without InitRegisteredExecutorsDB") {
typo: will not? but as above, I'd reword the test name and comment
done
/**
 * Manages some sort-shuffle data, including the creation
 * and cleanup of directories that can be read by the
comment appears to be cut off.
I don't love that this is copied from elsewhere -- can you just add network-common as a test dependency to core, like this:
Lines 364 to 370 in 827d371
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-launcher_${scala.binary.version}</artifactId>
  <version>${project.version}</version>
  <classifier>tests</classifier>
  <scope>test</scope>
</dependency>
Test build #103087 has finished for PR 23393 at commit
Hi @squito, thank you very much for your detailed review. PTAL.
Test build #103088 has finished for PR 23393 at commit
this block is not indented correctly
squito left a comment:
Sorry, actually I realized there are some other things to fix.
In addition to the comment on ShuffleSuite, I also realized we're not testing the WorkDirCleanup part. It doesn't seem like there is a test in WorkerSuite even for the old WorkDirCleanup part, but can you add something?
actually, sorry, one important thing -- why does this extend ShuffleSuite? It's not really changing the behavior that ShuffleSuite is designed for
sorry, I've replaced it with SparkFunSuite.
I had some pending review comments but as I see it is quite close to merging I decided to add them now and continue the review tomorrow (if it is not merged yet).
The if with localDirs(0).nonEmpty can be spared by using this line:
val localDirs = sparkConf.getOption("spark.local.dir").map(_.split(",")).getOrElse(Array())
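For comparison, a small self-contained sketch of why the one-liner removes the need for the emptiness check (the fresh SparkConf here is for illustration only):

```scala
import org.apache.spark.SparkConf

val sparkConf = new SparkConf()  // spark.local.dir deliberately left unset

// Before: an unset config yields "" and "".split(",") == Array(""), so the
// extra nonEmpty check on localDirs(0) was required.
val before = sparkConf.get("spark.local.dir", "").split(",")
val usableBefore = before.length >= 1 && before(0).nonEmpty   // false here

// After: getOption yields None when unset, so the fallback is a genuinely
// empty array and no emptiness check is needed downstream.
val after = sparkConf.getOption("spark.local.dir")
  .map(_.split(",")).getOrElse(Array.empty[String])           // Array() here
```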
That's a good idea, thank you.
Nit: indentation
if () {
  ...
} else {
  ...
}
sorry, done
Nit: indentation as above.
Also, there are a few other places with this indentation error, but I do not want to spam your PR.
indentation?
done
Hi @squito @attilapiros, I have added a WorkDirCleanup test, PTAL and review.
This assert is not needed.
Is there a way to avoid this sleep? On my machine running the test via IntelliJ this 10 milliseconds was not enough.
Maybe overkill, but extracting the cleanup functionality in the Worker (the cleanup Future body) into a separate method visible to the test, and calling that method directly instead of sending WorkDirCleanup, would solve this issue. Otherwise I would be a bit worried about having a flaky test depending on Jenkins workload. What is your opinion?
I just read @squito's solution for this problem and I like that as it is less intrusive.
squito left a comment:
thanks for adding the test, just a couple minor things.
One more thought -- is there any particular reason we need a config "spark.shuffle.service.db.enabled", rather than just leaving it always on? It's always on for yarn. Is it just in case there is some bug we're introducing here, as a way to turn it off? If so, maybe we should document that it may be removed in the future. In general I'd rather avoid adding confs when unnecessary, but given my lack of expertise with standalone it might be best to leave it in.
Which also reminds me of the other thing that needs to be done -- you should add this to the docs here: https://github.com/apache/spark/blob/master/docs/spark-standalone.md#cluster-launch-scripts
you can combine these conditions into one if:
if (!executorDir.exists() && !executorDir.mkdirs()) {
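A before/after sketch of that suggestion; the IOException body and the executorDir path are assumptions used only to make the two variants concrete:

```scala
import java.io.{File, IOException}

val executorDir = new File("/tmp/app-id/exec-0")  // hypothetical path

// Nested form: throws only when the directory is absent and cannot be created.
if (!executorDir.exists()) {
  if (!executorDir.mkdirs()) {
    throw new IOException("Failed to create directory " + executorDir)
  }
}

// Combined into one condition with identical behavior: mkdirs() is only
// attempted -- and its failure only fatal -- when the directory is absent.
if (!executorDir.exists() && !executorDir.mkdirs()) {
  throw new IOException("Failed to create directory " + executorDir)
}
```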
using sleep() in tests to wait for things isn't great -- it either leads to flakiness if you don't sleep long enough and the test is occasionally slow, or if you make it super long, then it makes tests slow.
Ideally there would be a condition variable you could wait on, but that's probably not worth it here. Instead, using scalatest's eventually works well, e.g. like this:
spark/core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala
Lines 279 to 282 in 315c95c
eventually(timeout(1000 milliseconds), interval(10 milliseconds)) {
  assert(!store.hasLocalBlock("a1-to-remove"))
  master.getLocations("a1-to-remove") should have size 0
}
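Applied to the worker cleanup test, the wait could look roughly like this (`appDir` is an assumed reference to the directory WorkDirCleanup should remove, and the timeout values are illustrative):

```scala
import org.scalatest.concurrent.Eventually._
import org.scalatest.time.SpanSugar._

// Poll until the cleanup future has removed the stale app directory,
// instead of sleeping a fixed amount and hoping the cleanup has run.
eventually(timeout(5.seconds), interval(100.milliseconds)) {
  assert(!appDir.exists(), "stale app dir was not cleaned up")
}
```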
to test the old behavior of WorkDirCleanup, you also want to assert that !executorDir.exists(), right? (regardless of value)
nit: rename value to something more meaningful, eg. dbCleanupEnabled
minor -- as I mentioned below, both of these tests should still test dir cleanup, so maybe we can rename a little to something like
WorkDirCleanup cleans app dirs and shuffle metadata when spark.shuffle.service.db.enabled=true
WorkDirCleanup cleans only app dirs when spark.shuffle.service.db.enabled=false
Test build #103137 has finished for PR 23393 at commit
looks like it might be a real test failure
Retest this please.
val WORKER_CLEANUP_ENABLED = ConfigBuilder("spark.worker.cleanup.enabled")
  .booleanConf
-  .createWithDefault(false)
+  .createWithDefault(true)
why are you changing this default? Honestly I am much less comfortable merging it with the default changed, as I don't have much experience w/ standalone mode.
This commit depends on WORKER_CLEANUP_ENABLED. Still, we should keep the default value of spark.worker.cleanup.enabled = false, but make it clear in the docs that spark.worker.cleanup.enabled should be enabled if spark.shuffle.service.db.enabled is "true", all right?
docs/spark-standalone.md
Outdated
I believe attila meant to revert your change to the docs, not to change the default
docs/spark-standalone.md
Outdated
used again when the external shuffle service is restarted. Note that this only affects standalone
mode, its has always on for yarn. We should Enable `spark.worker.cleanup.enabled` to remove the entry
(It will leave an entry in the DB forever when an application is stopped while the external shuffle
service is down) in the leveldb with WorkDirCleanup. It may be removed in the future.
Some minor rewordings:
Store External Shuffle service state on local disk so that when the external shuffle service is restarted, it will
automatically reload info on current executors. This only affects standalone mode (yarn always has this behavior
enabled). You should also enable <code>spark.worker.cleanup.enabled</code>, to ensure that the state
eventually gets cleaned up. This config may be removed in the future.
Very clear description, thank you.
@weixiuli you have to change WORKER_CLEANUP_ENABLED too (back to `false`)
@attilapiros ok, I have done it.
Test build #103544 has finished for PR 23393 at commit
Jenkins, retest this please.
Test build #103557 has finished for PR 23393 at commit
Test build #103559 has finished for PR 23393 at commit
squito left a comment:
lgtm except for one really minor thing
@VisibleForTesting
public static File getFileForTest(String[] localDirs, int subDirsPerLocalDir, String filename) {
  return getFile(localDirs, subDirsPerLocalDir, filename);
}
this is unused now, right? You can undo this change?
Yes, you are right, I have fixed it.
Test build #103642 has finished for PR 23393 at commit
Jenkins, retest this please.
failure was spurious, earlier run passed everything with only an extra unused function, so merging to master. thanks @weixiuli
Closes apache#23393 from weixiuli/SPARK-26288.
Authored-by: weixiuli <[email protected]>
Signed-off-by: Imran Rashid <[email protected]>
What changes were proposed in this pull request?
As we all know, Spark on YARN uses a DB (#7943) to record RegisteredExecutors information, which can be reloaded and used again when the ExternalShuffleService is restarted.
The RegisteredExecutors information is not recorded in either standalone mode or Spark on Kubernetes, so it is lost when the ExternalShuffleService is restarted.
To solve the problem above, this patch records and restores that information in those modes as well.
How was this patch tested?
New unit tests.
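For readers new to the mechanism, here is a deliberately simplified, self-contained sketch of the recovery idea. The real implementation lives in ExternalShuffleBlockResolver and uses LevelDB; java.util.Properties stands in here purely for illustration:

```scala
import java.io.{File, FileInputStream, FileOutputStream}
import java.util.Properties

// NOT the real implementation: a toy stand-in showing why persisting the
// executor registry lets a restarted shuffle service keep serving blocks.
class RegisteredExecutorsStore(dbFile: File) {
  private val executors = new Properties()

  // Service (re)start: replay any state persisted before the restart.
  if (dbFile.exists()) {
    val in = new FileInputStream(dbFile)
    try executors.load(in) finally in.close()
  }

  // Executor registration: record shuffle-file locations in memory AND on disk.
  def register(appExecId: String, localDirs: String): Unit = {
    executors.setProperty(appExecId, localDirs)
    val out = new FileOutputStream(dbFile)
    try executors.store(out, null) finally out.close()
  }

  // Lookups after a restart still succeed because state was reloaded above.
  def lookup(appExecId: String): Option[String] =
    Option(executors.getProperty(appExecId))
}
```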