@hustfxj
Contributor

@hustfxj hustfxj commented Jan 10, 2017

I deployed a standalone cluster with two masters and used ZooKeeper for leader election. First, I submitted an application in cluster mode. Then I killed the leader master, and the standby master became the leader. However, the new leader loses the statistics of the driver's resources. When I then stop the application, the master page shows negative used resources on the worker, like this:

Workers

| Worker Id | Address | State | Cores | Memory |
| --- | --- | --- | --- | --- |
| worker-20161220162751-10.125.6.222-59295 | 10.125.6.222:59295 | ALIVE | 4 (-1 Used) | 6.8 GB (-1073741824.0 B Used) |
| worker-20161220164233-10.218.135.80-10944 | 10.218.135.80:10944 | ALIVE | 4 (0 Used) | 6.8 GB (0.0 B Used) |

This happens because the new leader does not account for the driver's resources when it receives the "WorkerLatestState" message. In addition, we can set the app's state to RUNNING after the master receives that message; otherwise the app's state will remain WAITING.
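To make the bookkeeping problem concrete, here is a minimal sketch in Python. It is a toy model, not Spark's actual code: `WorkerInfo`, `add_driver`, `remove_driver`, and `simulate_failover` are hypothetical names, and the driver size (1 core, 1024 MB) is an assumption chosen to match the -1 core and -1073741824 B shown in the table above.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerInfo:
    """Toy stand-in for the master's per-worker bookkeeping (hypothetical)."""
    cores: int
    memory_mb: int
    cores_used: int = 0
    memory_used_mb: int = 0
    drivers: dict = field(default_factory=dict)

    def add_driver(self, driver_id, cores, memory_mb):
        # Record the driver and charge its resources to the worker.
        self.drivers[driver_id] = (cores, memory_mb)
        self.cores_used += cores
        self.memory_used_mb += memory_mb

    def remove_driver(self, driver_id, cores, memory_mb):
        # Assumption: the release is unconditional, so it subtracts even
        # if the driver was never charged to this worker record.
        self.drivers.pop(driver_id, None)
        self.cores_used -= cores
        self.memory_used_mb -= memory_mb


def simulate_failover(account_for_drivers):
    """Return (cores_used, memory_used_mb) after failover and app stop."""
    # The new leader rebuilds the worker record with zero usage
    # (4 cores, roughly 6.8 GB, as in the table above).
    worker = WorkerInfo(cores=4, memory_mb=6963)

    # The worker reports the drivers it is still running.
    reported = {"driver-0": (1, 1024)}  # assumed: 1 core, 1024 MB

    if account_for_drivers:
        # Proposed behaviour: charge reported drivers to the worker again.
        for driver_id, (cores, mem) in reported.items():
            worker.add_driver(driver_id, cores, mem)

    # The application is stopped, so the driver's resources are released.
    for driver_id, (cores, mem) in reported.items():
        worker.remove_driver(driver_id, cores, mem)

    return worker.cores_used, worker.memory_used_mb


print(simulate_failover(account_for_drivers=False))
print(simulate_failover(account_for_drivers=True))
```

In this model, skipping the accounting step leaves the worker at a negative balance once the driver is released, while re-adding the reported drivers brings usage back to zero.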

…the worker When the leader master has changed.
@hustfxj hustfxj changed the title from "the new leader will lost the statistics of the driver's resource on the worker When the leader master has changed." to "[SPARK-18959] the new leader will lost the statistics of the driver's resource on the worker When the leader master has changed." on Jan 10, 2017
@AmplabJenkins

Can one of the admins verify this patch?

@hustfxj
Contributor Author

hustfxj commented Jan 17, 2017

@srowen can you help review it? I think it is a bug. Thank you very much.

@hustfxj
Contributor Author

hustfxj commented Mar 9, 2017

@srowen @andrewor14 can you review it again? Thank you

@jiangxb1987
Contributor

@hustfxj Unfortunately we don't support multi-master nodes in standalone mode, so could you please close this PR? Thank you!

@asfgit asfgit closed this in b32bd00 Jun 27, 2017
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 22, 2025
## What changes were proposed in this pull request?

This PR proposes to close stale PRs, mostly the same kind of cases as in apache#18017.

I believe the author in apache#14807 removed his account.

Closes apache#7075
Closes apache#8927
Closes apache#9202
Closes apache#9366
Closes apache#10861
Closes apache#11420
Closes apache#12356
Closes apache#13028
Closes apache#13506
Closes apache#14191
Closes apache#14198
Closes apache#14330
Closes apache#14807
Closes apache#15839
Closes apache#16225
Closes apache#16685
Closes apache#16692
Closes apache#16995
Closes apache#17181
Closes apache#17211
Closes apache#17235
Closes apache#17237
Closes apache#17248
Closes apache#17341
Closes apache#17708
Closes apache#17716
Closes apache#17721
Closes apache#17937

Added:
Closes apache#14739
Closes apache#17139
Closes apache#17445
Closes apache#18042
Closes apache#18359

Added:
Closes apache#16450
Closes apache#16525
Closes apache#17738

Added:
Closes apache#16458
Closes apache#16508
Closes apache#17714

Added:
Closes apache#17830
Closes apache#14742

## How was this patch tested?

N/A

Author: hyukjinkwon <[email protected]>

Closes apache#18417 from HyukjinKwon/close-stale-pr.