Commit 6665df6
[SPARK-4006] Block Manager - Double Register Crash
This issue affects all versions from 0.7 up to and including 1.1.
In long-running contexts, we encountered a double register with no remove in between. The cause is unknown; we assume a temporary network issue.
However, since the second register is with a BlockManagerId on a different port, blockManagerInfo.contains() returns false, while blockManagerIdByExecutor returns Some. This inconsistency is caught in a conditional statement that does System.exit(1), which is a huge robustness issue for us.
The fix: when this happens during register, remove the old id from both maps. This mimics the behavior of expireDeadHosts() by cleaning up the maps locally before adding the new entries.
Also added some logging for register and unregister.
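The register-time cleanup described above can be illustrated with a minimal, self-contained sketch. It is not the actual patch: `RegistrySketch`, the simplified `BlockManagerId` and `BlockManagerInfo` case classes, and the log strings are hypothetical stand-ins. Only the two map names and the "remove the old id from both maps" behavior come from the commit message.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for Spark's types (assumption, not the real classes).
final case class BlockManagerId(executorId: String, host: String, port: Int)
final case class BlockManagerInfo(id: BlockManagerId, maxMem: Long)

class RegistrySketch {
  // The two maps named in the commit message.
  val blockManagerInfo = mutable.HashMap.empty[BlockManagerId, BlockManagerInfo]
  val blockManagerIdByExecutor = mutable.HashMap.empty[String, BlockManagerId]

  def register(id: BlockManagerId, maxMem: Long): Unit = {
    if (!blockManagerInfo.contains(id)) {
      blockManagerIdByExecutor.get(id.executorId) match {
        case Some(oldId) =>
          // Same executor, different BlockManagerId (for example a new port).
          // Drop the stale entry from both maps instead of exiting the process.
          println(s"Duplicate registration for executor ${id.executorId}; removing $oldId")
          blockManagerIdByExecutor.remove(oldId.executorId)
          blockManagerInfo.remove(oldId)
        case None => // first registration for this executor, nothing to clean up
      }
      blockManagerIdByExecutor(id.executorId) = id
      blockManagerInfo(id) = BlockManagerInfo(id, maxMem)
      println(s"Registered block manager $id") // illustrative log line only
    }
  }
}

object RegistrySketchDemo {
  def main(args: Array[String]): Unit = {
    val registry = new RegistrySketch
    // Two registers for the same executor on different ports, with no remove in between.
    registry.register(BlockManagerId("exec-1", "host-a", 40001), 512L)
    registry.register(BlockManagerId("exec-1", "host-a", 40002), 512L)
    // Both maps now hold a single, consistent entry for the executor.
    assert(registry.blockManagerInfo.size == 1)
    assert(registry.blockManagerIdByExecutor("exec-1").port == 40002)
  }
}
```

In this sketch the second register replaces the stale entry, so the two maps stay consistent and the process keeps running.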
https://issues.apache.org/jira/browse/SPARK-4006
Author: Tal Sliwowicz <[email protected]>
Closes #2854 from tsliwowicz/branch-0.9.2-block-mgr-removal and squashes the following commits:
95ae4db [Tal Sliwowicz] [SPARK-4006] In long running contexts, we encountered the situation of double registe...
81d69f0 [Tal Sliwowicz] fixed comment
efd93f2 [Tal Sliwowicz] In long running contexts, we encountered the situation of double register without a remove in between. The cause for that is unknown, and assumed a temp network issue.
1 parent 3fba7b7 commit 6665df6
File tree
1 file changed in core/src/main/scala/org/apache/spark/storage: 11 additions, 6 deletions
Hunks: original lines 160-165 (1 line added) and original lines 225-238 (10 lines added, 6 removed)