-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-6075] Fix bug in that caused lost accumulator updates: do not store WeakReferences in localAccums map #4835
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
/cc @andrewor14, it turns out that the "flaky" accumulator test was actually a real bug (fixed by this patch). |
|
This only affects |
|
Test build #28131 has started for PR 4835 at commit
|
|
Test build #28131 has finished for PR 4835 at commit
|
|
Test FAILed. |
|
That's a legitimate test failure caused by a new assertion that I added here (being extra defensive about avoiding duplicate accumulator registration, since this might cause silent data loss): |
|
I still think that we should investigate / add that assertion, since correctness of accumulators relies on the assumption that a deserialized task has only one instance of an an accumulator with a given id, but it looks like this issue existed prior to this recent WeakReference change; I'll roll back my new assertion and follow up on that issue in a separate patch. |
…ed to this change
|
Test build #28133 has started for PR 4835 at commit
|
|
Test build #28133 has finished for PR 4835 at commit
|
|
Test PASSed. |
|
I'm going to commit this to |
|
I think this PR also fixes SPARK-6020, since |
This fixes a non-deterministic bug introduced in #4021 that could cause tasks' accumulator updates to be lost. The problem is that
localAccumsshould not hold weak references: after the task finishes running there won't be any strong references to these local accumulators, so they can get garbage-collected before the executor reads thelocalAccumsmap. We don't need weak references here anyways, since this map is cleared at the end of each task.