[CELEBORN-1275] Fix bug that callback function may hang when unchecked exception missed #2316
Codecov Report

Coverage diff, main vs. #2316:

| Metric | main | #2316 | +/- |
|---|---|---|---|
| Coverage | 48.69% | 48.68% | -0.00% |
| Files | 209 | 209 | |
| Lines | 12940 | 12944 | +4 |
| Branches | 1119 | 1119 | |
| Hits | 6300 | 6301 | +1 |
| Misses | 6233 | 6236 | +3 |
| Partials | 407 | 407 | |
AngersZhuuuu
left a comment
LGTM
cc @mridulm
Looks good to me.
For reference, this was fixed in Spark in SPARK-28160.
You are correct, I had a misunderstanding previously.
Thanks, merging to main (v0.5.0), branch-0.4 (v0.4.1), and branch-0.3 (v0.3.3).
…d exception missed

### What changes were proposed in this pull request?

Refer: [SPARK-28160](https://issues.apache.org/jira/browse/SPARK-28160) / apache/spark#24964

`ByteBuffer.allocate` may throw an `OutOfMemoryError` when the response is large and not enough memory is available. When this happens, `TransportClient.sendRpcSync` hangs forever if the timeout is set to unlimited.

### Why are the changes needed?

To catch the exception thrown by `ByteBuffer.allocate` in this corner case.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Quote the local test in apache/spark#24964

```
I tested in my IDE by setting the value of size to -1 to verify the result.
Without this patch, the call does not finish until the timeout (and may hang forever if the timeout is set to MAX_INT);
with this patch, the expected IllegalArgumentException is caught.

@Override
public void onSuccess(ByteBuffer response) {
  try {
    int size = response.remaining();
    ByteBuffer copy = ByteBuffer.allocate(size); // set size to -1 at runtime when debugging
    copy.put(response);
    // flip "copy" to make it readable
    copy.flip();
    result.set(copy);
  } catch (Throwable t) {
    result.setException(t);
  }
}
```

Closes #2316 from turboFei/fix_transport_client_onsucess.

Authored-by: Fei Wang <[email protected]>
Signed-off-by: chenfu <[email protected]>
(cherry picked from commit 387bffc)
Signed-off-by: chenfu <[email protected]>
What changes were proposed in this pull request?
Refer: SPARK-28160 / apache/spark#24964
`ByteBuffer.allocate` may throw an `OutOfMemoryError` when the response is large and not enough memory is available. When this happens, `TransportClient.sendRpcSync` hangs forever if the timeout is set to unlimited.
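Because `OutOfMemoryError` is an `Error` and not an `Exception`, a handler that only catches `Exception` would not see it. The following is a minimal sketch of that distinction, not the Celeborn code: `CompletableFuture` stands in for the result holder, `failingAllocate` is a hypothetical helper that simulates the allocation failure, and the `catch (Exception)` variant is purely illustrative since the pre-patch code is not shown here.

```java
import java.util.concurrent.CompletableFuture;

class ThrowableCatchSketch {

  // Hypothetical helper (assumption): simulates ByteBuffer.allocate failing
  // under memory pressure without actually exhausting the heap.
  static byte[] failingAllocate() {
    throw new OutOfMemoryError("simulated");
  }

  // Only catches Exception: the Error escapes and the future is never completed.
  static boolean completesWithCatchException() {
    CompletableFuture<byte[]> result = new CompletableFuture<>();
    try {
      try {
        result.complete(failingAllocate());
      } catch (Exception e) {
        result.completeExceptionally(e);
      }
    } catch (Throwable escaped) {
      // Models the calling thread logging the failure and moving on.
    }
    return result.isDone();
  }

  // Catches Throwable: the failure is recorded on the future, so a waiter is released.
  static boolean completesWithCatchThrowable() {
    CompletableFuture<byte[]> result = new CompletableFuture<>();
    try {
      result.complete(failingAllocate());
    } catch (Throwable t) {
      result.completeExceptionally(t);
    }
    return result.isDone();
  }

  public static void main(String[] args) {
    System.out.println("catch (Exception) completes the future: " + completesWithCatchException());
    System.out.println("catch (Throwable) completes the future: " + completesWithCatchThrowable());
  }
}
```

A future that is never completed is what leaves a synchronous caller blocked until its timeout, or indefinitely when no timeout applies.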
Why are the changes needed?
To catch the exception thrown by `ByteBuffer.allocate` in this corner case.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Quote the local test in apache/spark#24964
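The local test referenced above forces the allocation size to -1 so that `ByteBuffer.allocate` throws `IllegalArgumentException`. A rough, self-contained way to reproduce the same behavior outside a debugger is sketched below. It rests on several assumptions: `CompletableFuture` replaces the real result holder, `ResponseCallback` is a hypothetical stand-in for the RPC callback interface, the `size` parameter is injected instead of being read from `response.remaining()`, and a plain thread plays the role of the I/O thread that invokes the callback.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class CallbackHangSketch {

  // Hypothetical stand-in for the RPC response callback; not Celeborn's interface.
  interface ResponseCallback {
    void onSuccess(ByteBuffer response);
  }

  // No try/catch: a failure in allocate() never reaches the future.
  static ResponseCallback unguarded(CompletableFuture<ByteBuffer> result, int size) {
    return response -> {
      ByteBuffer copy = ByteBuffer.allocate(size); // throws if size is negative
      copy.put(response);
      copy.flip();
      result.complete(copy);
    };
  }

  // With try/catch (Throwable): any failure is forwarded to the waiting caller.
  static ResponseCallback guarded(CompletableFuture<ByteBuffer> result, int size) {
    return response -> {
      try {
        ByteBuffer copy = ByteBuffer.allocate(size);
        copy.put(response);
        copy.flip();
        result.complete(copy);
      } catch (Throwable t) {
        result.completeExceptionally(t);
      }
    };
  }

  // Runs the callback on a separate thread, then waits like a synchronous caller.
  static String callAndWait(boolean guard, int size, long timeoutMs) {
    CompletableFuture<ByteBuffer> result = new CompletableFuture<>();
    ResponseCallback callback = guard ? guarded(result, size) : unguarded(result, size);
    Thread io = new Thread(() -> {
      try {
        callback.onSuccess(ByteBuffer.wrap(new byte[] {1, 2, 3}));
      } catch (Throwable ignored) {
        // Models an I/O thread that swallows the failure; the waiter is not notified.
      }
    });
    io.start();
    try {
      io.join();
      ByteBuffer buf = result.get(timeoutMs, TimeUnit.MILLISECONDS);
      return "ok:" + buf.remaining();
    } catch (TimeoutException e) {
      return "timeout";
    } catch (ExecutionException e) {
      return "failed:" + e.getCause().getClass().getSimpleName();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return "interrupted";
    }
  }

  public static void main(String[] args) {
    System.out.println(callAndWait(false, -1, 200)); // unguarded callback, forced failure
    System.out.println(callAndWait(true, -1, 200));  // guarded callback, forced failure
    System.out.println(callAndWait(true, 3, 200));   // guarded callback, normal path
  }
}
```

In this sketch the unguarded call only returns because a 200 ms timeout is passed to `get`; with an unbounded wait it would block indefinitely, which matches the hang described in this pull request.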