[core] Make PinObjectIDs RPC Fault Tolerant #56443
Signed-off-by: joshlee <[email protected]>
Code Review
This pull request makes the PinObjectIDs RPC fault-tolerant by switching to a retryable RPC client. To support this, new unit tests have been added to verify the idempotency of the HandlePinObjectIDs handler. The changes look correct and well-tested. My main feedback is to refactor the new tests to reduce code duplication.
| buffers.second.size())}; | ||
| object_buffers->emplace_back(shm_buffer); | ||
| } else { | ||
| object_buffers->emplace_back(plasma::ObjectBuffer{}); |
Why change the fake plasma client to emplace a buffer even when the object isn't there?
The prod version does this as well:
ray/src/ray/object_manager/plasma/client.cc
Line 631 in 0c62bdb
| *out = std::vector<ObjectBuffer>(num_objects); |
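It sizes the output vector to num_objects up front and then skips the indices whose objects are not present, so those entries stay default-constructed. The fake client did not emplace anything for missing objects, so its result count differed from prod, which caused the RAY_CHECK failure here: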
ray/src/ray/raylet/node_manager.cc
Line 2421 in 0c62bdb
| RAY_CHECK_EQ(object_ids.size(), results.size()); |
TL;DR: the caller uses these default-constructed placeholder buffers to detect which objects are not in plasma.
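For illustration, here is a minimal, self-contained sketch of the placeholder pattern described above. It is not Ray's code: `FakeObjectBuffer`, `FakeStore`, and `FindMissing` are hypothetical stand-ins, and treating a null `data` pointer as "missing" is an assumption made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for plasma::ObjectBuffer. A default-constructed
// value (null data) marks an object that is not in the store.
struct FakeObjectBuffer {
  const uint8_t *data = nullptr;
  size_t size = 0;
};

// Hypothetical fake store. Get() returns exactly one entry per requested ID.
class FakeStore {
 public:
  void Put(const std::string &id, std::vector<uint8_t> bytes) {
    objects_[id] = std::move(bytes);
  }

  std::vector<FakeObjectBuffer> Get(const std::vector<std::string> &ids) const {
    std::vector<FakeObjectBuffer> out;
    out.reserve(ids.size());
    for (const auto &id : ids) {
      FakeObjectBuffer buffer;  // stays a placeholder if the ID is unknown
      auto it = objects_.find(id);
      if (it != objects_.end()) {
        buffer.data = it->second.data();
        buffer.size = it->second.size();
      }
      out.push_back(buffer);
    }
    return out;
  }

 private:
  std::unordered_map<std::string, std::vector<uint8_t>> objects_;
};

// Caller side: relies on results lining up index-by-index with the IDs,
// which is the invariant the size check guards.
std::vector<std::string> FindMissing(const FakeStore &store,
                                     const std::vector<std::string> &ids) {
  std::vector<FakeObjectBuffer> results = store.Get(ids);
  assert(ids.size() == results.size());
  std::vector<std::string> missing;
  for (size_t i = 0; i < ids.size(); ++i) {
    if (results[i].data == nullptr) {
      missing.push_back(ids[i]);
    }
  }
  return missing;
}
```

If `Get` skipped missing IDs instead of pushing a placeholder, the size check in `FindMissing` would fail as soon as one requested object was absent.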
Why do we need the change for this test though?
The PinObjectIDs test also covers the unhappy path, where we try to pin an object that doesn't exist. That path triggers the RAY_CHECK in node_manager.cc that I linked above.
Making PinObjectIDs RPC fault tolerant. Added cpp unit tests to verify idempotency. --------- Signed-off-by: joshlee <[email protected]> Signed-off-by: Zhiqiang Ma <[email protected]>
Why are these changes needed?
This PR makes the PinObjectIDs RPC fault tolerant and adds C++ unit tests to verify that the handler is idempotent.
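To make the idempotency requirement concrete, here is a small sketch of why a retried pin request should be safe. It assumes a simplified model, not the actual raylet handler: `PinTable` and `HandlePin` are hypothetical names, and the per-object success flags are an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical model of a node that pins locally available objects.
class PinTable {
 public:
  explicit PinTable(std::unordered_set<std::string> local_objects)
      : local_objects_(std::move(local_objects)) {}

  // Returns one success flag per requested ID. Pinning an object that is
  // already pinned is a no-op, so replaying the same request after a lost
  // reply yields the same flags and leaves the state unchanged.
  std::vector<bool> HandlePin(const std::vector<std::string> &ids) {
    std::vector<bool> successes;
    successes.reserve(ids.size());
    for (const auto &id : ids) {
      if (local_objects_.count(id) == 0) {
        successes.push_back(false);  // object is not available locally
        continue;
      }
      pinned_.insert(id);  // set insertion ignores duplicates
      successes.push_back(true);
    }
    return successes;
  }

  size_t NumPinned() const { return pinned_.size(); }

 private:
  std::unordered_set<std::string> local_objects_;
  std::unordered_set<std::string> pinned_;
};
```

A test in this style sends the same request twice and checks that both replies match and that the object is pinned only once, covering both an existing and a missing object.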