-
Notifications
You must be signed in to change notification settings - Fork 7k
[xray] Implements ray.wait #2162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 1 commit
Commits
Show all changes
49 commits
Select commit
Hold shift + click to select a range
0e18ca7
Use pubsub instead of timeout.
elibol fc2572c
Correct status message.
elibol 2e8af60
eric's feedback!
elibol b57a548
Changes from Stephanie's review.
elibol a128698
object directory changes for ray.wait.
elibol fa5c32d
Merge branch 'master' into om_pubsub
elibol 0ccf46b
Merge branch 'master' into om_pubsub
elibol b02de4f
Merge branch 'om_pubsub' into om_wait
elibol f9a9e16
wait without testing or timeout=0.
elibol 15b7f61
Handle remaining cases for wait.
elibol a22263b
linting
elibol 8ab41f0
added tests of om wait imp.
elibol 98bacfa
add local test.
elibol d518a89
Merge branch 'master' into om_wait
elibol 53f33e0
plasma imp.
elibol 8ef35f7
block worker as with pull.
elibol 6e10f9e
local scheduler implementation of wait.
elibol 9a95c65
with passing tests.
elibol aa12bd7
minor adjustments.
elibol 9e1602d
Merge branch 'master' into om_wait_local_scheduler
elibol 304b39c
handle return statuses.
elibol 5d63bb3
enable more tests.
elibol cf1fdb2
add test for existing num_returns semantics, and maintain existing nu…
elibol 531d024
move error handling to both code paths.
elibol d0d3ea4
implementing another round of feedback.
elibol 62ae832
Comment on OM tests.
elibol 67eef67
remove check for length zero list.
elibol 0796a17
remove elapsed.
elibol dd9f0db
Preserve input/output order.
elibol 9d4ed2b
debias local objects.
elibol 541b88c
Merge branch 'master' into om_wait_local_scheduler
elibol 58af739
use common helper function in object directory.
elibol d9ef29b
updated documentation
elibol fa1928b
linting.
elibol d41b1d0
handle return status.
elibol aeaab5b
simplify order preservation test + fix valgrind test error.
elibol 048f45f
update name of final Lookup callback.
elibol 0aa7525
Merge branch 'master' into om_wait_local_scheduler
elibol 833939f
linting
elibol 8e1947c
c++ style casting.
elibol 83d04dd
linting.
elibol 080282f
linting.
elibol a58f5c9
incorporate second round of feedback.
elibol c6d8ba5
correct python tests.
elibol 7d8d756
test comments.
elibol 6b6e2f3
incorporate reviews.
elibol 3a86c93
Fixes with regression tests.
elibol 1a99f25
update documentation.
elibol 00eafd7
reference to avoid copy.
elibol File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -377,7 +377,7 @@ ray::Status ObjectManager::Wait(const std::vector<ObjectID> &object_ids, | |
| wait_state.object_id_order = object_ids; | ||
| wait_state.timeout_ms = timeout_ms; | ||
| wait_state.num_required_objects = num_required_objects; | ||
| for (auto &object_id : object_ids) { | ||
| for (const auto &object_id : object_ids) { | ||
| if (local_objects_.count(object_id) > 0) { | ||
| wait_state.found.insert(object_id); | ||
| } else { | ||
|
|
@@ -393,7 +393,7 @@ ray::Status ObjectManager::Wait(const std::vector<ObjectID> &object_ids, | |
| // we obtain information about all given objects, regardless of their location. | ||
| // This is required to ensure we do not bias returning locally available objects | ||
| // as ready whenever Wait is invoked with a mixture of local and remote objects. | ||
| for (auto &object_id : wait_state.remaining) { | ||
| for (const auto &object_id : wait_state.remaining) { | ||
| // Lookup remaining objects. | ||
| wait_state.requested_objects.insert(object_id); | ||
| RAY_CHECK_OK(object_directory_->LookupLocations( | ||
|
|
@@ -422,7 +422,7 @@ void ObjectManager::AllWaitLookupsComplete(const UniqueID &wait_id) { | |
| WaitComplete(wait_id); | ||
| } else { | ||
| // Subscribe to objects in order to ensure Wait-related tests are deterministic. | ||
| for (auto &object_id : wait_state.object_id_order) { | ||
| for (const auto &object_id : wait_state.object_id_order) { | ||
| if (wait_state.remaining.count(object_id) == 0) { | ||
| continue; | ||
| } | ||
|
|
@@ -470,7 +470,7 @@ void ObjectManager::WaitComplete(const UniqueID &wait_id) { | |
| RAY_CHECK(wait_state.timeout_ms > 0 || wait_state.timeout_ms == -1); | ||
| } | ||
| // Unsubscribe to any objects that weren't found in the time allotted. | ||
| for (auto &object_id : wait_state.requested_objects) { | ||
| for (const auto &object_id : wait_state.requested_objects) { | ||
| RAY_CHECK_OK(object_directory_->UnsubscribeObjectLocations(wait_id, object_id)); | ||
| } | ||
| // Cancel the timer. This is okay even if the timer hasn't been started. | ||
|
|
@@ -480,7 +480,7 @@ void ObjectManager::WaitComplete(const UniqueID &wait_id) { | |
| // Order objects according to input order. | ||
| std::vector<ObjectID> found; | ||
| std::vector<ObjectID> remaining; | ||
| for (auto item : wait_state.object_id_order) { | ||
| for (const auto item : wait_state.object_id_order) { | ||
|
||
| if (found.size() < wait_state.num_required_objects && | ||
| wait_state.found.count(item) > 0) { | ||
| found.push_back(item); | ||
|
|
||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bug that I described in the earlier comment is still an issue here.
wait_stateis a reference to the value atactive_wait_requests_. The reference will become invalid if the entry is erased fromactive_wait_requests_between iterations of this for loop, so this line can produce undefined behavior.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the relevant check that corrects the bug you described earlier:
active_wait_requests_.find(wait_id) == active_wait_requests_.end()There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I don't think it will break in that particular way anymore, but undefined behavior is still possible because of the reference to
wait_state. Same underlying issue, but it will break at a different line.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch. I've made the fix and added a regression test.