-
Notifications
You must be signed in to change notification settings - Fork 7k
[Data] Add preemption test for DataOpTask and refactor test utilities
#57883
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Balaji Veeramani <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request is a solid improvement to the test suite for DataOpTask. The introduction of callback seams for testing preemption scenarios is a good pattern, and the refactoring of existing tests using the new create_stub_streaming_gen helper function significantly improves clarity and maintainability. The new preemption test is well-designed, especially with the consideration of disabling object inlining to properly test metadata fetching from a failed node. I've identified a couple of minor issues: a logical error in DataOpTask where a callback receives an incorrect argument, and a redundant line in the new test. Once these are addressed, this will be an excellent contribution.
python/ray/data/_internal/execution/interfaces/physical_operator.py
Outdated
Show resolved
Hide resolved
…or.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Balaji Veeramani <[email protected]>
| # Ray inlines small objects (including metadata) by storing them directly with | ||
| # the object reference itself rather than in the remote node's object store. | ||
| # Consequently, when the streaming executor calls `ray.get` on metadata from a | ||
| # node that has died, the call succeeds because the inlined metadata is not | ||
| # stored in the failed node's object store. To explicitly test the case where | ||
| # metadata resides in the object store (and becomes unavailable when the node | ||
| # dies), we disable inlining by setting the maximum inline size to 0. This | ||
| # simulates scenarios where metadata is too large to inline, which can occur in | ||
| # practice when schemas contain many fields. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@israbbani Can you confirm I'm not lying here
Signed-off-by: Balaji Veeramani <[email protected]>
…d-test Signed-off-by: Balaji Veeramani <[email protected]>
DataOpTask and refactor test utilities
Signed-off-by: Balaji Veeramani <[email protected]>
Signed-off-by: Balaji Veeramani <[email protected]>
Signed-off-by: Balaji Veeramani <[email protected]>
…es (ray-project#57883) This PR adds a test to verify that DataOpTask handles node failures correctly during execution. To enable this testing, callback seams are added to DataOpTask that allow tests to simulate preemption scenarios by killing and restarting nodes at specific points during task execution. ## Summary - Add callback seams (`block_ready_callback` and `metadata_ready_callback`) to `DataOpTask` for testing purposes - Add `has_finished` property to track task completion state - Create `create_stub_streaming_gen` helper function to simplify test setup - Refactor existing `DataOpTask` tests to use the new helper function - Add new parametrized test `test_on_data_ready_with_preemption` to verify behavior when nodes fail during execution ## Test plan - Existing tests pass with refactored code - New preemption test validates that `on_data_ready` handles node failures correctly by testing both block and metadata callback scenarios --------- Signed-off-by: Balaji Veeramani <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…es (ray-project#57883) This PR adds a test to verify that DataOpTask handles node failures correctly during execution. To enable this testing, callback seams are added to DataOpTask that allow tests to simulate preemption scenarios by killing and restarting nodes at specific points during task execution. ## Summary - Add callback seams (`block_ready_callback` and `metadata_ready_callback`) to `DataOpTask` for testing purposes - Add `has_finished` property to track task completion state - Create `create_stub_streaming_gen` helper function to simplify test setup - Refactor existing `DataOpTask` tests to use the new helper function - Add new parametrized test `test_on_data_ready_with_preemption` to verify behavior when nodes fail during execution ## Test plan - Existing tests pass with refactored code - New preemption test validates that `on_data_ready` handles node failures correctly by testing both block and metadata callback scenarios --------- Signed-off-by: Balaji Veeramani <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…es (ray-project#57883) This PR adds a test to verify that DataOpTask handles node failures correctly during execution. To enable this testing, callback seams are added to DataOpTask that allow tests to simulate preemption scenarios by killing and restarting nodes at specific points during task execution. ## Summary - Add callback seams (`block_ready_callback` and `metadata_ready_callback`) to `DataOpTask` for testing purposes - Add `has_finished` property to track task completion state - Create `create_stub_streaming_gen` helper function to simplify test setup - Refactor existing `DataOpTask` tests to use the new helper function - Add new parametrized test `test_on_data_ready_with_preemption` to verify behavior when nodes fail during execution ## Test plan - Existing tests pass with refactored code - New preemption test validates that `on_data_ready` handles node failures correctly by testing both block and metadata callback scenarios --------- Signed-off-by: Balaji Veeramani <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Aydin Abiar <[email protected]>
…es (ray-project#57883) This PR adds a test to verify that DataOpTask handles node failures correctly during execution. To enable this testing, callback seams are added to DataOpTask that allow tests to simulate preemption scenarios by killing and restarting nodes at specific points during task execution. ## Summary - Add callback seams (`block_ready_callback` and `metadata_ready_callback`) to `DataOpTask` for testing purposes - Add `has_finished` property to track task completion state - Create `create_stub_streaming_gen` helper function to simplify test setup - Refactor existing `DataOpTask` tests to use the new helper function - Add new parametrized test `test_on_data_ready_with_preemption` to verify behavior when nodes fail during execution ## Test plan - Existing tests pass with refactored code - New preemption test validates that `on_data_ready` handles node failures correctly by testing both block and metadata callback scenarios --------- Signed-off-by: Balaji Veeramani <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Future-Outlier <[email protected]>
This PR adds a test to verify that DataOpTask handles node failures correctly during execution. To enable this testing, callback seams are added to DataOpTask that allow tests to simulate preemption scenarios by killing and restarting nodes at specific points during task execution.
Summary
block_ready_callbackandmetadata_ready_callback) toDataOpTaskfor testing purposeshas_finishedproperty to track task completion statecreate_stub_streaming_genhelper function to simplify test setupDataOpTasktests to use the new helper functiontest_on_data_ready_with_preemptionto verify behavior when nodes fail during executionTest plan
on_data_readyhandles node failures correctly by testing both block and metadata callback scenarios