[Data] Add preemption test for `DataOpTask` and refactor test utilities #57883

bveeramani · 2025-10-18T20:23:49Z

This PR adds a test to verify that DataOpTask handles node failures correctly during execution. To enable this testing, callback seams are added to DataOpTask that allow tests to simulate preemption scenarios by killing and restarting nodes at specific points during task execution.

Summary

- Add callback seams (`block_ready_callback` and `metadata_ready_callback`) to `DataOpTask` for testing purposes
- Add `has_finished` property to track task completion state
- Create `create_stub_streaming_gen` helper function to simplify test setup
- Refactor existing `DataOpTask` tests to use the new helper function
- Add new parametrized test `test_on_data_ready_with_preemption` to verify behavior when nodes fail during execution
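
For readers unfamiliar with the "callback seam" pattern, here is a minimal sketch of the idea. It is not the actual `DataOpTask` code: `ToyOpTask`, `make_stub_stream`, the callback signatures, and the return value of `on_data_ready` are all assumptions made for illustration. The point is only that the callbacks default to no-ops in production and give a test a precise place to run its own code.

```python
from typing import Callable, Iterator, List, Tuple

Block = List[int]
Metadata = dict


def make_stub_stream(block_sizes: List[int]) -> Iterator[Tuple[Block, Metadata]]:
    """Hypothetical stub: yields (block, metadata) pairs without any cluster."""
    for size in block_sizes:
        yield list(range(size)), {"num_rows": size}


class ToyOpTask:
    """Toy stand-in for a streaming task. Not Ray's DataOpTask."""

    def __init__(
        self,
        stream: Iterator[Tuple[Block, Metadata]],
        # Seams: no-ops by default, overridden only by tests (assumed signatures).
        block_ready_callback: Callable[[Block], None] = lambda block: None,
        metadata_ready_callback: Callable[[Metadata], None] = lambda meta: None,
    ) -> None:
        self._stream = stream
        self._block_ready_callback = block_ready_callback
        self._metadata_ready_callback = metadata_ready_callback
        self._has_finished = False

    @property
    def has_finished(self) -> bool:
        """True once the underlying stream has been fully consumed."""
        return self._has_finished

    def on_data_ready(self) -> int:
        """Consume everything available and return the number of blocks seen."""
        consumed = 0
        for block, metadata in self._stream:
            self._block_ready_callback(block)
            self._metadata_ready_callback(metadata)
            consumed += 1
        self._has_finished = True
        return consumed


# Usage: a test records the order in which the seams fire.
events = []
task = ToyOpTask(
    make_stub_stream([2, 3]),
    block_ready_callback=lambda block: events.append(("block", len(block))),
    metadata_ready_callback=lambda meta: events.append(("metadata", meta["num_rows"])),
)
finished_before = task.has_finished
consumed = task.on_data_ready()
```

Keeping the seams as constructor arguments with no-op defaults means production callers never need to know they exist.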

Test plan

- Existing tests pass with refactored code
- New preemption test validates that `on_data_ready` handles node failures correctly by testing both block and metadata callback scenarios
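
The shape of a test that covers both callback scenarios can be sketched without a cluster. Everything below is hypothetical: `ToyNode`, `run_toy_task`, and the recovery strategy (re-run the producer and retry the fetch once) are assumptions chosen to keep the example small, and they do not describe how `on_data_ready` or Ray actually recover. In pytest the final loop would typically be a `pytest.mark.parametrize` over the two scenario names.

```python
from typing import Callable, Dict, List, Tuple


class NodeLost(Exception):
    """Raised when an object is requested from a store that no longer has it."""


class ToyNode:
    """Hypothetical node; a restart wipes everything it was holding."""

    def __init__(self) -> None:
        self._objects: Dict[str, str] = {}

    def store(self, key: str, value: str) -> None:
        self._objects[key] = value

    def fetch(self, key: str) -> str:
        if key not in self._objects:
            raise NodeLost(key)
        return self._objects[key]

    def kill_and_restart(self) -> None:
        self._objects.clear()


def run_toy_task(
    node: ToyNode,
    num_blocks: int,
    block_ready_callback: Callable[[], None],
    metadata_ready_callback: Callable[[], None],
) -> List[Tuple[str, str]]:
    """Produce and consume outputs, recomputing one if its node was lost."""

    def produce(i: int) -> None:
        node.store(f"block-{i}", f"rows-{i}")
        node.store(f"meta-{i}", f"schema-{i}")

    def fetch(key: str, i: int) -> str:
        try:
            return node.fetch(key)
        except NodeLost:
            produce(i)  # assumed recovery: re-run the producer, retry once
            return node.fetch(key)

    outputs = []
    for i in range(num_blocks):
        produce(i)
        block_ready_callback()  # seam: a test may kill the node here
        block = fetch(f"block-{i}", i)
        metadata_ready_callback()  # seam: or here
        metadata = fetch(f"meta-{i}", i)
        outputs.append((block, metadata))
    return outputs


def check_preemption(preempt_on: str) -> List[Tuple[str, str]]:
    """Run the toy task, killing the node once at the chosen seam."""
    node = ToyNode()
    fired: List[bool] = []

    def kill_once() -> None:
        if not fired:
            fired.append(True)
            node.kill_and_restart()

    def noop() -> None:
        pass

    return run_toy_task(
        node,
        num_blocks=2,
        block_ready_callback=kill_once if preempt_on == "block" else noop,
        metadata_ready_callback=kill_once if preempt_on == "metadata" else noop,
    )


expected = [("rows-0", "schema-0"), ("rows-1", "schema-1")]
results = {name: check_preemption(name) for name in ("block", "metadata")}
```

Both scenarios should end with the same complete output, which is the property a preemption test of this kind asserts.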

Signed-off-by: Balaji Veeramani <[email protected]>

gemini-code-assist

Code Review

This pull request is a solid improvement to the test suite for DataOpTask. The introduction of callback seams for testing preemption scenarios is a good pattern, and the refactoring of existing tests using the new create_stub_streaming_gen helper function significantly improves clarity and maintainability. The new preemption test is well-designed, especially with the consideration of disabling object inlining to properly test metadata fetching from a failed node. I've identified a couple of minor issues: a logical error in DataOpTask where a callback receives an incorrect argument, and a redundant line in the new test. Once these are addressed, this will be an excellent contribution.


bveeramani · 2025-10-18T20:26:48Z

python/ray/data/tests/test_streaming_executor.py

+        # Ray inlines small objects (including metadata) by storing them directly with
+        # the object reference itself rather than in the remote node's object store.
+        # Consequently, when the streaming executor calls `ray.get` on metadata from a
+        # node that has died, the call succeeds because the inlined metadata is not
+        # stored in the failed node's object store. To explicitly test the case where
+        # metadata resides in the object store (and becomes unavailable when the node
+        # dies), we disable inlining by setting the maximum inline size to 0. This
+        # simulates scenarios where metadata is too large to inline, which can occur in
+        # practice when schemas contain many fields.


@israbbani Can you confirm I'm not lying here
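
To make the reasoning in the comment above concrete, here is a toy model of inlining. It is a sketch under stated assumptions, not Ray's object store: `ToyCluster`, `ToyRef`, and the `max_inline_size` parameter are hypothetical names, and the strict `<` threshold is chosen so that a value of 0 disables inlining entirely.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class ObjectLostError(Exception):
    """Raised when a value lived only in the store of a node that died."""


@dataclass
class ToyRef:
    key: str
    node_id: str
    inlined_value: Optional[bytes] = None  # set only for small values


class ToyCluster:
    """Hypothetical cluster with one in-memory object store per node."""

    def __init__(self, max_inline_size: int) -> None:
        self.max_inline_size = max_inline_size
        self._stores: Dict[str, Dict[str, bytes]] = {}
        self._next_id = 0

    def add_node(self, node_id: str) -> None:
        self._stores[node_id] = {}

    def kill_node(self, node_id: str) -> None:
        del self._stores[node_id]

    def put(self, node_id: str, value: bytes) -> ToyRef:
        self._next_id += 1
        key = f"obj-{self._next_id}"
        if len(value) < self.max_inline_size:
            # Small value: it travels with the reference itself.
            return ToyRef(key, node_id, inlined_value=value)
        # Large value: it lives only in the producing node's store.
        self._stores[node_id][key] = value
        return ToyRef(key, node_id)

    def get(self, ref: ToyRef) -> bytes:
        if ref.inlined_value is not None:
            return ref.inlined_value
        store = self._stores.get(ref.node_id)
        if store is None or ref.key not in store:
            raise ObjectLostError(ref.key)
        return store[ref.key]


def get_metadata_after_node_death(max_inline_size: int) -> bytes:
    """Put metadata on a node, kill the node, then try to read it back."""
    cluster = ToyCluster(max_inline_size)
    cluster.add_node("worker")
    ref = cluster.put("worker", b"schema: a, b, c")
    cluster.kill_node("worker")
    return cluster.get(ref)


# With inlining, the read survives the node's death.
survived = get_metadata_after_node_death(max_inline_size=1024)

# With inlining disabled, the same read fails.
try:
    get_metadata_after_node_death(max_inline_size=0)
    lost = False
except ObjectLostError:
    lost = True
```

In this model a test that leaves inlining enabled would never exercise the failure path for small metadata, which is the motivation the comment gives for setting the threshold to 0.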

…es (ray-project#57883)

This PR adds a test to verify that DataOpTask handles node failures correctly during execution. To enable this testing, callback seams are added to DataOpTask that allow tests to simulate preemption scenarios by killing and restarting nodes at specific points during task execution.

## Summary

- Add callback seams (`block_ready_callback` and `metadata_ready_callback`) to `DataOpTask` for testing purposes
- Add `has_finished` property to track task completion state
- Create `create_stub_streaming_gen` helper function to simplify test setup
- Refactor existing `DataOpTask` tests to use the new helper function
- Add new parametrized test `test_on_data_ready_with_preemption` to verify behavior when nodes fail during execution

## Test plan

- Existing tests pass with refactored code
- New preemption test validates that `on_data_ready` handles node failures correctly by testing both block and metadata callback scenarios

---------

Signed-off-by: Balaji Veeramani <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Initial commit

cece106

Signed-off-by: Balaji Veeramani <[email protected]>

bveeramani requested a review from a team as a code owner October 18, 2025 20:23

gemini-code-assist bot reviewed Oct 18, 2025

python/ray/data/_internal/execution/interfaces/physical_operator.py (outdated)

python/ray/data/tests/test_streaming_executor.py (outdated)

bveeramani assigned israbbani Oct 18, 2025

Update python/ray/data/_internal/execution/interfaces/physical_operat…

91b8e46

…or.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Balaji Veeramani <[email protected]>

bveeramani commented Oct 18, 2025

bveeramani added 2 commits October 18, 2025 13:28

Fix bug

8fbc515

Signed-off-by: Balaji Veeramani <[email protected]>

Merge branch 'add-test' of https://github.com/ray-project/ray into ad…

e0ff24a

…d-test Signed-off-by: Balaji Veeramani <[email protected]>

bveeramani changed the title ~~Add preemption test for DataOpTask and refactor test utilities~~ [Data] Add preemption test for DataOpTask and refactor test utilities Oct 18, 2025

bveeramani added the `go` label (add ONLY when ready to merge, run all tests) Oct 18, 2025

bveeramani assigned iamjustinhsu Oct 18, 2025

Merge branch 'master' into add-test

368a86a

ray-gardener bot added the `data` label (Ray Data-related issues) Oct 19, 2025

bveeramani marked this pull request as draft October 20, 2025 16:28

Fix flakiness

41f90b6

Signed-off-by: Balaji Veeramani <[email protected]>

bveeramani marked this pull request as ready for review October 20, 2025 17:16

iamjustinhsu reviewed Oct 20, 2025

python/ray/data/tests/test_streaming_executor.py

python/ray/data/_internal/execution/interfaces/physical_operator.py

Address review comments

2b87518

Signed-off-by: Balaji Veeramani <[email protected]>

bveeramani requested a review from a team as a code owner October 21, 2025 18:54

iamjustinhsu approved these changes Oct 21, 2025

python/ray/tests/conftest.py (outdated)

Remove diff

49387b0

Signed-off-by: Balaji Veeramani <[email protected]>

bveeramani enabled auto-merge (squash) October 22, 2025 00:00

bveeramani merged commit 222c22e into master Oct 22, 2025
7 checks passed

bveeramani deleted the add-test branch October 22, 2025 01:11
