[Refactor] [Data]: test_limit_pushdown_conservative - split tests & fix ordering assumption #58746

ryankert01 · 2025-11-18T16:51:36Z

Description

Split multi-case test function

test_limit_pushdown_conservative → 10 separate tests (basic fusion, limit fusion reversed, multiple limit fusion, maprows, mapbatches, filter, project, sort, complex interweaved operations, and between two map operators)

Fixed ordering assumptions

Added check_ordering=False to union tests (blocks may interleave)
Added check_ordering=False to project test with override_num_blocks (parallel execution)
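
The effect of check_ordering=False can be sketched as an order-insensitive comparison of result rows. The helper below is a hypothetical stand-in written for illustration; `rows_match` and its parameters are assumed names, and this is not the actual implementation of `_check_valid_plan_and_result`.

```python
import json


def rows_match(actual, expected, check_ordering=True):
    """Hypothetical comparison helper; not Ray Data's implementation."""
    if check_ordering:
        # Strict mode: rows must appear in exactly the expected order.
        return actual == expected

    def canonical(rows):
        # Dicts are not orderable, so sort rows by a stable JSON serialization.
        return sorted(json.dumps(row, sort_keys=True) for row in rows)

    # Relaxed mode: same rows, any order.
    return canonical(actual) == canonical(expected)


expected = [{"id": 0}, {"id": 1}, {"id": 2}]
interleaved = [{"id": 1}, {"id": 0}, {"id": 2}]  # same rows, different block order

print(rows_match(interleaved, expected))                        # strict comparison
print(rows_match(interleaved, expected, check_ordering=False))  # order-insensitive
```

With a strict comparison, a pipeline whose blocks arrive in a different order fails even though it returned the right rows; the relaxed comparison still fails when the rows themselves differ.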

Related issues

Related to #58655

Additional information

gemini-code-assist

Code Review

This pull request is a great refactoring effort, splitting a large, multi-case test function into ten smaller, more focused tests. This significantly improves the readability and maintainability of the test suite. The fixes for ordering assumptions by adding check_ordering=False are also correct and well-justified for parallel execution and union operations. I've added a few suggestions to further enhance the refactoring by replacing helper identity functions with more concise lambda expressions.

gemini-code-assist · 2025-11-18T16:53:29Z

python/ray/data/tests/test_execution_optimizer_limit_pushdown.py

+    def f1(x):
+        return x
+
    ds = ray.data.range(100, override_num_blocks=100).map(f1).limit(1)
    _check_valid_plan_and_result(
        ds, "Read[ReadRange] -> Limit[limit=1] -> MapRows[Map(f1)]", [{"id": 0}]
    )


For conciseness and better readability, the f1 identity function can be replaced with an inline lambda x: x. This change also requires updating the expected plan string to use Map(<lambda>) instead of Map(f1).

Suggested change

-    def f1(x):
-        return x
-
-    ds = ray.data.range(100, override_num_blocks=100).map(f1).limit(1)
-    _check_valid_plan_and_result(
-        ds, "Read[ReadRange] -> Limit[limit=1] -> MapRows[Map(f1)]", [{"id": 0}]
-    )
+    ds = ray.data.range(100, override_num_blocks=100).map(lambda x: x).limit(1)
+    _check_valid_plan_and_result(
+        ds, "Read[ReadRange] -> Limit[limit=1] -> MapRows[Map(<lambda>)]", [{"id": 0}]
+    )
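
The reason the expected plan string has to change is that Python names every lambda `<lambda>`. The sketch below assumes the operator label is built from the callable's `__name__`, which the plan strings in this review suggest but do not confirm; `op_label` is a hypothetical helper, not a Ray Data function.

```python
def f1(x):
    return x


identity = lambda x: x  # noqa: E731 (assigned only so it can be inspected)


def op_label(fn):
    """Hypothetical: format a plan label from the callable's __name__."""
    return f"Map({fn.__name__})"


print(op_label(f1))        # Map(f1)
print(op_label(identity))  # Map(<lambda>)
```

One consequence of switching to lambdas is that every lambda-based operator gets the same label, so the plan string no longer identifies which function was applied.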

gemini-code-assist · 2025-11-18T16:53:30Z

python/ray/data/tests/test_execution_optimizer_limit_pushdown.py

+    def f2(x):
+        return x
+
    ds = ray.data.range(100, override_num_blocks=100).map_batches(f2).limit(1)
    _check_valid_plan_and_result(
        ds,
        "Read[ReadRange] -> Limit[limit=1] -> MapBatches[MapBatches(f2)]",
        [{"id": 0}],
    )


Similar to other tests, the f2 identity function can be replaced with a lambda x: x for better code conciseness. The expected plan string should be updated accordingly.

    ds = ray.data.range(100, override_num_blocks=100).map_batches(lambda x: x).limit(1)
    _check_valid_plan_and_result(
        ds,
        "Read[ReadRange] -> Limit[limit=1] -> MapBatches[MapBatches(<lambda>)]",
        [{"id": 0}],
    )

ryankert01 · 2025-11-19T01:01:39Z

PTAL @owenowenisme @bveeramani

bveeramani · 2025-11-19T02:08:30Z

python/ray/data/tests/test_execution_optimizer_limit_pushdown.py

-        return x
-
-    # Test 1: Basic Limit -> Limit fusion (should still work)
+def test_limit_pushdown_basic_limit_fusion(ray_start_regular_shared_2_cpus):


Except for the tests listed below, the Ray Data pipelines in this diff aren't guaranteed to produce rows in a specific output order. So, if you don't set check_ordering=False, the tests might raise false negatives.

Could you update the appropriate tests to use check_ordering=False?

List of tests with guaranteed order (okay to use check_ordering=True):

test_limit_pushdown_union_with_sort

test_limit_pushdown_complex_interweaved_operations

test_limit_pushdown_stops_at_sort
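
A toy model of why a trailing sort makes the output order safe to assert on. This is plain Python, not Ray Data: `collect` is a hypothetical function, and the seeded shuffle only stands in for blocks finishing in an arbitrary order.

```python
import random


def collect(blocks, seed, sort_by=None):
    """Toy model: concatenate blocks in arrival order, optionally sort rows."""
    rng = random.Random(seed)
    arrival = list(blocks)
    rng.shuffle(arrival)  # stand-in for nondeterministic block completion order
    rows = [row for block in arrival for row in block]
    if sort_by is not None:
        rows.sort(key=lambda row: row[sort_by])
    return rows


blocks = [[{"id": i}] for i in range(5)]
expected = [{"id": i} for i in range(5)]

for seed in range(3):
    unsorted_rows = collect(blocks, seed)
    sorted_rows = collect(blocks, seed, sort_by="id")
    # Sorted output always matches; unsorted output depends on arrival order.
    print(seed, sorted_rows == expected, unsorted_rows == expected)
```

In this model the sorted output is identical for every seed, while the unsorted output contains the same rows in an order that depends on the shuffle, which is the situation check_ordering=False is meant to tolerate.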

cursor

Bug: Missing check_ordering parameter in parametrized test (Bugbot Rules)

The test test_limit_pushdown_udf_modifying_row_count_with_map_batches calls _check_valid_plan_and_result without check_ordering=False, but the pipeline doesn't include a .sort() operation to guarantee row order. According to the reviewer's feedback, all tests without guaranteed order need check_ordering=False to avoid false negatives from non-deterministic row ordering in Ray Data pipelines.

python/ray/data/tests/test_execution_optimizer_limit_pushdown.py#L594-L599

https://github.com/ray-project/ray/blob/c17fca7a4eef301dca5ac4b3c3fa1006ab522a87/python/ray/data/tests/test_execution_optimizer_limit_pushdown.py#L594-L599

ryankert01 · 2025-11-22T01:37:16Z

@bveeramani Sorry for the late response. I walked through the code but still wasn’t able to conclude safely. I’ve gone ahead and updated them with check_ordering=False.

Signed-off-by: ryankert01 <[email protected]>

bveeramani

LGTM!

bveeramani · 2025-11-24T17:57:26Z

Just enabled auto-merge. ty for the contribution!

ryankert01 requested a review from a team as a code owner November 18, 2025 16:51

ryankert01 changed the title ~~[Refactor] [Data]: spilt tests & fix ordering assumption~~ [Refactor] [Data]: test_limit_pushdown_conservative - spilt tests & fix ordering assumption Nov 18, 2025

gemini-code-assist bot reviewed Nov 18, 2025

ray-gardener bot added the data (Ray Data-related issues) and community-contribution (Contributed by the community) labels Nov 18, 2025

bveeramani reviewed Nov 19, 2025

cursor bot reviewed Nov 22, 2025

ryankert01 requested a review from bveeramani November 23, 2025 05:38

ryankert01 force-pushed the refactor/test_execution_optimizer_limit_pushdown branch from 8865ad2 to 5099f4c Compare November 24, 2025 01:12

ryankert01 added 2 commits November 24, 2025 14:50

refactor: spilt tests & fix ordering assumption

1ce004c

Signed-off-by: ryankert01 <[email protected]>

add check_ordering=False

a7f5a06

Signed-off-by: ryankert01 <[email protected]>

ryankert01 force-pushed the refactor/test_execution_optimizer_limit_pushdown branch from 5099f4c to a7f5a06 Compare November 24, 2025 06:50

bveeramani approved these changes Nov 24, 2025

bveeramani enabled auto-merge (squash) November 24, 2025 17:57

github-actions bot added the go label (add ONLY when ready to merge, run all tests) Nov 24, 2025

bveeramani merged commit 03fd86f into ray-project:master Nov 24, 2025
8 checks passed
