Skip to content

Fix PushdownSort dropping LIMIT when source returns Exact#153

Merged
sgrebnov merged 1 commit into
spiceai-52.5from
sgrebnov/0416-improve-sort-pushdown
Apr 16, 2026
Merged

Fix PushdownSort dropping LIMIT when source returns Exact#153
sgrebnov merged 1 commit into
spiceai-52.5from
sgrebnov/0416-improve-sort-pushdown

Conversation

@sgrebnov
Copy link
Copy Markdown

@sgrebnov sgrebnov commented Apr 16, 2026

Which issue does this PR close?

When PushdownSort removes a SortExec because the source returns Exact (guaranteeing ordering), any fetch (LIMIT) on the SortExec was silently dropped, causing queries like SELECT * FROM t ORDER BY a LIMIT 1 to return all rows.

LimitPushdown runs before PushdownSort and merges GlobalLimitExec into SortExec.fetch. When PushdownSort later removes the SortExec in the Exact branch, the fetch value is lost

What changes are included in this PR?

In the Exact branch of PushdownSort, propagate fetch by first trying with_fetch() on the source (allowing sources like SQL-backed scans to add LIMIT to their query), falling back to wrapping with GlobalLimitExec.

Alternative considered

An alternative approach would be to pass the limit directly as a parameter to try_pushdown_sort(), allowing sources to incorporate both ORDER BY and LIMIT in plan if supported. This was rejected because try_pushdown_sort has ~30 implementations and is intentionally single-purpose (ordering only). Handling fetch separately via with_fetch() / GlobalLimitExec follows the existing composition pattern used by EnforceSorting and keeps the try_pushdown_sort API focused.

Copilot AI review requested due to automatic review settings April 16, 2026 11:28
@sgrebnov sgrebnov self-assigned this Apr 16, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a correctness issue in the physical optimizer where PushdownSort could eliminate a SortExec (when the source reports SortOrderPushdownResult::Exact) but accidentally drop SortExec.fetch (LIMIT), changing results for queries like ORDER BY ... LIMIT N.

Changes:

  • Preserve SortExec.fetch in the Exact pushdown path by attempting with_fetch() on the pushed-down source, with a fallback to adding a limit operator.
  • Add an ExactTestScan test helper that returns Exact from try_pushdown_sort and supports with_fetch.
  • Add unit tests asserting fetch preservation (and no-op behavior when fetch is absent) for the Exact path.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File Description
datafusion/physical-optimizer/src/pushdown_sort.rs Propagates fetch when eliminating SortExec in the Exact branch, using with_fetch or a limit wrapper.
datafusion/core/tests/physical_optimizer/test_utils.rs Introduces ExactTestScan to simulate an Exact-ordering source and support fetch pushdown in tests.
datafusion/core/tests/physical_optimizer/pushdown_sort.rs Adds snapshot tests validating Exact sort elimination preserves fetch and avoids introducing a limit when fetch is absent.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread datafusion/physical-optimizer/src/pushdown_sort.rs
Comment thread datafusion/physical-optimizer/src/pushdown_sort.rs
Comment thread datafusion/physical-optimizer/src/pushdown_sort.rs
Comment thread datafusion/physical-optimizer/src/pushdown_sort.rs Outdated
Comment thread datafusion/core/tests/physical_optimizer/test_utils.rs Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants