Skip to content

Fix inconsistent ordering with offset and limit#25216

Merged
tdcmeehan merged 1 commit intoprestodb:masterfrom
xieandrew:fix-offset-limit-ordering
Jul 21, 2025
Merged

Fix inconsistent ordering with offset and limit#25216
tdcmeehan merged 1 commit intoprestodb:masterfrom
xieandrew:fix-offset-limit-ordering

Conversation

@xieandrew
Copy link
Member

@xieandrew xieandrew commented May 28, 2025

Description

Fixes inconsistent output order when using offset and limit together by passing through parent StreamPreferredProperties in AddLocalExchanges. This prevents visitLimit from overriding the parent orderSensitive field, which currently causes the loss of ordering. This change causes order-breaking Round Robin exchanges to be removed from the logical plan.

An implementation of visitOffset in PlanPrinter was also added so that debugging with the session property verbose_optimizer_info_enabled will not cause "not yet implemented" errors.

Motivation and Context

Fixes #25071.

Impact

The offset clause when used with limit and order by will now produce accurate results.

Test Plan

I ran the following test query both before and after this change on the HiveQueryRunner:
select c_customer_id from hive.tpcds.customer order by c_customer_id offset 1 limit 800;

Without this change, the query produces completely different output when retried multiple times. With this change, the query returns consistent output every time. I also compared the logical plans using explain (type logical) and the Round Robin exchanges were removed after this change.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* Fix inconsistent ordering with offset and limit

@xieandrew xieandrew requested a review from a team as a code owner May 28, 2025 16:44
@xieandrew xieandrew requested a review from jaystarshot May 28, 2025 16:44
Copy link
Contributor

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this change! Can we also add a unit test that uses the limit and offset clauses to verify that this change works? I think the test should

  1. Execute a query with order by, offset, and limit, then save the results.
  2. For N times, execute the query in (1). Verify the results are the same as in (1)

We need to ensure that the query runner's parallelism is > 1 in order to trigger the original bug, and (2) choose a suitable N such that without this change, the failure is triggered somewhat consistently.

@xieandrew
Copy link
Member Author

xieandrew commented May 28, 2025

I added the requested unit test for checking result consistency over 5 runs. The minimum limit value required is 256 as the bug needs multiple pages to trigger. This test passes with this change and fails without this change (failed 50/50 times with 5 query runs per test run, max query runs required to fail was 4). Not sure where to specify query runner parallelism but the bug can be reproduced already anyways. Let me know if I need to move this test to a different package/class.

Copy link
Contributor

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM- will wait for @aaneja to take a look before approving+merging. I couldn't come to an understanding of why the properties weren't propagated previously. It also doesn't seem to cause failures in our tests, so I am ok with this change.

@xieandrew xieandrew force-pushed the fix-offset-limit-ordering branch from 4d2efd9 to 666f006 Compare May 29, 2025 19:03
@xieandrew xieandrew requested a review from aaneja May 30, 2025 16:31
@xieandrew xieandrew force-pushed the fix-offset-limit-ordering branch from d129603 to cc351b0 Compare June 6, 2025 18:18
Copy link
Contributor

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but one request to move the test

@xieandrew xieandrew force-pushed the fix-offset-limit-ordering branch from cc351b0 to d454d9c Compare June 6, 2025 19:04
@xieandrew xieandrew force-pushed the fix-offset-limit-ordering branch from d454d9c to ec54f68 Compare June 6, 2025 19:06
@xieandrew xieandrew requested a review from ZacBlanco June 10, 2025 16:09
ZacBlanco
ZacBlanco previously approved these changes Jun 10, 2025
Copy link
Contributor

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's wait for @aaneja 's review

@xieandrew xieandrew added the from:IBM PR from IBM label Jun 17, 2025
@prestodb-ci prestodb-ci requested review from a team and namya28 and removed request for a team June 17, 2025 20:42
@xieandrew xieandrew force-pushed the fix-offset-limit-ordering branch from ec54f68 to 5fbe852 Compare June 25, 2025 16:34
@steveburnett
Copy link
Contributor

An implementation of visitOffset in PlanPrinter was also added so that debugging with the session property
verbose_optimizer_info_enabled will not cause "not yet implemented" errors.

I don't find verbose_optimizer_info_enabled in the Presto documentation, and I don't think it's appropriate to add that doc requirement to this PR.

Should I open a doc issue for the missing session property?

@xieandrew
Copy link
Member Author

I'm not sure what the policy is on documenting session properties is since there are a lot of missing ones, though this one is only used for debugging during development. Separately, it may be important to document the offset_clause_enabled session prop and offset-clause-enabled configuration prop since those are needed by the user to actually use the main feature this fix relates to. I only see those properties in a release note: https://prestodb.io/docs/current/release/release-0.257.html#general-changes

@steveburnett
Copy link
Contributor

I'm not sure what the policy is on documenting session properties is since there are a lot of missing ones, though this one is only used for debugging during development. Separately, it may be important to document the offset_clause_enabled session prop and offset-clause-enabled configuration prop since those are needed by the user to actually use the main feature this fix relates to. I only see those properties in a release note: https://prestodb.io/docs/current/release/release-0.257.html#general-changes

The policy is that they should all be documented but many never were. I am trying to address that as I notice individual gaps until I can plan how to catch up on all of them.

Thank you for the help you gave by identifying offset_clause_enabled session prop and offset-clause-enabled configuration properties! I will open a doc issue for these missing properties.

aaneja
aaneja previously approved these changes Jul 1, 2025
Copy link
Contributor

@aaneja aaneja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit changes. LGTM otherwise, since no tests failed

@xieandrew xieandrew force-pushed the fix-offset-limit-ordering branch from 8f041a5 to d0d441e Compare July 1, 2025 16:42
steveburnett added a commit that referenced this pull request Jul 2, 2025
…ies (#25463)

## Description
Add doc for `offset-clause-enabled` and `offset_clause_enabled`
configuration and session properties. Fixes #25453.

## Motivation and Context
In [a
comment](#25216 (comment))
on #25216, @xieandrew mentioned
_Separately, it may be important to document the offset_clause_enabled
session prop and offset-clause-enabled configuration prop since those
are needed by the user to actually use the main feature this fix relates
to. I only see those properties in a release note:
https://prestodb.io/docs/current/release/release-0.257.html#general-changes_.

## Impact
Documentation. 

## Test Plan
Local doc builds. Screenshots: 

Configuration property `offset-clause-enabled`
<img width="631" alt="Screenshot 2025-06-30 at 3 12 34 PM"
src="https://github.com/user-attachments/assets/a371db01-228d-4690-99ae-b8b5bd0a4fdd"
/>

Session property `offset_clause_enabled`
<img width="632" alt="Screenshot 2025-06-30 at 3 12 39 PM"
src="https://github.com/user-attachments/assets/7962ee37-a77b-4de5-ae50-37a2ad4600e5"
/>

## Contributor checklist

- [x] Please make sure your submission complies with our [contributing
guide](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md),
in particular [code
style](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#code-style)
and [commit
standards](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#commit-standards).
- [x] PR description addresses the issue accurately and concisely. If
the change is non-trivial, a GitHub Issue is referenced.
- [x] Documented new properties (with its default value), SQL syntax,
functions, or other functionality.
- [x] If release notes are required, they follow the [release notes
guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines).
- [x] Adequate tests were added if applicable.
- [x] CI passed.

## Release Notes
```
== NO RELEASE NOTE ==
```
@xieandrew xieandrew requested a review from ZacBlanco July 2, 2025 16:15
@xieandrew
Copy link
Member Author

@ZacBlanco could I get a final review on this?

Copy link
Contributor

@tdcmeehan tdcmeehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch.

@tdcmeehan tdcmeehan merged commit bffed75 into prestodb:master Jul 21, 2025
112 of 113 checks passed
@prestodb-ci prestodb-ci mentioned this pull request Jul 28, 2025
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

from:IBM PR from IBM

Projects

None yet

Development

Successfully merging this pull request may close these issues.

OFFSET and LIMIT combination gives inconsistent output

5 participants