Avoid planning unnecessary LIMIT/TopN/Sort/DistinctLimit by fgwang7w · Pull Request #14915 · prestodb/presto

fgwang7w · 2020-07-29T05:53:23Z

Cherry-pick of trinodb/trino#441 and trinodb/trino#818

== RELEASE NOTES ==

General Changes
* Avoid planning unnecessary LIMIT/TopN/Sort/DistinctLimit when the relation is known to produce a single row or fewer rows than requested
* The analyzer will emit a warning if a redundant ORDER BY is present
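
The first release note can be illustrated with a toy model. The sketch below is written in Python rather than Presto's Java, and every name in it (`Cardinality`, `limit_is_redundant`, `sort_is_redundant`) is hypothetical. It assumes only that the planner can derive lower and upper bounds on the number of rows a relation produces, and it covers just the LIMIT and Sort cases.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cardinality:
    """Hypothetical bounds on the number of rows a relation can produce."""
    low: int
    high: Optional[int] = None  # None means "no known upper bound"

    def is_at_most(self, n: int) -> bool:
        # True only when an upper bound is known and does not exceed n.
        return self.high is not None and self.high <= n

    def is_at_most_scalar(self) -> bool:
        return self.is_at_most(1)


def limit_is_redundant(source: Cardinality, count: int) -> bool:
    """LIMIT count changes nothing if the source never exceeds count rows."""
    return source.is_at_most(count)


def sort_is_redundant(source: Cardinality) -> bool:
    """Sorting zero or one rows cannot change the result."""
    return source.is_at_most_scalar()


# A global aggregation (no GROUP BY) always yields exactly one row.
single_row = Cardinality(low=1, high=1)
# A plain table scan has no known upper bound in this toy model.
unbounded = Cardinality(low=0)
```

With these definitions, `limit_is_redundant(single_row, 10)` holds while `limit_is_redundant(unbounded, 10)` does not, which is the distinction the release note draws.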

fixes #14897

fgwang7w · 2020-07-30T01:04:58Z

@mbasmanova @tdcmeehan this commit is ready for review. thank you!

fgwang7w · 2020-08-19T18:53:52Z

Hi @rschlussel could you please help review this pr? this item is perf-related, thank you!

rschlussel

I recommend adding a new rule to replace any input that isAtMost(0) with a values node, and then removing the specialized logic here for 0 input.

This new rule can also replace the EvaluateZeroSample rule if you add SampleNode to the CardinalityExtractorPlanVisitor, and have it return zero for 0% sample (and otherwise, atLeast 0L).
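
As a rough illustration of this suggestion, the sketch below models a tiny plan tree in Python. The node classes and the functions `extract_cardinality` and `replace_if_empty` are hypothetical stand-ins, not the real CardinalityExtractorPlanVisitor or any Presto rule. The toy also ignores output columns, which a real replacement values node would presumably need to carry.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Values:
    rows: tuple


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Sample:
    source: "Plan"
    ratio: float


@dataclass(frozen=True)
class Limit:
    source: "Plan"
    count: int


Plan = Union[Values, Scan, Sample, Limit]
Bounds = Tuple[int, Optional[int]]  # (low, high); high=None means unbounded


def extract_cardinality(node: Plan) -> Bounds:
    """Derive (low, high) row-count bounds for a toy plan node."""
    if isinstance(node, Values):
        n = len(node.rows)
        return (n, n)
    if isinstance(node, Sample):
        # Per the review comment: a 0% sample yields exactly zero rows,
        # anything else is only known to yield at least zero rows.
        return (0, 0) if node.ratio == 0 else (0, None)
    if isinstance(node, Limit):
        low, high = extract_cardinality(node.source)
        new_high = node.count if high is None else min(high, node.count)
        return (min(low, node.count), new_high)
    return (0, None)  # e.g. Scan: nothing is known


def replace_if_empty(node: Plan) -> Plan:
    """Swap any subplan that can produce at most zero rows for empty values."""
    _, high = extract_cardinality(node)
    return Values(rows=()) if high == 0 else node
```

In this model a single check on the upper bound handles both `Sample(..., 0.0)` and `Limit(..., 0)`, which is why one generic rule could stand in for node-specific zero handling.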

rschlussel · 2020-08-19T19:27:22Z

...o-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/RemoveRedundantSort.java

When would this be true? This doesn't seem related to sort nodes. Maybe it should be its own rule to replace subplans that would return zero rows with an empty values node.

So the new rule is EvaluateZeroCount which evaluates zeroSample, zero-topN, zero-limit, and zero-distinctLimit count.

don't need this if block anymore.

rschlussel · 2020-08-19T19:27:50Z

...o-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/RemoveRedundantSort.java

if we change the isAtMost(0) rule to be separate, then this should be isAtMostScalar()

Since the new rule is not implemented for a particular PlanNode, my take is to leave this code as it is, looking up from the tree; it can be verified by testForZeroCardinality.

rschlussel · 2020-08-19T19:28:30Z

...o-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/RemoveRedundantTopN.java

same comment as above regarding zero and scalar

this check is moved to the new rule now

rschlussel · 2020-08-19T19:40:20Z

also, can you separate out the changes in the analyzer/planner (don't plan unnecessary sort) into its own commit separate from the new optimizers, so it's easier to see what's going on there. I haven't looked at that part yet because it's hard to find all the related pieces.

kaikalur · 2020-09-01T22:44:58Z

I'm thinking this PR should be broken up into 2 or even 3. In particular, it will be good to do redundant orderby in a separate PR

fgwang7w · 2020-09-02T01:17:21Z

@kaikalur @rschlussel thank you all for reviewing the code changes. I am currently reworking this PR into multiple commits and will put the unnecessary sort and redundant ORDER BY changes in different PRs.

fgwang7w · 2020-11-06T17:58:01Z

@rschlussel could you please help review this PR again? All fixes are in the right commits now. Many thanks!

fgwang7w · 2020-12-09T23:35:23Z

@rschlussel could you please help approve this fix so it can be merged? This is a good performance fix to include soon, if possible. Many thanks for the help!

rschlussel

The title for the first commit is too long. Shorten it to "Avoid planning unnecessary LIMIT/TopN/Sort/DistinctLimit", and then add more description in the commit message body. We generally follow these guidelines: https://chris.beams.io/posts/git-commit/

...to-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/EvaluateZeroSample.java

presto-main/src/main/java/com/facebook/presto/SystemSessionProperties.java

presto-main/src/main/java/com/facebook/presto/sql/planner/PlanOptimizers.java

rschlussel · 2020-12-10T16:19:40Z

presto-main/src/main/java/com/facebook/presto/sql/planner/plan/Patterns.java

this belongs in the previous commit where DistinctLimit was introduced.

rschlussel · 2020-12-10T16:20:33Z

presto-tests/src/main/java/com/facebook/presto/tests/AbstractTestOrderByQueries.java

why is this test removed?

Actually, this is working as designed, because ORDER BY in a subquery can be ignored. Rows in a table (or in a subquery in the FROM clause) do not come in any specific order; the ordering matters only when ORDER BY ... LIMIT changes the result, that is, the set of rows.

rschlussel · 2020-12-10T16:21:31Z

presto-tests/src/main/java/com/facebook/presto/tests/AbstractTestOrderByQueries.java

why is this test removed?

The purpose of the 2nd commit is to stop preserving the subquery's ORDER BY. The ordering only makes sense on the outermost query now. The original design of this cherry-pick states the same concept.
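
The rule described in these replies can be stated as a small predicate. This is a Python toy with hypothetical names (`QuerySpec`, `order_by_is_redundant`), not the StatementAnalyzer logic, and it models only the ORDER BY and LIMIT interaction mentioned in the thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QuerySpec:
    """Hypothetical summary of one query block."""
    order_by: tuple = ()          # sort keys, empty when there is no ORDER BY
    limit: Optional[int] = None   # None when there is no LIMIT


def order_by_is_redundant(query: QuerySpec, is_outermost: bool) -> bool:
    """Return True when dropping ORDER BY cannot change the query result."""
    if not query.order_by:
        return False  # nothing to drop
    if is_outermost:
        return False  # the final result order is visible to the client
    # In a subquery, ordering matters only when LIMIT picks rows based on it.
    return query.limit is None
```

For example, a subquery with `order_by=("a",)` and no limit is reported as redundant, while the same block with `limit=10` is not, since the sort there decides which rows survive.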

fgwang7w · 2020-12-15T01:28:27Z

Hi @rschlussel, I revised both commits. Regarding the last removed test, I have revised the test case. Basically, the 2nd commit preserves less ordering in subqueries: only the combination of ORDER BY and LIMIT affects the order and the set of rows; otherwise the rows remain unordered, per the SQL standard. @kaikalur FYI

rschlussel

Nearly there. Just some small comments about the tests.

rschlussel · 2020-12-23T17:34:21Z

presto-tests/src/main/java/com/facebook/presto/tests/AbstractTestQueries.java

It seems in this case the limit is also redundant. We're just not smart enough to remove it. Maybe instead have 2 rows in the values node.

Or I can change it to an upper bound of LIMIT; then it will make sense. But this is a good catch: we should remove the limit if there's an enforceSingleRow, like count(*), to make the optimizer smarter. I propose we fix it in a separate PR so that TransformCorrelatedSingleRowSubqueryToProject can take it into account.

rschlussel · 2020-12-23T17:37:39Z

presto-main/src/test/java/com/facebook/presto/sql/query/TestSubqueries.java

I think limit here was incidental and we still want this test for unsupported subqueries to ensure they throw proper errors.

Then I will add a separate test case with GROUP BY a LIMIT 1 to guard this sanity test.

In addition to the cherry-pick for removing redundant Limit/TopN/Sort/DistinctLimit, a few more rules are added to replace any input that is zero-TopN/DistinctLimit/Limit.

Cherry-pick of trinodb/trino#441

Co-authored-by: praveenkrishna <praveenkrishna@tutanota.com>

Cherry-pick of trinodb/trino#818 Co-author: Martin Traverso <mtraverso@gmail.com>

rschlussel

Looks good. Thanks for sticking with it!

kaikalur · 2020-12-28T18:49:09Z

I'm not sure if we should have those plan tests. We are planning to do more rewrites/optimizer rules so it's going to be painful to test these in the future. I say the plan tests should not be included. @rongrong WDYT?

rschlussel · 2020-12-28T19:40:40Z

I'm not sure if we should have those plan tests. We are planning to do more rewrites/optimizer rules so it's going to be painful to test these in the future. I say the plan tests should not be included. @rongrong WDYT?

Which plan tests do you mean? If you mean the tpch plan tests, those already exist, so if you want to get rid of them, I think that belongs in a separate PR. (I think the purpose of them is just to let you know if you are altering tpch or tpcds plans, so that you can benchmark whether there was a regression).

presto-main/src/main/java/com/facebook/presto/sql/analyzer/StatementAnalyzer.java

kaikalur

LGTM

fgwang7w requested a review from tdcmeehan July 29, 2020 05:53

fgwang7w assigned mbasmanova and fgwang7w Jul 29, 2020

fgwang7w requested a review from mbasmanova July 29, 2020 05:54

fgwang7w unassigned mbasmanova and fgwang7w Jul 29, 2020

fgwang7w force-pushed the 14897 branch 4 times, most recently from 237c0b0 to 915e3e5 Compare July 30, 2020 01:04

fgwang7w force-pushed the 14897 branch from 915e3e5 to 01ef407 Compare July 30, 2020 05:54

tdcmeehan requested review from kaikalur and rongrong July 30, 2020 14:13

fgwang7w force-pushed the 14897 branch 4 times, most recently from b6e735d to de71aad Compare August 3, 2020 17:18

fgwang7w force-pushed the 14897 branch 4 times, most recently from cd08001 to b77e70e Compare August 19, 2020 03:27

fgwang7w requested a review from rschlussel August 19, 2020 18:47

rschlussel reviewed Aug 19, 2020

fgwang7w force-pushed the 14897 branch from a189b81 to b871fe8 Compare November 6, 2020 17:51

rschlussel reviewed Dec 10, 2020

fgwang7w force-pushed the 14897 branch 7 times, most recently from a87e1ac to 4da8ff6 Compare December 14, 2020 23:21

fgwang7w force-pushed the 14897 branch 2 times, most recently from ba959e5 to f3eb48c Compare December 15, 2020 05:37

fgwang7w requested a review from rschlussel December 15, 2020 18:22

fgwang7w force-pushed the 14897 branch 2 times, most recently from 6ffd10f to 6d3513b Compare December 16, 2020 22:27

rschlussel reviewed Dec 23, 2020

fgwang7w force-pushed the 14897 branch from 6d3513b to 0bd9ac3 Compare December 28, 2020 04:48

Avoid planning unnecessary Sort

4bf0c85

Cherry-pick of trinodb/trino#818 Co-author: Martin Traverso <mtraverso@gmail.com>

fgwang7w requested a review from rschlussel December 28, 2020 08:29

rschlussel approved these changes Dec 28, 2020

kaikalur reviewed Dec 28, 2020

presto-main/src/main/java/com/facebook/presto/sql/analyzer/StatementAnalyzer.java

kaikalur approved these changes Dec 28, 2020

caithagoras mentioned this pull request Jan 11, 2021

Add release notes for 0.246 #15602

Merged

This was referenced Mar 28, 2024

【BUG】 Order by limit sorting problem, is there any forced sorting configuration? #22353

Open

Order by limit sorting problem, is there any forced sorting configuration? trinodb/trino#21300

Closed

fgwang7w commented Jul 29, 2020 • edited by elharo