Skip to content

Make Aggregation use RowExpression instead#12710

Merged
hellium01 merged 4 commits intoprestodb:masterfrom
hellium01:RefactorAggregation
Jun 14, 2019
Merged

Make Aggregation use RowExpression instead#12710
hellium01 merged 4 commits intoprestodb:masterfrom
hellium01:RefactorAggregation

Conversation

@hellium01
Copy link
Contributor

@hellium01 hellium01 commented Apr 22, 2019

This will remove the dependency to Expression in following plan nodes:

AggregationNode
TableWriterNode
TableFinishNode

It is splitted to small commits that may need to squash together later.

@hellium01 hellium01 force-pushed the RefactorAggregation branch 8 times, most recently from cb23b2b to 7613f63 Compare April 25, 2019 08:11
@hellium01 hellium01 changed the title [WIP] Make Aggregation use RowExpression instead Make Aggregation use RowExpression instead Apr 25, 2019
@hellium01 hellium01 requested review from highker and wenleix April 25, 2019 16:55
@wenleix wenleix self-assigned this Apr 27, 2019
@hellium01 hellium01 force-pushed the RefactorAggregation branch 5 times, most recently from 7d299e3 to 07ce82b Compare May 1, 2019 07:27
@highker
Copy link

highker commented May 1, 2019

@hellium01 is this ready to review?

@hellium01
Copy link
Contributor Author

hellium01 commented May 2, 2019

Yes, it should be OK to review now. It was splitted to small commits but I can squash them after review.

Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many places still having both getCall and the other new getters. To make it clean, we'd better clean them in one commit.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put this to the previous line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be a little long to fit in one line.

Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Change expressionExtractor for aggregation" done

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we use function manager?

Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finished the first 3 commits. I think we need some care to make sure refactoring can be easily carried out. Moving arguments out of FunctionCall may not be proper. It loosens the bound as a function. We need to keep the basic signature of a function as a whole and separating out other info like order by or distinct. Keeping arguments inside FunctionCall can help replacing it with CallExpression with one shot. Also, it won't introduce the problem of resolving lambdas.

Copy link
Contributor

@wenleix wenleix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glance over commits through "Format aggregation in explainer" . I leave detailed review to @highker 😃

@hellium01 hellium01 force-pushed the RefactorAggregation branch 3 times, most recently from 203991e to 5157d0e Compare May 3, 2019 01:35
@highker highker requested review from highker and wenleix May 3, 2019 05:21
@hellium01 hellium01 force-pushed the RefactorAggregation branch from 8fc4043 to f0140f0 Compare June 10, 2019 23:37
@hellium01
Copy link
Contributor Author

Rebased on top #12606

@hellium01 hellium01 force-pushed the RefactorAggregation branch from f0140f0 to 5b6b053 Compare June 11, 2019 00:58
@hellium01 hellium01 force-pushed the RefactorAggregation branch 3 times, most recently from 1966174 to 6d14042 Compare June 11, 2019 18:03
@hellium01 hellium01 requested a review from highker June 11, 2019 18:04
Copy link
Contributor

@wenleix wenleix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Modify RowExpressionSymbolInliner to RowExpressionVariableInliner": LGTM % one nit.

Copy link
Contributor

@wenleix wenleix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Consolidate conversion from OrderBy expression to OrderingScheme".

LGTM % nits.

Copy link
Contributor

@wenleix wenleix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Replace Expression with RowExpression in Aggregation".

Skimmed through since most of the comments are discussed in previous reviews. @highker and @rongrong can have a more detailed review on optimizer and expression rewrite related stuff.

@hellium01 hellium01 force-pushed the RefactorAggregation branch from 6d14042 to 79349e6 Compare June 12, 2019 01:24
@hellium01 hellium01 force-pushed the RefactorAggregation branch from 79349e6 to 11e9fca Compare June 12, 2019 19:03
Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Modify RowExpressionSymbolInliner to RowExpressionVariableInliner" LGTM

Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Replace Expression with RowExpression in Aggregation" minor comments only

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's move this part of logic to the caller. The caller should decide which method to use.

Copy link
Contributor Author

@hellium01 hellium01 Jun 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SymbolMapper will be used in multiple optimization rules. As we are pushing the clean up up, we will need this function to handle both Expression/RowExpression anyway. I would prefer we just remove isExpression later once everything is cleaned up.

Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two other commits LGTM

Expression is replaced by RowExpression in AggregationNode#Aggregation.
For lambda argument of aggregation, we have to manually provide the
types based on Function's signature. The operation is moved from
LocalExecutionPlanner to earlier stage.
Since we now have access to CallExpression, we add back checks to make
sure function arguments is valid
@hellium01 hellium01 force-pushed the RefactorAggregation branch from 11e9fca to a04b4bc Compare June 13, 2019 21:16
Copy link
Contributor

@wenleix wenleix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Make sure Travis turn green before merge.

Thanks for the great work!

@wenleix wenleix assigned hellium01 and unassigned wenleix Jun 13, 2019
@hellium01 hellium01 merged commit ee7b43e into prestodb:master Jun 14, 2019
@hellium01 hellium01 deleted the RefactorAggregation branch June 14, 2019 01:55
@mbasmanova
Copy link
Contributor

@hellium01 Yi, I'm seeing Streaming aggregation with input not grouped on the grouping keys planning error for the following query. Could it be related to this change?

SELECT
      linenumber,
      'xxx'
  FROM
  (
      (SELECT orderkey, linenumber FROM lineitem)
      UNION
      (SELECT orderkey, linenumber FROM lineitem)
  ) WHERE orderkey = 1 
  GROUP BY
      1;



java.lang.IllegalArgumentException: Streaming aggregation with input not grouped on the grouping keys

    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:135)
    at com.facebook.presto.sql.planner.sanity.ValidateStreamingAggregations$Visitor.visitAggregation(ValidateStreamingAggregations.java:89)
    at com.facebook.presto.sql.planner.sanity.ValidateStreamingAggregations$Visitor.visitAggregation(ValidateStreamingAggregations.java:52)
    at com.facebook.presto.sql.planner.plan.AggregationNode.accept(AggregationNode.java:200)
    at com.facebook.presto.sql.planner.plan.InternalPlanNode.accept(InternalPlanNode.java:31)
    at com.facebook.presto.sql.planner.sanity.ValidateStreamingAggregations$Visitor.lambda$visitPlan$0(ValidateStreamingAggregations.java:73)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants