Skip to content

Partial projections pushdown#1868

Merged
martint merged 4 commits intotrinodb:masterfrom
phd3:partial-projections-pushdown
Nov 27, 2019
Merged

Partial projections pushdown#1868
martint merged 4 commits intotrinodb:masterfrom
phd3:partial-projections-pushdown

Conversation

@phd3
Copy link
Copy Markdown
Member

@phd3 phd3 commented Oct 25, 2019

Splitting partial projection extraction from #1720. cc @martint @wagnermarkd

Some notes:

Currently, PushProjectionIntoTableScan is applied only in cases where all expressions are completely convertible to a connector level expressions. The only expressions that satisfy this criteria are simple column references, dereference expressions and constants. Any expression that involves anything more complex will not invoke PushProjectionIntoTableScan. So there is no opportunity for pushing down such expressions into the connector.

For example, for projections [“a.b.c”, “d”], the PushProjectionIntoTableScan rule will be invoked. But, for projections [“a.b.c” + “c.d”, “e”], it will be a no-op. This is very primitive, because such a pushdown will not work for even slightly complex projections. In such cases, we know that even if “a.b.c” + “c.d” cannot be pushed down, the individual subexpressions “a.b.c” and “c.d” can be. We need to implement the partial projection pushdown support for that.

After the partial projection support, the PushProjectionIntoTableScan would look like the following:

Blank Diagram (5)

Copy link
Copy Markdown
Member

@findepi findepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(only skimming)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to test this?

also, cmt title is too long. Maybe just Fix translation of FieldDereference to DereferenceExpression.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@findepi Fixed the message. Added a test in TestConnectorExpressionTranslator. I haven't added literals for the test since the translation for them would rely on how ExpressionAnalyzer and LiteralEncoder.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capitalize "Mismatch", include list sizes in the error message.

Suggested change
checkState(newConnectorProjections.size() == projections.size(), "mismatch between input and output projections from the connector");
checkState(newConnectorProjections.size() == projections.size(), "Mismatch between input and output projections from the connector: expected %s but got %s", projections.size(), newConnectorProjections.size());

(maybe you need to rewrap this)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not private? please revert

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Override

@findepi findepi requested a review from martint October 29, 2019 09:20
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a very similar class for this already: ReferenceAwareExpressionNodeInliner

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally try to avoid generic "Util" classes, as they tend to become junk drawers over time. Instead, I'd create dedicated utilities (i.e., PartialTranslator, ExpressionReplacer). See, for example ExpressionNodeInliner, ExpressionExtractor or SortExpressionExtractor.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary formatting change. This was previously:

        Map<String, ColumnHandle> assignments = tableScan.getAssignments()
                .entrySet().stream()

@phd3 phd3 force-pushed the partial-projections-pushdown branch 3 times, most recently from d1553c7 to 4c4ade5 Compare November 5, 2019 01:48
@phd3
Copy link
Copy Markdown
Member Author

phd3 commented Nov 5, 2019

@martint Thanks for the review, I've addressed your comments.

@phd3 phd3 requested review from findepi and martint November 5, 2019 01:53
@findepi findepi assigned martint and unassigned phd3 Nov 8, 2019
@findepi findepi removed their request for review November 8, 2019 21:31
@findepi findepi assigned phd3 and unassigned martint Nov 8, 2019
@findepi
Copy link
Copy Markdown
Member

findepi commented Nov 8, 2019

@phd3 please see Travis.

@phd3 phd3 force-pushed the partial-projections-pushdown branch from 4c4ade5 to 488cafc Compare November 11, 2019 16:06
@phd3
Copy link
Copy Markdown
Member Author

phd3 commented Nov 11, 2019

thanks @findepi, I've fixed related errors.

@findepi findepi assigned martint and unassigned phd3 Nov 12, 2019
@phd3 phd3 requested a review from wagnermarkd November 25, 2019 19:26
Copy link
Copy Markdown
Member

@martint martint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor comment, but otherwise, it looks good. Let me know once you've addressed it and I'll merge it. Thanks for working on this!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rephrase the commit message: "Update constructor visibility in ReferenceAwareExpressionNodeInliner"

@phd3 phd3 force-pushed the partial-projections-pushdown branch from 488cafc to 1f7a1d9 Compare November 26, 2019 03:11
@phd3
Copy link
Copy Markdown
Member Author

phd3 commented Nov 26, 2019

@martint Modified the commit message and cleanly rebased.

@phd3 phd3 force-pushed the partial-projections-pushdown branch from 1f7a1d9 to 9ff9864 Compare November 26, 2019 20:27
@phd3
Copy link
Copy Markdown
Member Author

phd3 commented Nov 26, 2019

@martint There's one non-trivial change for catering to lambda expressions that I've made after the rebase. All tests seem to pass now. Curious if there are other cases of context-dependent symbols like in lambda expressions.

@martint
Copy link
Copy Markdown
Member

martint commented Nov 27, 2019

There's one non-trivial change for catering to lambda expressions that I've made after the rebase. All tests seem to pass now. Curious if there are other cases of context-dependent symbols like in lambda expressions.

The only other case would be in the context of a correlated subquery. But those appear at the "top level" from the point of view of the plan operation (i.e., not within a nested context like lambda expressions), are are not bound to columns.

@martint martint merged commit 14a301a into trinodb:master Nov 27, 2019
@martint
Copy link
Copy Markdown
Member

martint commented Nov 27, 2019

Merged, thanks for working on this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants