Conversation
Reviewer's GuideThis PR extends the planner to fully support table function invocation with table arguments by updating the RelationPlanner to build TableFunctionNode with sources, argument properties, copartitioning, and proper outputs; enhances QueryPlanner and SymbolMapper to handle coercion, ordering translation, and symbol mapping; enriches PlanPrinter to render table function details; updates TableFunctionNode to carry new metadata; augments optimizer passes to rewrite and validate the new node fields; and adds a testing framework (TableFunctionMatcher) with plan tests. Class diagram for updated TableFunctionNode and related typesclassDiagram
class TableFunctionNode {
+String name
+Map<String, Argument> arguments
+List<VariableReferenceExpression> outputVariables
+List<PlanNode> sources
+List<TableArgumentProperties> tableArgumentProperties
+List<List<String>> copartitioningLists
+TableFunctionHandle handle
+List<VariableReferenceExpression> getOutputVariables()
+List<VariableReferenceExpression> getProperOutput()
+List<TableArgumentProperties> getTableArgumentProperties()
+List<List<String>> getCopartitioningLists()
}
class TableArgumentProperties {
+String argumentName
+boolean rowSemantics
+boolean pruneWhenEmpty
+PassThroughSpecification passThroughSpecification
+List<VariableReferenceExpression> requiredColumns
+Optional<DataOrganizationSpecification> specification
+String getArgumentName()
+boolean isRowSemantics()
+boolean isPruneWhenEmpty()
+PassThroughSpecification getPassThroughSpecification()
+List<VariableReferenceExpression> getRequiredColumns()
+Optional<DataOrganizationSpecification> getSpecification()
}
class PassThroughSpecification {
+boolean declaredAsPassThrough
+List<PassThroughColumn> columns
+boolean isDeclaredAsPassThrough()
+List<PassThroughColumn> getColumns()
}
class PassThroughColumn {
+VariableReferenceExpression outputVariables
+boolean isPartitioningColumn
+VariableReferenceExpression getOutputVariables()
+boolean isPartitioningColumn()
}
TableFunctionNode "1" *-- "many" TableArgumentProperties
TableArgumentProperties "1" *-- "1" PassThroughSpecification
PassThroughSpecification "1" *-- "many" PassThroughColumn
Class diagram for QueryPlanner and OrderingTranslator changesclassDiagram
class QueryPlanner {
+PlanAndMappings coerce(...)
+static OrderingScheme translateOrderingScheme(...)
+static SymbolReference toSymbolReference(...)
}
class OrderingTranslator {
+static SortOrder sortItemToSortOrder(SortItem)
}
QueryPlanner <.. OrderingTranslator : uses
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
92850b6 to
b114c23
Compare
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
- RelationPlanner.visitTableFunctionInvocation has grown quite large and mixes source processing, column mapping, and specification creation—consider extracting the table‐argument planning logic into smaller helper methods or a dedicated builder to improve readability and maintainability.
- PlanPrinter now includes extensive logic to format scalar, descriptor, and table arguments; moving that into a dedicated formatter or visitor would help keep PlanPrinter focused and avoid cluttering its class.
- The new OrderingTranslator.sortItemToSortOrder and QueryPlanner.translateOrderingScheme overlap in functionality—consolidating them into a single utility would reduce duplication and ensure consistent behavior.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- RelationPlanner.visitTableFunctionInvocation has grown quite large and mixes source processing, column mapping, and specification creation—consider extracting the table‐argument planning logic into smaller helper methods or a dedicated builder to improve readability and maintainability.
- PlanPrinter now includes extensive logic to format scalar, descriptor, and table arguments; moving that into a dedicated formatter or visitor would help keep PlanPrinter focused and avoid cluttering its class.
- The new OrderingTranslator.sortItemToSortOrder and QueryPlanner.translateOrderingScheme overlap in functionality—consolidating them into a single utility would reduce duplication and ensure consistent behavior.
## Individual Comments
### Comment 1
<location> `presto-main-base/src/main/java/com/facebook/presto/testing/LocalQueryRunner.java:773-777` </location>
<code_context>
@Override
public void createCatalog(String catalogName, String connectorName, Map<String, String> properties)
{
- throw new UnsupportedOperationException();
+ nodeManager.addCurrentNodeConnector(new ConnectorId(catalogName));
+ connectorManager.createConnection(catalogName, connectorName, properties);
}
</code_context>
<issue_to_address>
**suggestion (bug_risk):** LocalQueryRunner.createCatalog now adds the connector to nodeManager before creating the connection.
Please verify that both managers remain consistent if connector creation fails, as the order of operations may affect initialization and error handling.
```suggestion
public void createCatalog(String catalogName, String connectorName, Map<String, String> properties)
{
ConnectorId connectorId = new ConnectorId(catalogName);
nodeManager.addCurrentNodeConnector(connectorId);
try {
connectorManager.createConnection(catalogName, connectorName, properties);
}
catch (RuntimeException | Error e) {
nodeManager.removeCurrentNodeConnector(connectorId);
throw e;
}
}
```
</issue_to_address>
### Comment 2
<location> `presto-parser/src/main/java/com/facebook/presto/sql/SqlFormatter.java:274-276` </location>
<code_context>
Relation relation = node.getTable();
- Relation unaliased = relation instanceof AliasedRelation ? ((AliasedRelation) relation).getRelation() : relation;
+ Node unaliased = relation instanceof AliasedRelation ? ((AliasedRelation) relation).getRelation() : relation;
+ if (unaliased instanceof TableSubquery) {
+ // unpack the relation from TableSubquery to avoid adding another pair of parentheses
+ unaliased = ((TableSubquery) unaliased).getQuery();
+ }
builder.append("TABLE(");
</code_context>
<issue_to_address>
**issue:** SqlFormatter now unpacks TableSubquery to avoid extra parentheses.
Ensure that removing parentheses from TableSubquery does not impact SQL correctness, especially in scenarios where explicit parentheses are necessary.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| public void createCatalog(String catalogName, String connectorName, Map<String, String> properties) | ||
| { | ||
| throw new UnsupportedOperationException(); | ||
| nodeManager.addCurrentNodeConnector(new ConnectorId(catalogName)); | ||
| connectorManager.createConnection(catalogName, connectorName, properties); | ||
| } | ||
|
|
There was a problem hiding this comment.
suggestion (bug_risk): LocalQueryRunner.createCatalog now adds the connector to nodeManager before creating the connection.
Please verify that both managers remain consistent if connector creation fails, as the order of operations may affect initialization and error handling.
| public void createCatalog(String catalogName, String connectorName, Map<String, String> properties) | |
| { | |
| throw new UnsupportedOperationException(); | |
| nodeManager.addCurrentNodeConnector(new ConnectorId(catalogName)); | |
| connectorManager.createConnection(catalogName, connectorName, properties); | |
| } | |
| public void createCatalog(String catalogName, String connectorName, Map<String, String> properties) | |
| { | |
| ConnectorId connectorId = new ConnectorId(catalogName); | |
| nodeManager.addCurrentNodeConnector(connectorId); | |
| try { | |
| connectorManager.createConnection(catalogName, connectorName, properties); | |
| } | |
| catch (RuntimeException | Error e) { | |
| nodeManager.removeCurrentNodeConnector(connectorId); | |
| throw e; | |
| } | |
| } |
| if (unaliased instanceof TableSubquery) { | ||
| // unpack the relation from TableSubquery to avoid adding another pair of parentheses | ||
| unaliased = ((TableSubquery) unaliased).getQuery(); |
There was a problem hiding this comment.
issue: SqlFormatter now unpacks TableSubquery to avoid extra parentheses.
Ensure that removing parentheses from TableSubquery does not impact SQL correctness, especially in scenarios where explicit parentheses are necessary.
| @Override | ||
| public MatchResult detailMatches(PlanNode node, StatsProvider stats, Session session, Metadata metadata, SymbolAliases symbolAliases) | ||
| { | ||
| checkState(shapeMatches(node), "Plan testing framework error: shapeMatches returned false in detailMatches in %s", this.getClass().getName()); |
There was a problem hiding this comment.
This function is long. Can we split out the checkArguments into a separate function ?
There was a problem hiding this comment.
Added getMatchResult and switched the if's to a switch case per previous code changes.
presto-main-base/src/main/java/com/facebook/presto/sql/planner/RelationPlanner.java
Show resolved
Hide resolved
presto-main-base/src/main/java/com/facebook/presto/sql/planner/RelationPlanner.java
Show resolved
Hide resolved
aditi-pandit
left a comment
There was a problem hiding this comment.
LGTM minus the nit.
|
|
||
| // add output symbols passed from the table argument | ||
| ImmutableList.Builder<PassThroughColumn> passThroughColumns = ImmutableList.builder(); | ||
| addOutputSymbols(outputVariables, tableArgument, sourcePlan, specification, passThroughColumns, sourcePlanBuilder); |
There was a problem hiding this comment.
Maybe this should be addPassthroughColumns ?
|
@jaystarshot : Please can you help with the review. |
|
Taking a look |
| { | ||
| private OrderingTranslator() {} | ||
|
|
||
| public static SortOrder sortItemToSortOrder(SortItem sortItem) |
There was a problem hiding this comment.
Can you please add some comment description on what this is doing
| for (int i = 0; i < coerced.size(); i++) { | ||
| VariableReferenceExpression variable = coerced.get(i); | ||
| // for multiple sort items based on the same expression, retain the first one: | ||
| // ORDER BY x DESC, x ASC, y --> ORDER BY x DESC, y |
There was a problem hiding this comment.
Can we add a unit test for this, i checked and one only had order by x
| private OrderingTranslator() {} | ||
|
|
||
| public static SortOrder sortItemToSortOrder(SortItem sortItem) | ||
| { |
There was a problem hiding this comment.
We already have this in planner utils and this is a duplicate which is not needed.
link
| return new PlanAndMappings(subPlan, mappings.build()); | ||
| } | ||
|
|
||
| public static OrderingScheme translateOrderingScheme(List<SortItem> items, Function<com.facebook.presto.sql.tree.Expression, VariableReferenceExpression> coercions) |
There was a problem hiding this comment.
Can we reuse toOrderingScheme in plannerutils?
There was a problem hiding this comment.
Removed translateOrderingScheme and replaced the usage
orderBy = Optional.of(translateOrderingScheme(tableArgument.getOrderBy().get().getSortItems(), sourcePlanBuilder::translate));
to
List<SortItem> sortItems = tableArgument.getOrderBy().get().getSortItems();
// the ordering symbols are not coerced
List<VariableReferenceExpression> coerced = sortItems.stream()
.map(SortItem::getSortKey)
.map(sourcePlanBuilder::translate)
.collect(toImmutableList());
List<SortOrder> sortOrders = sortItems.stream()
.map(OrderingTranslator::sortItemToSortOrder)
.collect(toImmutableList());
orderBy = Optional.of(PlannerUtils.toOrderingScheme(coerced, sortOrders));
| .collect(toImmutableList()); | ||
| } | ||
|
|
||
| public static SymbolReference toSymbolReference(VariableReferenceExpression variable) |
There was a problem hiding this comment.
lets just reuse the function above?
There was a problem hiding this comment.
I'm not sure if you mean use the the above function toSymbolReferences instead of this function, or use toSymbolReference in toSymbolReferences.
I think this function is needed or else the actualPassThrough call becomes very messy.
Set<SymbolReference> actualPassThrough = argumentProperties.getPassThroughSpecification().getColumns().stream()
.map(var -> toSymbolReference(var.getOutputVariables()))
.collect(toImmutableSet());
to:
Set<SymbolReference> actualPassThrough = toSymbolReferences(
argumentProperties.getPassThroughSpecification().getColumns().stream()
.map(TableFunctionNode.PassThroughColumn::getOutputVariables)
.collect(Collectors.toList()))
.stream()
.map(SymbolReference.class::cast)
.collect(Collectors.toSet());
If you mean reuse toSymbolReference in toSymbolReferences, I made that change.
| return new VariableReferenceExpression(variable.getSourceLocation(), canonical, types.get(new SymbolReference(getNodeLocation(variable.getSourceLocation()), canonical))); | ||
| } | ||
|
|
||
| public List<VariableReferenceExpression> map(List<VariableReferenceExpression> variables) |
There was a problem hiding this comment.
This can just be a lambda, no need of a function here (anyway its unclear by the name)
node.getGroupIdVariable().map(this::map),
There was a problem hiding this comment.
Removed, changed usage
inputMapper.map(properties.getRequiredColumns()),
to
properties.getRequiredColumns().stream().map(inputMapper::map).collect(Collectors.toList()),
| return processChildren(node, context); | ||
| } | ||
|
|
||
| @Override |
There was a problem hiding this comment.
This was removed in a future change. Removed.
| .skip(node.getPreSortedOrderPrefix()) | ||
| .map(symbol -> symbol + " " + orderingScheme.getOrdering(symbol))) | ||
| .collect(Collectors.joining(", ")))); | ||
| .collect(joining(", ")))); |
There was a problem hiding this comment.
we can remove these chagnes which are unrelated
| } | ||
|
|
||
| @Test | ||
| public void testTableFunctionInitialPlan() |
There was a problem hiding this comment.
I think these tests are not enough for an end to end planning scenario.
Can we have more test cases (joins etc) to make sure the planning is working as expected, Complex copartitioning scenarios, multible table functions
There was a problem hiding this comment.
Added new test case testTableFunctionInitialPlanWithCoercionForCopartitioning.
Will add some more.
| // if there are partitioning columns, they might have to be coerced for copartitioning | ||
| if (tableArgument.getPartitionBy().isPresent() && !tableArgument.getPartitionBy().get().isEmpty()) { | ||
| List<Expression> partitioningColumns = tableArgument.getPartitionBy().get(); | ||
| QueryPlanner partitionQueryPlanner = new QueryPlanner(analysis, variableAllocator, idAllocator, lambdaDeclarationToVariableMap, metadata, session, context, sqlParser); |
There was a problem hiding this comment.
Do we need to create a new queryplanner all times? This will be compute intensive
There was a problem hiding this comment.
Moved out of the loop.
a5570ba to
6fc81e7
Compare
72c1c3c to
4a2a80b
Compare
|
Codenotify: Notifying subscribers in CODENOTIFY files for diff f046e9f...5c400de. No notifications. |
a5e59ef to
3d260db
Compare
|
@jaystarshot This PR now includes all of the changes until the end. Could you take a look at the first split from this PR? This includes all final SQL parser changes. Thanks! |
Final SQL Parsing changes for TVF. Includes changes for unaliasing as well as a change to prevent COPARTITION ambiguity. Please check trinodb/trino#15734 for more details. Contains all changes under `presto-parser/src/main/java/com/facebook/presto/sql` New complete PR: #26188 ## Description <!---Describe your changes in detail--> Changes adapted from trino/PR#16014, 14175 ## Motivation and Context <!---Why is this change required? What problem does it solve?--> <!---If it fixes an open issue, please link to the issue here.--> The keyword COPARTITION is specific to table function invocation. It is not a reserved keyword. In some cases, the word COPARTITION can be ambiguously interpreted: either as table argument alias, or as part of the COPARTITION clause. To solve the ambiguity, we decided to fail queries using the word "copartition" as table argument alias, while keeping the word non-reserved and available to be used as identifier in other contexts. ## Test Plan <!---Please fill in how you tested your change--> testCopartitionInTableArgumentAlias in TestSqlParser.java Tests ran to check for regressions: TestSqlParser TestSqlParserErrorHandling TestAnalyzer TestTableFunctionInvocation TestTableFunctionRegistry ``` == NO RELEASE NOTE == ``` Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com> Co-authored-by: Xin Zhang <desertsxin@gmail.com>
Changes adapted from trino/PR#14175 Author: kasiafi Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com> Co-authored-by: Xin Zhang <desertsxin@gmail.com>
Changes adapted from trino/PR#14566 Author: kasiafi Co-authored-by: mohsaka <135669458+mohsaka@users.noreply.github.com> Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com> Co-authored-by: Xin Zhang <desertsxin@gmail.com>
Changes adapted from trino/PR#15575 Author: kasiafi Co-authored-by: Xin Zhang <desertsxin@gmail.com> Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
TODO: TestTableFuncitonInvocation still failing. 2 tests
Changes adapted from trino/PR#16584 Author: kasiafi Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#16716 Author: kasiafi Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#25493 Author: kasiafi Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
b70f4ca to
1471dbf
Compare
Co-authored-by: Xin Zhang <desertsxin@gmail.com>
Description
This PR contains the planner changes for table function arguments that were analyzed in Part 5. It also includes initial plan planner tests with the addition of a TableFunctionMatcher for current and future planning tests.
Motivation and Context
Impact
Test Plan
Added additional test case to planner's TestTableFunctionInvocation.
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.