Skip to content

Analyze table function arguments#13602

Merged
kasiafi merged 15 commits intotrinodb:masterfrom
kasiafi:375AnalyzeTFArguments
Sep 30, 2022
Merged

Analyze table function arguments#13602
kasiafi merged 15 commits intotrinodb:masterfrom
kasiafi:375AnalyzeTFArguments

Conversation

@kasiafi
Copy link
Copy Markdown
Member

@kasiafi kasiafi commented Aug 10, 2022

This PR adds the Analyzer support for table and descriptor arguments of table functions.
These arguments are not yet supported in further phases of query processing, and they are caught in RelationPlanner.

No docs required. We should document how table and descriptor arguments work when we add full support for them.

Some SPI changes are introduced. However, the affected classes are marked as experimental, so probably we don't need to announce the changes in the release notes. Also, the changed classes can't be effectively used, as the related argument types are not supported beyond the Analyzer.

Beside the SPI changes mentioned above, no release notes required.

Copy link
Copy Markdown
Member

@martint martint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(still reviewing the last commit)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are they lower cased? What happens with delimited names? That would seem to go against what the SQL spec describes.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is not precise. It should say "All names are resolved case-insensitive".

I am aware that this is not compliant with the SQL spec. Regarding resolution of co-partitioned tables, I chose to follow how table references are resolved in Trino.

Currently in Trino, all table references, including CTEs and aliased relations, are resolved case-insensitive, and sadly, the 'delimited' property is disregarded. In theory, we could do spec-compliant resolution for the co-partitioned tables, but I think it would be unintuitive to the user. Similarly, we decided to resolve TF names in the usual non-ANSI way to avoid confusion between different function types.

If we decided to resolve co-partitioned table references in the spec-compliant way, we'd have to decide how "permissive" we want to be with respect to schema and catalog names.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you point out where is the name conflict? I can't see it.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It happened in the StatementAnalyzer where we analyze the incoming io.trino.sql.tree.TableArgument, and use the io.trino.spi.ptf.TableArgument.Builder. I renamed the former (tree). Analogically with the descriptor arguments.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method has two outputs -- the return value Argument, and the tableArgumentAnalyses output list via side effects. That makes the code more complicated than necessary and harder to understand. Instead, one option make the method return TableArgumentAnalysis, add the Argument to it, and remove the side effects from the function. The caller should be responsible for unpacking and collecting the TableArgumentAnalysis objects into a list if so desired.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a helper class to return both the Argument, and the optional TableArgumentAnalysis. TableArgumentAnalysis couldn't be extended to carry the Argument, because it only applies to table arguments, while Argument is created for each argument regardless of its type.

@kasiafi kasiafi force-pushed the 375AnalyzeTFArguments branch 3 times, most recently from 72aa347 to 617bca2 Compare September 6, 2022 11:23
@kasiafi kasiafi requested a review from martint September 6, 2022 18:54
@kasiafi kasiafi force-pushed the 375AnalyzeTFArguments branch 3 times, most recently from 8e726a4 to bdc1b8e Compare September 18, 2022 10:56
@kasiafi kasiafi force-pushed the 375AnalyzeTFArguments branch from bdc1b8e to 5cd35fa Compare September 27, 2022 10:47
Prune when empty is enforced for a table argument with row semantics.
For a table argument with set semantocs, keep when empty is the default,
and it can be changed to prune when empty.
According to the SQL standard, the only properties
of a table argument used during analysis are: the row type,
partitioning, and ordering. Other information, like the
prune / keep when empty property, are not involved in the
analysis.
This is a change of approach on resolving arguments
of table functions.
Before this change, the mapping of descriptor arguments
to table arguments had to be determined during query
analysis. It was too limiting: in that model, input table
references were limited to the specified descriptor fields.

After this change, any input fields can be referenced,
and the required fields (channels) are to be determined
dynamically during function compilation.

One advantage of determining the required fields early
was the possibility to prune any columns that were
not required by the table function. This should be replaced
by adding a way for a table function to declare required
fields during query optimization.
It should make using descriptor arguments easier for the
table function author, and allow to compare with
NULL_DESCRIPTOR by equality.
@kasiafi kasiafi force-pushed the 375AnalyzeTFArguments branch from 5cd35fa to dbef4d9 Compare September 29, 2022 10:55
@kasiafi kasiafi merged commit f096336 into trinodb:master Sep 30, 2022
@github-actions github-actions bot added this to the 399 milestone Sep 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

2 participants