Fix CTE reference in analyzer #22515
}

@Test
public void testNesteCteWithSameName()
This test fails without the change in this PR.
else {
    // cte considered for materialization
    String normalizedCteId = context.getCteInfo().normalize(analysis, namedQuery.getQuery(), cteName);
    String normalizedCteId = context.getCteInfo().normalize(NodeRef.of(namedQuery.getQuery()), cteName);
Use NodeRef because we want identity (==) comparison rather than equals() comparison.
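The difference between identity and equals() comparison can be sketched as follows. This is a minimal illustration, not Presto's code: `QueryNode` and `IdentityRef` are hypothetical stand-ins for an AST node with structural equality and for an identity-based wrapper in the spirit of NodeRef.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an AST node whose equals() is structural.
final class QueryNode
{
    private final String sql;

    QueryNode(String sql)
    {
        this.sql = sql;
    }

    @Override
    public boolean equals(Object other)
    {
        return other instanceof QueryNode && ((QueryNode) other).sql.equals(sql);
    }

    @Override
    public int hashCode()
    {
        return sql.hashCode();
    }
}

// Hypothetical identity wrapper: two refs are equal only if they wrap the same object.
final class IdentityRef<T>
{
    private final T node;

    IdentityRef(T node)
    {
        this.node = node;
    }

    @Override
    public boolean equals(Object other)
    {
        return other instanceof IdentityRef && ((IdentityRef<?>) other).node == node;
    }

    @Override
    public int hashCode()
    {
        return System.identityHashCode(node);
    }
}

class IdentityRefDemo
{
    public static void main(String[] args)
    {
        // Two distinct nodes that happen to be structurally equal,
        // e.g. two CTEs with the same name and body at different nesting levels.
        QueryNode outer = new QueryNode("SELECT 1");
        QueryNode inner = new QueryNode("SELECT 1");

        Map<QueryNode, String> byEquals = new HashMap<>();
        byEquals.put(outer, "outer");
        byEquals.put(inner, "inner"); // overwrites the first entry

        Map<IdentityRef<QueryNode>, String> byIdentity = new HashMap<>();
        byIdentity.put(new IdentityRef<>(outer), "outer");
        byIdentity.put(new IdentityRef<>(inner), "inner"); // kept as a separate entry

        System.out.println(byEquals.size());   // prints 1
        System.out.println(byIdentity.size()); // prints 2
    }
}
```

Keying on the wrapper keeps the two nodes distinct even though they compare equal structurally, which is the property the comment above asks for.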
public TreeSet<Query> getReferencedQuerySet()
{
    return referencedQuerySet;
    String identityString = queryNodeRef.hashCode() + delimiter + cteName;
Use the hash code to uniquely identify a CTE.

I am not sure whether repeated runs of the query will produce the same hash code.
If they do, please ignore this.
If they don't, would it be better to normalize this hash code by ID, as was done previously?
The goal is to make debugging in prod easier, since it's better for repeated runs of a query to get the same CTE ID. The test framework changes also might not be required.
Good point. Instead of the hash code, I now use an incremental prefix for the mapping, so the ID should be fixed across runs. This also gets rid of the massive test framework change.
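One way an incremental prefix could work is sketched below. This is an assumption about the general technique, not the actual change in the PR: `SequentialCteIds` and its `normalize` signature are hypothetical, and query nodes are represented by plain `Object` instances. Each distinct query object gets the next counter value in first-seen order, so the resulting ID does not depend on identity hash codes, which can differ between JVM runs.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical sketch: assigns each distinct query object a sequential prefix
// in first-seen order, keyed by object identity rather than equals().
final class SequentialCteIds
{
    private static final String DELIMITER = "_";
    private final Map<Object, Integer> prefixes = new IdentityHashMap<>();

    String normalize(Object queryNode, String cteName)
    {
        Integer prefix = prefixes.get(queryNode);
        if (prefix == null) {
            // First time this query object is seen: hand out the next counter value.
            prefix = prefixes.size();
            prefixes.put(queryNode, prefix);
        }
        return prefix + DELIMITER + cteName;
    }
}

class SequentialCteIdsDemo
{
    public static void main(String[] args)
    {
        SequentialCteIds ids = new SequentialCteIds();
        Object outerQuery = new Object();
        Object innerQuery = new Object();

        System.out.println(ids.normalize(outerQuery, "t")); // prints 0_t
        System.out.println(ids.normalize(innerQuery, "t")); // prints 1_t
        System.out.println(ids.normalize(outerQuery, "t")); // prints 0_t
    }
}
```

As long as the analyzer visits the query objects in the same order, repeated runs of the same query produce the same IDs.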
String cteName = ((CteConsumerNode) node).getCteId();

return match();
if (cteNameMapping.containsKey(expectedCteName) && cteNameMapping.inverse().containsKey(cteName) && cteNameMapping.get(expectedCteName).equals(cteName)) {
A one-to-one mapping exists and it matches.
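The one-to-one check can be sketched with two plain maps standing in for the bidirectional map used in the matcher. This is a hedged illustration: `NameBijection` and `matchOrBind` are hypothetical names, and the binding behavior on first sight is an assumption, since the diff above only shows the branch where both names are already mapped.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a one-to-one mapping between expected and actual CTE names.
final class NameBijection
{
    private final Map<String, String> forward = new HashMap<>();
    private final Map<String, String> inverse = new HashMap<>();

    boolean matchOrBind(String expected, String actual)
    {
        boolean expectedSeen = forward.containsKey(expected);
        boolean actualSeen = inverse.containsKey(actual);
        if (expectedSeen && actualSeen) {
            // Both sides are already mapped: they must be mapped to each other.
            return forward.get(expected).equals(actual);
        }
        if (!expectedSeen && !actualSeen) {
            // Assumption: the first sighting of a fresh pair binds it.
            forward.put(expected, actual);
            inverse.put(actual, expected);
            return true;
        }
        // Exactly one side is already bound to something else: not one-to-one.
        return false;
    }
}

class NameBijectionDemo
{
    public static void main(String[] args)
    {
        NameBijection mapping = new NameBijection();
        System.out.println(mapping.matchOrBind("cte_a", "0_t")); // prints true (binds)
        System.out.println(mapping.matchOrBind("cte_a", "0_t")); // prints true (matches)
        System.out.println(mapping.matchOrBind("cte_a", "1_t")); // prints false
        System.out.println(mapping.matchOrBind("cte_b", "0_t")); // prints false
        System.out.println(mapping.matchOrBind("cte_b", "1_t")); // prints true (binds)
    }
}
```

Checking both directions lets a test refer to CTEs by placeholder names while still rejecting a plan that maps two expected names onto the same generated ID.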
2221e6d to 71801c9
42f07d3 to 172098f
jaystarshot left a comment
Nice catch, just added a couple of comments.
172098f to af9495a
LGTM. However, I only have codeowner access for presto-main, so you will need other reviews.
af9495a to cdb69bd
cdb69bd to 408cb8f
Description
Fix #22514
Currently, when creating a CTE reference node, the NamedQuery used in the CTE is used to uniquely identify the CTE. This can lead to conflicts, as demonstrated by the example in the linked issue.
Instead of trying to build the scope manually, we can simply rely on the underlying Query object linked to each CTE definition and reference: a definition and a reference point to the same Query object if they refer to the same CTE.
Another advantage of this approach is that it is much faster. I've seen query planning time out because of the recursive query reference visit when dealing with large queries with nested CTEs.
Motivation and Context
Fix a bug in CTE materialization
Impact
Bug fix.
Test Plan
Add unit test.