@peter-toth
Contributor

What changes were proposed in this pull request?

This PR adds support for column aliasing in a CTE, so that this query becomes valid:

WITH t(x) AS (SELECT 1)
SELECT * FROM t WHERE x = 1
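
For readers who want to try this query shape without a Spark build, it can be run against SQLite, which also accepts a column list after the CTE name. This is only a stand-in to illustrate the syntax (using SQLite here is an assumption of the sketch, not something this PR touches), and it says nothing about how Spark parses or resolves the aliases.

```python
import sqlite3

# SQLite is used purely as a convenient stand-in engine (an assumption of
# this sketch); the PR itself targets Spark SQL.
conn = sqlite3.connect(":memory:")

# `t(x)` renames the single output column of `SELECT 1` to `x`,
# so the outer query can refer to it in the WHERE clause.
cur = conn.execute("WITH t(x) AS (SELECT 1) SELECT * FROM t WHERE x = 1")
column_names = [d[0] for d in cur.description]
rows = cur.fetchall()

print(column_names, rows)
conn.close()
```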

How was this patch tested?

Added new UTs.

@peter-toth
Contributor Author

@dongjoon-hyun, @gatorsmile, this is another feature that PostgreSQL supports.

@liancheng
Contributor

ok to test

@liancheng
Contributor

LGTM pending Jenkins, thanks!

@liancheng
Contributor

liancheng commented Jun 11, 2019

In fact, I was thinking that, instead of adding test cases to SQLQuerySuite, it may be better to add them to PlanParserSuite and AnalysisSuite.

The changes in this PR are purely about the parsing phase. Running full-blown SQL queries to test parsing changes is slow and unnecessary. The Spark PR builder already takes a long time to finish, so it would be better to cut the cost wherever possible.

@SparkQA

SparkQA commented Jun 11, 2019

Test build #106393 has finished for PR 24842 at commit 8a7e7e0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@peter-toth
Contributor Author

peter-toth commented Jun 12, 2019

Thanks @liancheng for the review. I've moved the analysis error test cases to AnalysisSuite as you suggested, but I left the positive test case in SQLQuerySuite because I think the value of the x column should be tested. Please let me know if you disagree.

BTW, I recently opened another improvement PR regarding the WITH clause: #24831
Any comments are very welcome if you have time to review it.

@SparkQA

SparkQA commented Jun 12, 2019

Test build #106405 has finished for PR 24842 at commit 92cc5d2.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@hvanhovell
Contributor

@peter-toth can you still add a test to the plan parser suite?

@dongjoon-hyun
Member

Thank you for pinging me, @peter-toth. +1 for the test case comments above. This PR also looks good to me.

Member

@dongjoon-hyun dongjoon-hyun Jun 12, 2019

Oh, please move this to cte.sql. That is a perfect place for this.
cc @gatorsmile

Contributor Author

ok, moved

@peter-toth
Contributor Author

> @peter-toth can you still add a test to the plan parser suite?

@hvanhovell sure, I added a test to PlanParserSuite.

@SparkQA

SparkQA commented Jun 12, 2019

Test build #106438 has finished for PR 24842 at commit ddb0554.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 13, 2019

Test build #106456 has finished for PR 24842 at commit 510eee1.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 13, 2019

Test build #106458 has finished for PR 24842 at commit 0a00a03.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@peter-toth
Contributor Author

retest this please

CROSS JOIN CTE1 t2;

-- CTE with column alias
WITH t(x) AS (SELECT 1)
Member

Could you add more test cases? For example, can a WITH clause in a subquery shadow a WITH clause with the same name in an enclosing query? As another example, can WITH clauses be used in a subquery expression?

Contributor Author

@peter-toth peter-toth Jun 13, 2019

I wanted to focus on column aliases in this PR, but I have another open PR that focuses on nested WITH clauses: #24831

Member

Yep. For the nested WITH, #24831 would be a better place to add them.
After merging this, let's proceed as @gatorsmile suggested, @peter-toth .

interceptParseException(parsePlan)(sqlCommand, messages: _*)

- private def cte(plan: LogicalPlan, namedPlans: (String, LogicalPlan)*): With = {
+ private def cte(plan: LogicalPlan, namedPlans: (String, (LogicalPlan, Seq[String]))*): With = {
Member

Thank you for consolidating both cte functions into one!

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM.

@gatorsmile, could you do the final review and sign off on this? For the nested WITH, I hope we can handle that in his other PRs.

  • #24831 is adding nested WITH.
  • #24860 is adding WITH test cases.

@dongjoon-hyun
Member

Retest this please.

@SparkQA

SparkQA commented Jun 14, 2019

Test build #106523 has finished for PR 24842 at commit 0a00a03.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

Retest this please.

@SparkQA

SparkQA commented Jun 15, 2019

Test build #106533 has finished for PR 24842 at commit 0a00a03.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

Merged to master.

Thank you so much, @peter-toth , @liancheng, @hvanhovell , @gatorsmile !

@peter-toth . Please proceed to #24831 and #24860 .

emanuelebardelli pushed a commit to emanuelebardelli/spark that referenced this pull request Jun 15, 2019
## What changes were proposed in this pull request?

This PR adds support of column aliasing in a CTE so this query becomes valid:
```
WITH t(x) AS (SELECT 1)
SELECT * FROM t WHERE x = 1
```
## How was this patch tested?

Added new UTs.

Closes apache#24842 from peter-toth/SPARK-28002.

Authored-by: Peter Toth <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
@peter-toth
Contributor Author

Thanks @dongjoon-hyun @liancheng @gatorsmile @hvanhovell for the review.

I have prepared #24831 for review.
Unfortunately, I can't work on #24860 next week, but you can expect updates on it after that.

@dongjoon-hyun
Member

Thanks. No problem, @peter-toth !


namedQuery
- : name=identifier AS? '(' query ')'
+ : name=identifier (columnAliases=identifierList)? AS? '(' query ')'
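
To make the shape of the revised `namedQuery` rule concrete, here is a toy model of it in Python. It is only a sketch: `NAMED_QUERY` and `parse_named_query` are hypothetical names, identifiers are reduced to `\w+`, and the query body is captured verbatim, so it is far looser than the real ANTLR grammar. It does show that the column alias list and the `AS` keyword are each optional.

```python
import re

# Toy model of:
#   name=identifier (columnAliases=identifierList)? AS? '(' query ')'
# Simplifying assumptions: identifiers are \w+, the query is any text.
NAMED_QUERY = re.compile(
    r"^\s*(?P<name>\w+)\s*"
    r"(?:\(\s*(?P<aliases>\w+(?:\s*,\s*\w+)*)\s*\)\s*)?"  # optional alias list
    r"(?:AS\s*)?"                                          # optional AS keyword
    r"\((?P<query>.*)\)\s*$",                              # parenthesised query
    re.IGNORECASE | re.DOTALL,
)


def parse_named_query(text):
    """Return (name, column_aliases, query) for one CTE definition."""
    match = NAMED_QUERY.match(text)
    if match is None:
        raise ValueError(f"not a named query: {text!r}")
    raw_aliases = match.group("aliases")
    aliases = [a.strip() for a in raw_aliases.split(",")] if raw_aliases else []
    return match.group("name"), aliases, match.group("query").strip()


print(parse_named_query("t(x) AS (SELECT 1)"))
print(parse_named_query("t AS (SELECT 1)"))
```

A definition without an alias list yields an empty list, which mirrors how the old form of the rule remains accepted under the new one.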
Member

Duplicate names within a single CTE definition are not allowed.

-- !query 7
DROP VIEW IF EXISTS t
WITH t(x) AS (SELECT 1)
SELECT * FROM t WHERE x = 1
Member

Almost all the test cases use a single column in the CTE definition. Can you try your best to improve the test coverage of this new syntax?
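
One kind of multi-column case this comment might have in mind is sketched below. It runs on SQLite as a stand-in engine (an assumption of the sketch, not Spark behaviour), and the query is illustrative rather than taken from the PR's test files.

```python
import sqlite3

# SQLite stands in for Spark SQL here (assumption); only the syntax is shown.
conn = sqlite3.connect(":memory:")

# Two aliases rename the two output columns positionally: x -> 1, y -> 2.
# Selecting them in reverse order checks that each alias maps to its own column.
cur = conn.execute(
    "WITH t(x, y) AS (SELECT 1, 2) "
    "SELECT y, x FROM t WHERE x = 1 AND y = 2"
)
names = [d[0] for d in cur.description]
rows = cur.fetchall()

print(names, rows)
conn.close()
```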

@gatorsmile
Member

Sorry for my late response. @peter-toth @dongjoon-hyun Could you submit a follow-up PR to improve it?

@peter-toth
Contributor Author

@gatorsmile @dongjoon-hyun I opened #24949 to add some new test cases. Please let me know if you want more cases.

Please note that I'm also working on #24860 and it will add many new tests that cover WITH column aliases.
