@amaliujia
Contributor

What changes were proposed in this pull request?

Commit #38029 was intended to do the right thing: it checks CTEs more aggressively, even when a CTE is not used, which is fine. However, it triggers an existing issue. A subquery validates itself in isolation, so if the subquery references a CTE that is defined outside of the subquery, the check fails because the CTE is not found (e.g. key not found).

The sequence is:

The commit checks more, so in the repro examples every CTE is now checked (previously only used CTEs were checked).

One of the CTEs that is now checked in the example contains a subquery.

That subquery references another CTE, which is defined outside of the subquery.

The subquery validates itself and therefore fails because the CTE is not found.

This PR fixes the issue by removing the subquery self-validation in the CTE case.
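
To make the failure mode concrete, here is a minimal toy sketch in Scala. It is not Spark's analyzer code: `CteScopeSketch`, `Plan`, `CTERef`, `Subquery`, `WithCTE`, `check`, and the `selfValidateSubquery` flag are all hypothetical names invented for illustration. The sketch only assumes what the description above says: every CTE definition is checked, and a subquery that validates itself in isolation cannot see CTEs defined outside of it.

```scala
// Toy model only; none of these names exist in Spark.
object CteScopeSketch {
  sealed trait Plan
  case class Literal(value: Int) extends Plan                          // e.g. SELECT 1
  case class CTERef(name: String) extends Plan                         // reference to a CTE by name
  case class Subquery(body: Plan) extends Plan                         // subquery wrapping a plan
  case class WithCTE(defs: Seq[(String, Plan)], body: Plan) extends Plan

  // Throws NoSuchElementException when a CTERef is not in scope.
  // selfValidateSubquery = true mimics a subquery checking itself with an
  // empty scope, so CTE definitions from the enclosing query are invisible.
  def check(plan: Plan, scope: Set[String], selfValidateSubquery: Boolean): Unit =
    plan match {
      case Literal(_) => ()
      case CTERef(name) =>
        if (!scope.contains(name)) throw new NoSuchElementException(s"key not found: $name")
      case Subquery(body) =>
        val inner = if (selfValidateSubquery) Set.empty[String] else scope
        check(body, inner, selfValidateSubquery)
      case WithCTE(defs, body) =>
        // Every definition is checked, used or not; earlier CTEs are visible to later ones.
        val finalScope = defs.foldLeft(scope) { case (s, (name, definition)) =>
          check(definition, s, selfValidateSubquery)
          s + name
        }
        check(body, finalScope, selfValidateSubquery)
    }

  // Shape of the repro: cte2 is unused but contains a subquery over cte1.
  val plan: Plan = WithCTE(
    Seq("cte1" -> Literal(1), "cte2" -> Subquery(CTERef("cte1"))),
    CTERef("cte1"))
}
```

In this toy model, `CteScopeSketch.check(CteScopeSketch.plan, Set.empty, selfValidateSubquery = true)` throws because the subquery inside `cte2` cannot see `cte1`, while the same call with `selfValidateSubquery = false` completes without error, which corresponds to dropping the self-validation.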

Why are the changes needed?

This fixes a regression where

    val df = sql("""
                   |    WITH
                   |    cte1 as (SELECT 1 col1),
                   |    cte2 as (SELECT (SELECT MAX(col1) FROM cte1))
                   |    SELECT * FROM cte1
                   |""".stripMargin
    )
    checkAnswer(df, Row(1) :: Nil)

can no longer pass the analyzer.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Unit tests (UT).

@github-actions github-actions bot added the SQL label Jan 5, 2023
@amaliujia
Contributor Author

@cloud-fan

@cloud-fan
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 89666d4 Jan 6, 2023