-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-32237][SQL] Resolve hint in CTE #29062
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -200,18 +200,18 @@ class Analyzer( | |
| val postHocResolutionRules: Seq[Rule[LogicalPlan]] = Nil | ||
|
|
||
| lazy val batches: Seq[Batch] = Seq( | ||
| Batch("Substitution", fixedPoint, | ||
| CTESubstitution, | ||
| WindowsSubstitution, | ||
| EliminateUnions, | ||
| new SubstituteUnresolvedOrdinals(conf)), | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I checked the order of analysis rules in branch-2.4.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's hard to find which change causing it for me since 2.4 and 3.0 have many differences that I don't know. |
||
| Batch("Disable Hints", Once, | ||
| new ResolveHints.DisableHints(conf)), | ||
| Batch("Hints", fixedPoint, | ||
| new ResolveHints.ResolveJoinStrategyHints(conf), | ||
| new ResolveHints.ResolveCoalesceHints(conf)), | ||
| Batch("Simple Sanity Check", Once, | ||
| LookupFunctions), | ||
| Batch("Substitution", fixedPoint, | ||
| CTESubstitution, | ||
| WindowsSubstitution, | ||
| EliminateUnions, | ||
| new SubstituteUnresolvedOrdinals(conf)), | ||
|
Comment on lines
202
to
-214
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Seems cool : ) |
||
| Batch("Resolution", fixedPoint, | ||
| ResolveTableValuedFunctions :: | ||
| ResolveNamespace(catalogManager) :: | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -26,7 +26,7 @@ import org.apache.log4j.Level | |
| import org.scalatest.Matchers | ||
|
|
||
| import org.apache.spark.api.python.PythonEvalType | ||
| import org.apache.spark.sql.catalyst.TableIdentifier | ||
| import org.apache.spark.sql.catalyst.{AliasIdentifier, TableIdentifier} | ||
| import org.apache.spark.sql.catalyst.catalog.{CatalogStorageFormat, CatalogTable, CatalogTableType, InMemoryCatalog, SessionCatalog} | ||
| import org.apache.spark.sql.catalyst.dsl.expressions._ | ||
| import org.apache.spark.sql.catalyst.dsl.plans._ | ||
|
|
@@ -895,4 +895,27 @@ class AnalysisSuite extends AnalysisTest with Matchers { | |
| assertAnalysisError(testRelation2.select(RowNumber() + 1), | ||
| Seq("Window function row_number() requires an OVER clause.")) | ||
| } | ||
|
|
||
| test("SPARK-32237: Hint in CTE") { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could you add end-2-end tests, too, e.g., in |
||
| val plan = With( | ||
| Project( | ||
| Seq(UnresolvedAttribute("cte.a")), | ||
| UnresolvedRelation(TableIdentifier("cte")) | ||
| ), | ||
| Seq( | ||
| ( | ||
| "cte", | ||
| SubqueryAlias( | ||
| AliasIdentifier("cte"), | ||
| UnresolvedHint( | ||
| "REPARTITION", | ||
| Seq(Literal(3)), | ||
| Project(testRelation.output, testRelation) | ||
| ) | ||
| ) | ||
| ) | ||
| ) | ||
| ) | ||
| assertAnalysisSuccess(plan) | ||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -3560,6 +3560,18 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark | |
| } | ||
| } | ||
| } | ||
|
|
||
| test("SPARK-32237: Hint in CTE") { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Like JoinHintSuite, we should have our own hint-specific suite. It can help us understand the test coverage of SQL HINT.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure. I will add a new suite. |
||
| withTable("t") { | ||
| sql("CREATE TABLE t USING PARQUET AS SELECT 1 AS id") | ||
| checkAnswer( | ||
| sql(s""" | ||
| |WITH cte AS (SELECT /*+ REPARTITION(3) */ * FROM t) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. as long as we have a hint in CTE, it will fail in 3.0? can you add more test cases?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes. The same issue reported by multiple times in Jira with different use cases, like SPARK-32237, SPARK-32347, SPARK-32535, I will file a new PR to add more unit tests. |
||
| |SELECT * FROM cte | ||
| """.stripMargin), | ||
| Row(1) :: Nil) | ||
| } | ||
| } | ||
| } | ||
|
|
||
| case class Foo(bar: Option[String]) | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we have to execute the batch Substitution before the batch ResolveHints? What is the root cause? I am unable to tell it from either PR description or the code comments. Could any of you explain it?
cc @maryannxue
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know which PR causes this regression problem since there is a big gap between 2.4 and 3.0 about CTE. I am not familiar with this part.
I fixed this by debuging. In 3.0, I found only the
child of CTEcould enter into the rule ofResolveCoalesceHints. For the above test case:Only this child
Projectbranch could enter into the rule ofResolveCoalesceHints