[SPARK-17751] [SQL] Remove spark.sql.eagerAnalysis and Output the Plan if Existed in AnalysisException#15316
[SPARK-17751] [SQL] Remove spark.sql.eagerAnalysis and Output the Plan if Existed in AnalysisException#15316gatorsmile wants to merge 8 commits into
Conversation
|
Test build #66189 has finished for PR 15316 at commit
|
|
Huh, when did we remove eager analysis? |
|
@hvanhovell It was removed when we migrated DataFrame to Dataset. For details, see the PR: #11443 |
|
Hmmmm. I did kinda like this feature. @liancheng why did we remove this? |
|
Shouldn't we bring it back? This is an important feature when there is a bug. |
|
To @hvanhovell @rxin @liancheng listed the reason in the original PR.
However, there is a bug in AnslysisException output. It does not output the plan. I just fixed it. See the updated PR description. |
|
Do we need a test for this fix? |
|
Test build #66203 has finished for PR 15316 at commit
|
|
Test build #66222 has finished for PR 15316 at commit
|
|
Test build #66221 has finished for PR 15316 at commit
|
| } | ||
|
|
||
| override def getMessage: String = { | ||
| val planAnnotation = plan.map(p => s";\n$p").getOrElse("") |
There was a problem hiding this comment.
Why do we need a separate method here?
There was a problem hiding this comment.
When hitting an AnalysisException, we called getMessage in SQLQueryTestSuite.scala. That means, we also output the logical plan in the .sql.out when hitting an AnalysisException. The logical plan contains expression IDs. Thus, when we doing the text compare between the query results with the contents in .sql.out, it might fail due to the mismatch of expression IDs. Thus, I added an extra function here. Maybe I need to add a comment to say this is for testing only.
|
Test build #66668 has finished for PR 15316 at commit
|
|
retest this please |
|
Test build #66687 has finished for PR 15316 at commit
|
|
cc @hvanhovell @rxin Any more comment about this PR? I assume Spark 2.0.2 needs it. Recently, when we analyzing the JIRA https://issues.apache.org/jira/browse/SPARK-17709, we are unable to see the plan due the analyzer failure. The users have to manually rebuild it with this fix. Then, we can see the failed analyzed plan. |
|
Also cc @cloud-fan and @liancheng |
|
LGTM, cc @hvanhovell @rxin for final sign-off |
|
LGTM |
|
Test build #67058 has finished for PR 15316 at commit
|
|
LGTM. Merging to master. Thanks! |
|
@gatorsmile I cannot merge this 2.0. Can you open a backport for this? |
|
Sure, will do it soon. Thanks! |
…utput the Plan if Existed in AnalysisException ### What changes were proposed in this pull request? This PR is to backport the fix #15316 to 2.0. Dataset always does eager analysis now. Thus, `spark.sql.eagerAnalysis` is not used any more. Thus, we need to remove it. This PR also outputs the plan. Without the fix, the analysis error is like ``` cannot resolve '`k1`' given input columns: [k, v]; line 1 pos 12 ``` After the fix, the analysis error becomes: ``` org.apache.spark.sql.AnalysisException: cannot resolve '`k1`' given input columns: [k, v]; line 1 pos 12; 'Project [unresolvedalias(CASE WHEN ('k1 = 2) THEN 22 WHEN ('k1 = 4) THEN 44 ELSE 0 END, None), v#6] +- SubqueryAlias t +- Project [_1#2 AS k#5, _2#3 AS v#6] +- LocalRelation [_1#2, _2#3] ``` ### How was this patch tested? N/A Author: gatorsmile <gatorsmile@gmail.com> Closes #15529 from gatorsmile/eagerAnalysis20.
… if Existed in AnalysisException
### What changes were proposed in this pull request?
Dataset always does eager analysis now. Thus, `spark.sql.eagerAnalysis` is not used any more. Thus, we need to remove it.
This PR also outputs the plan. Without the fix, the analysis error is like
```
cannot resolve '`k1`' given input columns: [k, v]; line 1 pos 12
```
After the fix, the analysis error becomes:
```
org.apache.spark.sql.AnalysisException: cannot resolve '`k1`' given input columns: [k, v]; line 1 pos 12;
'Project [unresolvedalias(CASE WHEN ('k1 = 2) THEN 22 WHEN ('k1 = 4) THEN 44 ELSE 0 END, None), v#6]
+- SubqueryAlias t
+- Project [_1#2 AS k#5, _2#3 AS v#6]
+- LocalRelation [_1#2, _2#3]
```
### How was this patch tested?
N/A
Author: gatorsmile <gatorsmile@gmail.com>
Closes apache#15316 from gatorsmile/eagerAnalysis.
… if Existed in AnalysisException
### What changes were proposed in this pull request?
Dataset always does eager analysis now. Thus, `spark.sql.eagerAnalysis` is not used any more. Thus, we need to remove it.
This PR also outputs the plan. Without the fix, the analysis error is like
```
cannot resolve '`k1`' given input columns: [k, v]; line 1 pos 12
```
After the fix, the analysis error becomes:
```
org.apache.spark.sql.AnalysisException: cannot resolve '`k1`' given input columns: [k, v]; line 1 pos 12;
'Project [unresolvedalias(CASE WHEN ('k1 = 2) THEN 22 WHEN ('k1 = 4) THEN 44 ELSE 0 END, None), v#6]
+- SubqueryAlias t
+- Project [_1#2 AS k#5, _2#3 AS v#6]
+- LocalRelation [_1#2, _2#3]
```
### How was this patch tested?
N/A
Author: gatorsmile <gatorsmile@gmail.com>
Closes apache#15316 from gatorsmile/eagerAnalysis.
What changes were proposed in this pull request?
Dataset always does eager analysis now. Thus,
spark.sql.eagerAnalysisis not used any more. Thus, we need to remove it.This PR also outputs the plan. Without the fix, the analysis error is like
After the fix, the analysis error becomes:
How was this patch tested?
N/A