-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-21055][SQL] replace grouping__id with grouping_id() #18270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #77893 has finished for PR 18270 at commit
|
|
Test build #77913 has started for PR 18270 at commit |
|
why failed? |
|
Retest this please. |
|
Test build #78305 has started for PR 18270 at commit |
|
i will retrigger this once jenkins restart |
|
test this please |
|
Thank you, @shaneknapp . |
|
Test build #78312 has finished for PR 18270 at commit
|
|
Test build #78582 has finished for PR 18270 at commit
|
|
Retest this please. |
|
Ping, @cenyuhai . |
|
Test build #80086 has finished for PR 18270 at commit
|
|
Jenkins, retest this please. |
|
Test build #80905 has finished for PR 18270 at commit
|
|
@cenyuhai |
|
I realize the reason that leads to UTs failure is that the query result has a fixed order even though a sql statement doesn't include |
|
@cenyuhai Could you update this PR? I will review it then. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we do all the changes you made in this file in the rule ResolveFunctions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I can do it,because ResolveFunctions is behind ResolveGroupingAnalytics
|
Ok,I will update it |
|
Test build #81257 has finished for PR 18270 at commit
|
|
Test build #81258 has finished for PR 18270 at commit
|
|
@gatorsmile |
|
retest this please |
|
@jinxing64 #10677 made the changes. Hive generates a wrong result. See the JIRA opened by Davies: https://issues.apache.org/jira/browse/HIVE-12833 |
|
Thank you so much ! |
|
@gatorsmile I had already tried to resolve grouping__id in ResolveFunctions. But ResolveFunctions is behind ResolveGroupingAnalytics. grouping__id may change in ResolveGroupingAnalytics. |
|
Test build #81342 has finished for PR 18270 at commit
|
|
@jinxing64 I think you may revert the changes in Spark, and use the same logic of grouping__id as hive. Keep the wrong result consistently as hive did. |
|
Test build #81345 has finished for PR 18270 at commit
|
|
Thanks for notification. Actually we implement the same logic with hive, though there's a bug ... |
| -- !query 16 output | ||
| org.apache.spark.sql.AnalysisException | ||
| grouping__id is deprecated; use grouping_id() instead; | ||
| Java 2012 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you manually editing this group-analytics.sql.out? The test failure is due to mismatching between spaces and tab. Please generate the output file with the instructions in SQLQueryTestSuite and don't edit it manually.
| try { | ||
| expr transformUp { | ||
| case GetColumnByOrdinal(ordinal, _) => plan.output(ordinal) | ||
| case u @ UnresolvedAttribute(nameParts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change looks suspicious. Doesn't ResolveMissingReferences resolve grouping_id used in order by?
| VirtualColumn.hiveGroupingIdName)() | ||
| } | ||
| } | ||
| case u @ UnresolvedAttribute(nameParts) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just need to add if !resolver(u.name, VirtualColumn.hiveGroupingIdName) here
| VirtualColumn.hiveGroupingIdName)() | ||
| } | ||
| } | ||
| case u @ UnresolvedAttribute(nameParts) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here too.
|
Test build #82540 has finished for PR 18270 at commit
|
|
Test build #82543 has finished for PR 18270 at commit
|
|
@cenyuhai Could you also address this comment: https://github.com/apache/spark/pull/18270/files#r136121931? |
|
retest this please |
|
Test build #82925 has finished for PR 18270 at commit
|
|
LGTM Let us resolve the issue as the follow-up PR. |
|
Thanks! Merged to master. |
## What changes were proposed in this pull request? Simplifies the test cases that were added in the PR #18270. ## How was this patch tested? N/A Author: gatorsmile <[email protected]> Closes #19546 from gatorsmile/backportSPARK-21055.
What changes were proposed in this pull request?
spark does not support grouping__id, it has grouping_id() instead.
But it is not convenient for hive user to change to spark-sql
so this pr is to replace grouping__id with grouping_id()
hive user need not to alter their scripts
How was this patch tested?
test with SQLQuerySuite.scala