@cenyuhai
Contributor

What changes were proposed in this pull request?

Spark does not support grouping__id; it provides grouping_id() instead.
That makes it inconvenient for Hive users to move to Spark SQL,
so this PR replaces grouping__id with grouping_id().
Hive users then do not need to alter their scripts.
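
To make the idea concrete, here is a minimal sketch of such a rewrite, written against a tiny hypothetical expression tree rather than Spark's Catalyst classes. The names Expr, UnresolvedAttr, FunctionCall, Literal, GroupingIdRewrite, and the case-insensitive resolver are all assumptions for illustration; the actual patch may be structured quite differently.

```scala
// Hypothetical mini expression tree; these are NOT Spark's Catalyst classes.
sealed trait Expr
case class UnresolvedAttr(name: String) extends Expr
case class FunctionCall(name: String, args: List[Expr]) extends Expr
case class Literal(value: Int) extends Expr

object GroupingIdRewrite {
  // Hive's virtual column name (note the two underscores).
  val HiveGroupingIdName = "grouping__id"

  // Case-insensitive name comparison, assumed here for illustration.
  def resolver(a: String, b: String): Boolean = a.equalsIgnoreCase(b)

  // Recursively replace every reference to grouping__id with a grouping_id() call.
  def rewrite(e: Expr): Expr = e match {
    case UnresolvedAttr(n) if resolver(n, HiveGroupingIdName) =>
      FunctionCall("grouping_id", Nil)
    case FunctionCall(n, args) =>
      FunctionCall(n, args.map(rewrite))
    case other =>
      other
  }
}
```

With a rewrite of this kind, an expression such as if(grouping__id, 1) would come out as if(grouping_id(), 1), while other column references are left untouched.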

How was this patch tested?

Tested with SQLQuerySuite.scala.

@SparkQA

SparkQA commented Jun 11, 2017

Test build #77893 has finished for PR 18270 at commit 6fd567c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 12, 2017

Test build #77913 has started for PR 18270 at commit f532d9f.

@cenyuhai
Contributor Author

cenyuhai commented Jun 12, 2017

Why did it fail?

@dongjoon-hyun
Member

Retest this please.

@SparkQA

SparkQA commented Jun 20, 2017

Test build #78305 has started for PR 18270 at commit f532d9f.

@shaneknapp
Contributor

I will retrigger this once Jenkins restarts.

@shaneknapp
Contributor

test this please

@dongjoon-hyun
Member

Thank you, @shaneknapp .

@SparkQA

SparkQA commented Jun 20, 2017

Test build #78312 has finished for PR 18270 at commit f532d9f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 25, 2017

Test build #78582 has finished for PR 18270 at commit 3b361d7.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

Retest this please.

@dongjoon-hyun
Member

Ping, @cenyuhai .

@SparkQA

SparkQA commented Jul 31, 2017

Test build #80086 has finished for PR 18270 at commit 3b361d7.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jinxing64

Jenkins, retest this please.

@SparkQA

SparkQA commented Aug 20, 2017

Test build #80905 has finished for PR 18270 at commit 3b361d7.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jinxing64

@cenyuhai
Are you still working on this? Could you please fix the test?

@YannByron

I think the unit tests fail because the expected query result has a fixed row order even though the SQL statement has no ORDER BY, for example the output of query 16 in group-analytics.sql.out.
That order is not the same as the one you get when you run the statement through spark-sql.
Just modifying the output order will work.
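
As a side note on why row order matters here: without an ORDER BY, a query is free to return its rows in any order, so a line-by-line comparison against a golden file can fail even when the results are equivalent. A minimal sketch of an order-insensitive comparison follows. ResultCompare and sameResult are hypothetical names, and this is not a description of how SQLQueryTestSuite actually compares output.

```scala
object ResultCompare {
  // Compare two result sets rendered as strings, one string per row.
  // Row order is only significant when the query sorts its output.
  def sameResult(expected: Seq[String], actual: Seq[String], hasOrderBy: Boolean): Boolean =
    if (hasOrderBy) expected == actual
    else expected.sorted == actual.sorted
}
```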

@gatorsmile
Member

@cenyuhai Could you update this PR? I will review it then.

Member

Could we do all the changes you made in this file in the rule ResolveFunctions?

Contributor Author

I don't think I can do that, because ResolveFunctions runs after ResolveGroupingAnalytics.

@cenyuhai
Contributor Author

OK, I will update it.

@SparkQA

SparkQA commented Aug 30, 2017

Test build #81257 has finished for PR 18270 at commit 1423875.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 30, 2017

Test build #81258 has finished for PR 18270 at commit e49742b.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jinxing64

jinxing64 commented Aug 30, 2017

@gatorsmile
Could you please explain why the value of grouping_id() generated in Spark is different from grouping__id in Hive? Is that by design? A lot of our users use grouping__id in if(...) clauses. The incompatibility between Spark and Hive is making our migration very difficult.

@cenyuhai reopened this Aug 30, 2017

@cenyuhai
Contributor Author

retest this please

@gatorsmile
Member

@jinxing64 #10677 made the changes. Hive generates a wrong result. See the JIRA opened by Davies: https://issues.apache.org/jira/browse/HIVE-12833
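
For readers comparing the two numbering schemes, here is a small sketch. The Spark side follows the documented definition of grouping_id(), namely (grouping(c1) << (n-1)) + ... + grouping(cn), where grouping(c) is 1 when the column is aggregated away. The Hive side is an assumption based on my reading of the older (pre-2.3.0) Hive documentation, where a bit is set when the column is present in the grouping set and the first column is the least significant bit; please check it against HIVE-12833 before relying on it. GroupingIds, sparkStyle, and oldHiveStyle are hypothetical names.

```scala
object GroupingIds {
  // Spark-style grouping_id(): a bit is 1 when the column is NOT in the grouping set
  // (it is aggregated away); the first GROUP BY column is the most significant bit.
  def sparkStyle(groupBy: Seq[String], groupingSet: Set[String]): Int =
    groupBy.foldLeft(0)((acc, c) => (acc << 1) | (if (groupingSet(c)) 0 else 1))

  // Older Hive-style grouping__id, AS ASSUMED HERE: a bit is 1 when the column IS in
  // the grouping set; the first GROUP BY column is the least significant bit.
  def oldHiveStyle(groupBy: Seq[String], groupingSet: Set[String]): Int =
    groupBy.zipWithIndex.foldLeft(0) { case (acc, (c, i)) =>
      if (groupingSet(c)) acc | (1 << i) else acc
    }
}
```

Under these assumptions, for GROUP BY key, value WITH ROLLUP the fully grouped rows get 0 in the Spark scheme and 3 in the older Hive scheme, and the grand-total row gets 3 and 0 respectively, which is why an if(grouping__id = ...) condition cannot simply be carried over.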

@jinxing64

Thank you so much !

@cenyuhai
Contributor Author

cenyuhai commented Sep 2, 2017

@gatorsmile I already tried to resolve grouping__id in ResolveFunctions, but ResolveFunctions runs after ResolveGroupingAnalytics, and grouping__id may change in ResolveGroupingAnalytics.

@SparkQA

SparkQA commented Sep 2, 2017

Test build #81342 has finished for PR 18270 at commit 059d486.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cenyuhai
Contributor Author

cenyuhai commented Sep 2, 2017

@jinxing64 I think you could revert those changes in Spark and use the same grouping__id logic as Hive, so that the (wrong) result stays consistent with Hive.

@SparkQA

SparkQA commented Sep 2, 2017

Test build #81345 has finished for PR 18270 at commit e4d6d48.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jinxing64

Thanks for the notification. Actually, we implemented the same logic as Hive, though there's a bug ...

-- !query 16 output
org.apache.spark.sql.AnalysisException
grouping__id is deprecated; use grouping_id() instead;
Java 2012 0
Member

@viirya viirya Sep 3, 2017

Are you manually editing group-analytics.sql.out? The test failure is due to a mismatch between spaces and tabs. Please generate the output file by following the instructions in SQLQueryTestSuite, and don't edit it manually.

try {
expr transformUp {
case GetColumnByOrdinal(ordinal, _) => plan.output(ordinal)
case u @ UnresolvedAttribute(nameParts)
Member

This change looks suspicious. Doesn't ResolveMissingReferences resolve grouping_id used in order by?

VirtualColumn.hiveGroupingIdName)()
}
}
case u @ UnresolvedAttribute(nameParts) =>
Member

You just need to add if !resolver(u.name, VirtualColumn.hiveGroupingIdName) here.

VirtualColumn.hiveGroupingIdName)()
}
}
case u @ UnresolvedAttribute(nameParts) =>
Member

and here too.

@SparkQA

SparkQA commented Oct 8, 2017

Test build #82540 has finished for PR 18270 at commit 1202bfa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 8, 2017

Test build #82543 has finished for PR 18270 at commit eac37f0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cenyuhai
Contributor Author

cenyuhai commented Oct 9, 2017

@gatorsmile

@gatorsmile
Member

retest this please

@SparkQA

SparkQA commented Oct 20, 2017

Test build #82925 has finished for PR 18270 at commit eac37f0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member

LGTM

Let us resolve the issue in a follow-up PR.

@gatorsmile
Member

Thanks! Merged to master.

@asfgit closed this in 16c9cc6 Oct 20, 2017

@asfgit pushed a commit that referenced this pull request Oct 21, 2017

## What changes were proposed in this pull request?
Simplifies the test cases that were added in the PR #18270.

## How was this patch tested?
N/A

Author: gatorsmile <[email protected]>

Closes #19546 from gatorsmile/backportSPARK-21055.