@bkrieger bkrieger commented Jun 7, 2018

What changes were proposed in this pull request?

Currently, the Analyzer throws an exception if you try to nest a generator. However, it special-cases a generator "nested" in a single alias and allows that. If you alias a generator twice, the special case does not match, so an exception is thrown.

This PR trims the unnecessary, non-top-level aliases, so that the generator is allowed.
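The idea can be sketched on a toy expression tree. This is an illustration only: `Expr`, `Attr`, `Explode`, `Add`, `stripAliases`, and `hasNestedGeneratorOld` are hypothetical stand-ins invented here, not Spark's Catalyst classes, and the real `CleanupAliases.trimNonTopLevelAliases` may differ in detail.

```scala
// Toy model only: these types mimic, but are not, Spark's Catalyst classes.
sealed trait Expr
case class Attr(name: String) extends Expr
case class Explode(child: Expr) extends Expr              // stands in for a Generator
case class Alias(child: Expr, name: String) extends Expr
case class Add(left: Expr, right: Expr) extends Expr

object NestedGeneratorSketch {
  // Remove every Alias node found anywhere in the tree.
  def stripAliases(e: Expr): Expr = e match {
    case Alias(child, _) => stripAliases(child)
    case Explode(child)  => Explode(stripAliases(child))
    case Add(l, r)       => Add(stripAliases(l), stripAliases(r))
    case a: Attr         => a
  }

  // Keep the outermost Alias (it names the output column); drop aliases below it.
  def trimNonTopLevelAliases(e: Expr): Expr = e match {
    case Alias(child, name) => Alias(stripAliases(child), name)
    case other              => stripAliases(other)
  }

  // True if a generator appears anywhere in the tree.
  def hasGenerator(e: Expr): Boolean = e match {
    case _: Explode      => true
    case Alias(child, _) => hasGenerator(child)
    case Add(l, r)       => hasGenerator(l) || hasGenerator(r)
    case _: Attr         => false
  }

  // Before the change: only one Alias directly over a generator is allowed.
  def hasNestedGeneratorOld(e: Expr): Boolean = e match {
    case Alias(_: Explode, _) => false
    case other                => hasGenerator(other)
  }

  // After the change: trim inner aliases first, then apply the same check.
  def hasNestedGenerator(e: Expr): Boolean = trimNonTopLevelAliases(e) match {
    case Alias(_: Explode, _) => false
    case other                => hasGenerator(other)
  }
}
```

In this sketch, `Alias(Alias(Explode(Attr("xs")), "a"), "b")` is reported as nested by `hasNestedGeneratorOld` but accepted by `hasNestedGenerator`, while a generator buried inside `Add` is still rejected by both.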

How was this patch tested?

New tests in AnalysisSuite.

@hvanhovell
Contributor

ok to test

case MultiAlias(_: Generator, _) => false
case other => hasGenerator(other)
private def hasNestedGenerator(expr: NamedExpression): Boolean = {
CleanupAliases.trimNonTopLevelAliases(expr) match {
Contributor

CleanupAliases.trimNonTopLevelAliases only strips Alias expressions. Should we also handle the other two cases?

Author

Updated to handle the MultiAlias and UnresolvedAlias, and updated the unit test to test all 3.

SparkQA commented Jun 8, 2018

Test build #91564 has finished for PR 21508 at commit 44ae34d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jun 8, 2018

Test build #91568 has finished for PR 21508 at commit 46c4a55.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jun 8, 2018

Test build #91569 has finished for PR 21508 at commit f174263.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

}
}

def trimNonTopLevelAliases(e: Expression): Expression = e match {
Member

Instead of duplicating the function here, could we just fix CleanupAliases.trimNonTopLevelAliases?

Author

Sure. I didn't want to break any existing functionality, but I can do that instead.

Author

done

}

def trimNonTopLevelAliases(e: Expression): Expression = e match {
case a: UnresolvedAlias =>
Member

Do we need to handle UnresolvedAlias?

Author

In my use case, no. But I wasn't sure if another use case would care.

Author

I don't think it'll hurt to handle it.

SparkQA commented Jun 11, 2018

Test build #91665 has finished for PR 21508 at commit abd1457.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

case other => hasGenerator(other)
private def hasNestedGenerator(expr: NamedExpression): Boolean = {
CleanupAliases.trimNonTopLevelAliases(expr) match {
case UnresolvedAlias(_: Generator, _) => false
Member

If we do not have a valid case here, we should not add it. Here, I think we just need to handle the resolved alias.

Author

hasNestedGenerator already handled UnresolvedAlias. I'll change CleanupAliases back to only handling resolved aliases.

Author

done
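The top-level match discussed in this thread can be sketched on toy types. This is an assumption-laden illustration, not Spark's code: `Expr`, `Attr`, and `Explode` are hypothetical stand-ins, the three wrapper classes only echo the names used in the PR, and the alias-trimming step is deliberately left out so that only the match itself is shown.

```scala
// Toy model only; names echo the PR but these are not Spark's Catalyst classes.
sealed trait Expr
case class Attr(name: String) extends Expr
case class Explode(child: Expr) extends Expr                        // stand-in for a Generator
case class Alias(child: Expr, name: String) extends Expr            // resolved, one output name
case class MultiAlias(child: Expr, names: Seq[String]) extends Expr // resolved, several names
case class UnresolvedAlias(child: Expr) extends Expr                // name not assigned yet

object TopLevelAliasSketch {
  // True if a generator appears anywhere in the tree.
  def hasGenerator(e: Expr): Boolean = e match {
    case _: Explode         => true
    case Alias(c, _)        => hasGenerator(c)
    case MultiAlias(c, _)   => hasGenerator(c)
    case UnresolvedAlias(c) => hasGenerator(c)
    case _: Attr            => false
  }

  // A generator sitting directly under any one of the three wrappers is allowed;
  // every other shape that contains a generator counts as nested.
  def hasNestedGenerator(e: Expr): Boolean = e match {
    case UnresolvedAlias(_: Explode) => false
    case Alias(_: Explode, _)        => false
    case MultiAlias(_: Explode, _)   => false
    case other                       => hasGenerator(other)
  }
}
```

Keeping the `UnresolvedAlias` case in the match, rather than in the trimming helper, mirrors the split described above: the helper only needs to understand resolved aliases, and the match decides which top-level wrappers are acceptable.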

SparkQA commented Jun 11, 2018

Test build #91674 has finished for PR 21508 at commit 5d5e8e5.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@bkrieger
Author

The test failure looks like a flake to me?

SparkQA commented Jun 12, 2018

Test build #91725 has finished for PR 21508 at commit e9605dc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • logInfo(s\"Using output committer class $
  • public class JavaPowerIterationClusteringExample
  • class PowerIterationClustering(HasMaxIter, HasWeightCol, JavaParams, JavaMLReadable,

@bkrieger
Author

@gatorsmile @hvanhovell is this good to merge?

@bkrieger
Author

@gatorsmile @hvanhovell can you take a last look at this? I think it's good to merge.

@bkrieger
Author

@gatorsmile @hvanhovell Gentle ping. Let me know if there's someone else who would be better to review.

@bkrieger
Author

@gatorsmile @hvanhovell can you take another look at this?

@mccheah
Contributor

mccheah commented Jun 25, 2018

@gatorsmile @hvanhovell, I'm working with @bkrieger and we need this patch soon. May we please get a sign off or else any suggested changes here?

SparkQA commented Jul 17, 2018

Test build #93177 has finished for PR 21508 at commit 3021918.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@bkrieger
Author

@gatorsmile @hvanhovell any chance you can take a look at this?

@gatorsmile
Member

cc @maropu Help review this?

Contributor

@hvanhovell hvanhovell left a comment

LGTM, merging to master. Sorry for the hold-up.

@asfgit asfgit closed this in 597bdef Jul 20, 2018
@bkrieger bkrieger deleted the bk/SPARK-24488 branch July 23, 2018 13:48
bkrieger pushed a commit to bkrieger/spark that referenced this pull request Jul 23, 2018
Currently, the Analyzer throws an exception if you try to nest a generator. However, it special cases generators "nested" in an alias, and allows that. If you try to alias a generator twice, it is not caught by the special case, so an exception is thrown.

This PR trims the unnecessary, non-top-level aliases, so that the generator is allowed.

new tests in AnalysisSuite.

Author: Brandon Krieger <[email protected]>

Closes apache#21508 from bkrieger/bk/SPARK-24488.