[SPARK-20708][CORE] Make addExclusionRules up-to-date#17947

Closed
dongjoon-hyun wants to merge 5 commits into apache:master from dongjoon-hyun:SPARK-20708

Conversation

@dongjoon-hyun
Member

What changes were proposed in this pull request?

Since SPARK-9263, resolveMavenCoordinates ignores Spark and Spark's dependencies by using addExclusionRules. This PR brings addExclusionRules up to date, because the current rules fail to exclude some components, like the following.

mllib (correct)

$ bin/spark-shell --packages org.apache.spark:spark-mllib_2.11:2.1.1
...
---------------------------------------------------------------------
|                  |            modules            ||   artifacts   |
|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
|      default     |   0   |   0   |   0   |   0   ||   0   |   0   |
---------------------------------------------------------------------

mllib-local (wrong)

$ bin/spark-shell --packages org.apache.spark:spark-mllib-local_2.11:2.1.1
...
---------------------------------------------------------------------
|                  |            modules            ||   artifacts   |
|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
|      default     |   15  |   2   |   2   |   0   ||   15  |   2   |
---------------------------------------------------------------------
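The difference between the two reports above can be sketched as a simple prefix match: Spark's default excludes are built from a per-component list, and an artifact escapes exclusion when no "spark-<component>" prefix matches it. The sketch below is an illustrative reconstruction, not Spark's actual code (the real logic builds Ivy exclude rules from these names), and oldComponents is only a partial example of the pre-PR list.

```scala
// Illustrative sketch: Spark skips its own artifacts by matching
// "spark-<component>" prefixes. With a stale list, spark-mllib-local
// matches nothing and gets resolved instead of excluded.
object ExclusionSketch {
  // Partial example of the pre-PR list; note mllib-local_ is absent.
  val oldComponents = Seq("catalyst_", "core_", "graphx_", "mllib_", "repl_",
    "sql_", "streaming_")

  // True when the artifact would be covered by an exclusion rule.
  def isExcluded(artifactId: String, components: Seq[String]): Boolean =
    components.exists(comp => artifactId.startsWith(s"spark-$comp"))
}
```

Under this sketch, isExcluded("spark-mllib_2.11", oldComponents) holds while isExcluded("spark-mllib-local_2.11", oldComponents) does not, which mirrors the two Ivy reports above.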

How was this patch tested?

Pass Jenkins with an updated test case.

@SparkQA

SparkQA commented May 11, 2017

Test build #76793 has finished for PR 17947 at commit 6ac403a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member Author

Hi, @srowen .
Could you review this PR about --packages?

@dongjoon-hyun
Member Author

Retest this please.

@SparkQA

SparkQA commented May 19, 2017

Test build #77098 has finished for PR 17947 at commit 6ac403a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member Author

Retest this please.

@SparkQA

SparkQA commented May 19, 2017

Test build #77103 has finished for PR 17947 at commit 6ac403a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987
Contributor

cc @brkyvz to verify the change.

@dongjoon-hyun
Member Author

Thank you, @jiangxb1987 .

@dongjoon-hyun
Member Author

Could you review this please, @brkyvz ?


  val coordinates =
-   components.map(comp => s"org.apache.spark:spark-${comp}2.10:1.2.0").mkString(",") +
+   components.map(comp => s"org.apache.spark:spark-${comp}2.10:2.1.1").mkString(",") +
Contributor


maybe also change the Scala version, since we will remove 2.10 soon

Member Author


Sure! Thank you for review.

  test("neglects Spark and Spark's dependencies") {
-   val components = Seq("catalyst_", "core_", "graphx_", "hive_", "mllib_", "repl_",
-     "sql_", "streaming_", "yarn_", "network-common_", "network-shuffle_", "network-yarn_")
+   val components = Seq("catalyst_", "core_", "graphx_", "hive_", "hive-thriftserver_",
Contributor


since the hive, mesos stuff is not in the assembly jar by default, I don't think we need to exclude those

Contributor


same for yarn

Member Author


I see. I'll remove hive, hive-thriftserver, mesos, and yarn.

Member Author


Maybe, network-yarn_, too.

@brkyvz
Contributor

brkyvz commented May 30, 2017

We shouldn't neglect dependencies that are not in the assembly jar by default. Otherwise thanks for doing this!

  // spark-streaming_2.1x and spark-streaming-kafka-0-8-assembly_2.1x
-   val components = Seq("catalyst_", "core_", "graphx_", "hive_", "mllib_", "repl_",
-     "sql_", "streaming_", "yarn_", "network-common_", "network-shuffle_", "network-yarn_")
+   val components = Seq("catalyst_", "core_", "graphx_", "launcher_", "mllib_", "mllib-local_",
Contributor


can we also make this a static variable and re-use in tests?

Contributor


otherwise it's hard to keep both in sync

Member Author


Sure!


  val coordinates =
-   components.map(comp => s"org.apache.spark:spark-${comp}2.10:1.2.0").mkString(",") +
+   components.map(comp => s"org.apache.spark:spark-${comp}2.11:2.1.1").mkString(",") +
Contributor


thanks!

@SparkQA

SparkQA commented May 31, 2017

Test build #77554 has finished for PR 17947 at commit bfa5238.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 31, 2017

Test build #77560 has finished for PR 17947 at commit d562ea2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member Author

Hi, @brkyvz .
Could you review this again?

// We need to specify each component explicitly, otherwise we miss spark-streaming-kafka-0-8 and
// other spark-streaming utility components. Underscore is there to differentiate between
// spark-streaming_2.1x and spark-streaming-kafka-0-8-assembly_2.1x
val components = Seq("catalyst_", "core_", "graphx_", "launcher_", "mllib_", "mllib-local_",
Contributor


since it's in global scope now, can we rename it to something more meaningful?
IVY_DEFAULT_EXCLUDES

Also add it to the docs, since it has lost the context it had from the method.

@dongjoon-hyun
Member Author

@brkyvz . Sure. I renamed it to IVY_DEFAULT_EXCLUDES and added more documentation about it.
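As a rough sketch of the shared-constant idea (names other than IVY_DEFAULT_EXCLUDES are hypothetical, and the list shown is abbreviated), the test can derive its --packages coordinate string from the same sequence the exclusion rules use, so the two cannot drift apart:

```scala
// Sketch: one shared list feeds both the exclusion rules and the test's
// coordinate string. buildTestCoordinates is illustrative, not Spark code.
object ExcludesTestSketch {
  // Abbreviated example; the real IVY_DEFAULT_EXCLUDES is longer.
  val IVY_DEFAULT_EXCLUDES = Seq("catalyst_", "core_", "graphx_", "mllib-local_")

  // Build "groupId:artifactId<scalaVer>:version" entries joined by commas,
  // the format --packages expects.
  def buildTestCoordinates(version: String, scalaVer: String): String =
    IVY_DEFAULT_EXCLUDES
      .map(comp => s"org.apache.spark:spark-$comp$scalaVer:$version")
      .mkString(",")
}
```

For example, buildTestCoordinates("2.1.1", "2.11") produces entries such as org.apache.spark:spark-catalyst_2.11:2.1.1, one per component in the shared list.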

@SparkQA

SparkQA commented Jun 1, 2017

Test build #77614 has finished for PR 17947 at commit 8b184e7.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 1, 2017

Test build #77615 has finished for PR 17947 at commit 2cf5400.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@brkyvz
Contributor

brkyvz commented Jun 1, 2017

Thanks @dongjoon-hyun LGTM. Merging to master

@asfgit asfgit closed this in 34661d8 Jun 1, 2017
@dongjoon-hyun
Member Author

Thank you so much, @brkyvz .

@dongjoon-hyun dongjoon-hyun deleted the SPARK-20708 branch January 7, 2019 07:04