@vanzin
Contributor

@vanzin vanzin commented Apr 7, 2015

Add exclusions and explicit dependencies so that the examples
assembly does not duplicate classes already packaged in the main
assembly.

Also avoid relocating the commons-math3 package since it's already
a dependency of spark-core, and thus is already available in the
main assembly.
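
As a rough illustration of the two techniques described above (marking a dependency as provided, and excluding a transitive dependency), a Maven fragment might look like the sketch below. This is not the actual diff from this PR: the `com.example` coordinates are hypothetical placeholders, and the fragment assumes a parent POM manages the `scala-library` version.

```xml
<!-- Illustrative sketch only; not taken from the Spark examples pom.xml. -->
<dependencies>
  <!-- Needed to compile, but already shipped in the main assembly,
       so "provided" scope keeps it out of this module's assembly.
       Version is assumed to come from a parent dependencyManagement section. -->
  <dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <scope>provided</scope>
  </dependency>

  <!-- Hypothetical third-party library used only by the examples. -->
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>some-datastore-client</artifactId>
    <version>1.0.0</version>
    <exclusions>
      <!-- Hypothetical transitive dependency that the main assembly
           (or the cluster) already supplies, so it is not bundled again. -->
      <exclusion>
        <groupId>com.example</groupId>
        <artifactId>already-in-main-assembly</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
</dependencies>
```

An exclusion only drops the artifact from this dependency's transitive graph; if the examples code itself compiles against the excluded classes, it would also need an explicit dependency on that artifact with `provided` scope.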

@vanzin
Contributor Author

vanzin commented Apr 7, 2015

Locally, with a few *-provided profiles enabled, the examples assembly shrank from over 80MB to about 16MB.

@SparkQA

SparkQA commented Apr 7, 2015

Test build #29763 has started for PR 5379 at commit 12c258e.

@vanzin vanzin changed the title from "[minor] [examples] Avoid re-packaging unneeded classes." to "[minor] [examples] Avoid packaging duplicate classes." Apr 7, 2015
@srowen
Member

srowen commented Apr 7, 2015

Individually, those changes seem believable, such as marking Scala as provided and not including the Spark-shaded classes. Most of them affect the Cassandra dependency; is the logic there that those classes are definitely provided by Spark?

@SparkQA

SparkQA commented Apr 7, 2015

Test build #29763 has finished for PR 5379 at commit 12c258e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29763/

@vanzin
Contributor Author

vanzin commented Apr 7, 2015

Yes, I checked all the exclusions I'm adding; they're either direct dependencies of Spark (and thus can be provided), or are transitive (e.g. Hadoop or some other Spark dependency pulls them in), so the assembly (or the cluster where the assembly is being run) is expected to provide them.

@srowen
Member

srowen commented Apr 8, 2015

This LGTM as that is a huge decrease in the size of the examples JAR. I'll leave it open for comments one more day.

@asfgit asfgit closed this in 470d745 Apr 9, 2015
@vanzin vanzin deleted the examples-deps branch April 10, 2015 21:30
vanzin pushed a commit to vanzin/spark that referenced this pull request Apr 20, 2015
Add exclusions and explicit dependencies so that the examples
assembly does not duplicate classes already packaged in the main
assembly.

Also avoid relocating the commons-math3 package since it's already
a dependency of spark-core, and thus is already available in the
main assembly.

Author: Marcelo Vanzin <[email protected]>

Closes apache#5379 from vanzin/examples-deps and squashes the following commits:

12c258e [Marcelo Vanzin] [minor] [examples] Avoid re-packaging unneeded classes.

(cherry picked from commit 470d745)
(cherry picked from commit 5dcfaf2)