[SPARK-6491] Spark will put the current working dir to the CLASSPATH #5156

marsishandsome · 2015-03-24T06:56:45Z

When running "bin/compute-classpath.sh", the output is:
:/spark/conf:/spark/assembly/target/scala-2.10/spark-assembly-1.3.0-hadoop2.5.0-cdh5.2.0.jar:/spark/lib_managed/jars/datanucleus-rdbms-3.2.9.jar:/spark/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar:/spark/lib_managed/jars/datanucleus-core-3.2.10.jar
Because the output starts with ":", the first classpath element is empty, and Java treats it as the current working dir and adds that to the CLASSPATH, which Spark users do not expect.
For example, if I run spark-shell from the folder /root and a "core-site.xml" exists under /root/, Spark will use this file as the Hadoop conf file, even if I have already set HADOOP_CONF_DIR=/etc/hadoop/conf.
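
A minimal sketch of how a leading ":" can arise, assuming the script builds the string by plain concatenation with a variable that may be empty. The FWDIR value and the concatenation line are illustrative assumptions, not copied from bin/compute-classpath.sh:

```shell
#!/usr/bin/env bash
# Illustrative only: FWDIR and the concatenation below are assumptions,
# not the actual contents of bin/compute-classpath.sh.
FWDIR="/spark"
SPARK_SUBMIT_CLASSPATH=""   # empty (or unset) in the failing case

# Plain concatenation leaves an empty first element before the ':'.
CLASSPATH="$SPARK_SUBMIT_CLASSPATH:$FWDIR/conf"
echo "$CLASSPATH"           # prints ":/spark/conf"
```

Passing a string like this to "java -cp" is what brings the current working dir into play.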

AmplabJenkins · 2015-03-24T06:57:11Z

Can one of the admins verify this patch?

srowen · 2015-03-24T10:40:42Z

bin/compute-classpath.sh

So, specifically this is taking care of the case where SPARK_SUBMIT_CLASSPATH is empty or not set? Otherwise I think it would be equivalent, but yeah, we should handle that case and not add empty elements.

However your change here also puts the conf directories first, rather than last, on the classpath. I don't know if that was intentional but I suppose it would be better to not change that here if the problem being solved is just an empty classpath element.

srowen · 2015-03-24T10:41:18Z

ok to test

SparkQA · 2015-03-24T10:42:43Z

Test build #29081 has started for PR 5156 at commit 35c25d4.

This patch merges cleanly.

SparkQA · 2015-03-24T12:30:58Z

Test build #29081 has finished for PR 5156 at commit 35c25d4.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

AmplabJenkins · 2015-03-24T12:31:02Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29081/
Test PASSed.

SparkQA · 2015-03-24T14:27:40Z

Test build #29092 has started for PR 5156 at commit 3e859f9.

This patch merges cleanly.

marsishandsome · 2015-03-24T14:31:18Z

@srowen I've pushed another commit that keeps the original order of the classpath.
A helper function "addClassPath()" is used to avoid code duplication.
I'm happy to modify the other code in this file to use "addClassPath" as well, if you think that's OK.
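
For readers without the diff at hand, a helper of this kind might look roughly like the sketch below. It is written from the description above and is not copied from the patch; the paths and the empty SPARK_SUBMIT_CLASSPATH value are assumptions chosen for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper sketched from the description; the real patch may differ.
CLASSPATH=""
SPARK_SUBMIT_CLASSPATH=""   # assumed empty, as in the reported case

addClassPath() {
  # Ignore empty arguments so that no empty element is ever added.
  if [ -n "$1" ]; then
    if [ -n "$CLASSPATH" ]; then
      CLASSPATH="$CLASSPATH:$1"   # later entries are appended after a ':'
    else
      CLASSPATH="$1"              # first entry: no leading ':'
    fi
  fi
}

addClassPath "$SPARK_SUBMIT_CLASSPATH"   # empty, so this is a no-op
addClassPath "/spark/conf"               # assumed path
addClassPath "/spark/lib/app.jar"        # assumed path
echo "$CLASSPATH"                        # prints "/spark/conf:/spark/lib/app.jar"
```

Because each call only appends, the order of the calls is the order of the classpath, so the original ordering can be kept.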

SparkQA · 2015-03-24T16:15:05Z

Test build #29092 has finished for PR 5156 at commit 3e859f9.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

AmplabJenkins · 2015-03-24T16:15:09Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29092/
Test PASSed.

srowen · 2015-03-24T16:30:16Z

I like this. @nchammas for bash thoughts.
@vanzin does the new launcher have any issue of this form? It replaces the file being changed.
There are many places in this file that could use the new function, which might be more consistent. Maybe call it appendToClasspath to be clearer about where it adds.

nchammas · 2015-03-24T16:35:07Z

Bash changes LGTM. ⭐

vanzin · 2015-03-24T17:05:01Z

The launcher lib in master shouldn't have this problem (see AbstractCommandBuilder::addToClassPath; classpath is treated as a list, not as a string.)
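
To illustrate the "list, not string" point in the language of the script under review, here is a bash analogue. It is only a sketch of the idea and says nothing about how AbstractCommandBuilder is actually written; the function name and paths are made up for the example:

```shell
#!/usr/bin/env bash
# Sketch only: a bash analogue of keeping the classpath as a list.
# This is not the launcher's Java code; names and paths are hypothetical.
entries=()

add_entry() {
  # Skip empty values instead of storing them.
  if [ -n "$1" ]; then
    entries+=("$1")
  fi
}

add_entry ""                     # e.g. an unset variable; nothing is stored
add_entry "/spark/conf"
add_entry "/spark/lib/app.jar"

# Join once at the end; "${entries[*]}" joins on the first character of IFS.
CLASSPATH="$(IFS=:; echo "${entries[*]}")"
echo "$CLASSPATH"                # prints "/spark/conf:/spark/lib/app.jar"
```

Since separators are only introduced at join time, an empty element cannot appear unless an empty string is stored in the list.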

marsishandsome · 2015-03-25T00:16:25Z

@srowen Please review.

SparkQA · 2015-03-25T00:18:27Z

Test build #29129 has started for PR 5156 at commit 0656eeb.

This patch does not merge cleanly.

SparkQA · 2015-03-25T01:58:27Z

Test build #29136 has started for PR 5156 at commit 5ae214f.

This patch merges cleanly.

SparkQA · 2015-03-25T02:15:13Z

Test build #29129 has finished for PR 5156 at commit 0656eeb.

This patch passes all tests.
This patch does not merge cleanly.
This patch adds no public classes.

AmplabJenkins · 2015-03-25T02:15:17Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29129/
Test PASSed.

SparkQA · 2015-03-25T03:44:31Z

Test build #29136 has finished for PR 5156 at commit 5ae214f.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

AmplabJenkins · 2015-03-25T03:44:35Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29136/
Test PASSed.

srowen · 2015-03-25T16:26:04Z

I like this, especially how much it reduces duplication. Obviously this can go into 1.3 at the latest. If there's little or no conflict, I think it can be merged back to 1.2 too. Let me leave it a short while longer for comments.

sryza · 2015-03-25T18:19:12Z

Just to confirm my understanding: this makes it so that Spark no longer puts the current working dir on the classpath? Or it changes Spark to now include the current working dir?

srowen · 2015-03-25T19:54:43Z

This takes the current working dir off of the classpath, since it appears to be there only accidentally, because weirdly "java -cp :foo.jar" (note the leading empty entry) does this.
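
A small check for this condition could be sketched as follows. The function name is hypothetical, and the sketch also flags trailing and doubled colons on the assumption that Java handles those empty entries the same way as a leading one:

```shell
#!/usr/bin/env bash
# Hypothetical check: succeeds if a ':'-separated classpath has an empty element.
has_empty_entry() {
  # Wrapping the value in ':' turns a leading, trailing, or doubled colon
  # into "::". An entirely empty string also counts as an empty element.
  case ":$1:" in
    *::*) return 0 ;;
    *)    return 1 ;;
  esac
}

if has_empty_entry ":foo.jar"; then
  echo "empty entry found"       # this branch runs for ":foo.jar"
fi
```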

When running "bin/compute-classpath.sh", the output will be:
:/spark/conf:/spark/assembly/target/scala-2.10/spark-assembly-1.3.0-hadoop2.5.0-cdh5.2.0.jar:/spark/lib_managed/jars/datanucleus-rdbms-3.2.9.jar:/spark/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar:/spark/lib_managed/jars/datanucleus-core-3.2.10.jar
Java will add the current working dir to the CLASSPATH, if the first ":" exists, which is not expected by spark users.
For example, if I call spark-shell in the folder /root. And there exists a "core-site.xml" under /root/. Spark will use this file as HADOOP CONF file, even if I have already set HADOOP_CONF_DIR=/etc/hadoop/conf.

Author: guliangliang <[email protected]>

Closes #5156 from marsishandsome/Spark6491 and squashes the following commits:

5ae214f [guliangliang] use appendToClasspath to change CLASSPATH
b21f3b2 [guliangliang] keep the classpath order
5d1f870 [guliangliang] [SPARK-6491] Spark will put the current working dir to the CLASSPATH

srowen · 2015-03-27T13:31:19Z

@marsishandsome could you close this PR? Since it wasn't opened against master, the automatic process does not close it. It was merged into branch-1.3.

srowen reviewed Mar 24, 2015

guliangliang added 3 commits March 25, 2015 09:42

[SPARK-6491] Spark will put the current working dir to the CLASSPATH

5d1f870

keep the classpath order

b21f3b2

use appendToClasspath to change CLASSPATH

5ae214f

marsishandsome force-pushed the Spark6491 branch from 0656eeb to 5ae214f Compare March 25, 2015 01:55

marsishandsome closed this Mar 27, 2015

7 participants