[SPARK-6491] Spark will put the current working dir to the CLASSPATH #5156
Conversation
|
Can one of the admins verify this patch? |
bin/compute-classpath.sh (comment on an outdated diff)
So, specifically this is taking care of the case where SPARK_SUBMIT_CLASSPATH is empty or not set? Otherwise I think it would be equivalent, but yeah, we should handle that case and not add empty elements.
However your change here also puts the conf directories first, rather than last, on the classpath. I don't know if that was intentional but I suppose it would be better to not change that here if the problem being solved is just an empty classpath element.
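For reference, the commit log below mentions an `appendToClasspath` helper; here is a minimal sketch of such a guard, assuming the script's existing `CLASSPATH`, `SPARK_SUBMIT_CLASSPATH`, and `FWDIR` variables (the body is an illustrative assumption, not the merged patch):

```bash
# Append an entry to CLASSPATH, inserting ":" only between two non-empty
# values, so no empty element (and hence no implicit current working dir)
# is ever added.
appendToClasspath() {
  if [ -n "$1" ]; then
    if [ -n "$CLASSPATH" ]; then
      CLASSPATH="$CLASSPATH:$1"
    else
      CLASSPATH="$1"
    fi
  fi
}

# Called in whatever order the script originally used, e.g.:
appendToClasspath "$SPARK_SUBMIT_CLASSPATH"
appendToClasspath "$FWDIR/conf"
```

Guarding inside the helper keeps the entry order identical to the string-concatenation version while dropping only the empty elements.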
|
ok to test |
|
Test build #29081 has started for PR 5156 at commit
|
|
Test build #29081 has finished for PR 5156 at commit
|
|
Test PASSed. |
|
Test build #29092 has started for PR 5156 at commit
|
|
@srowen I've pushed another commit and kept the original order of the classpath. |
|
Test build #29092 has finished for PR 5156 at commit
|
|
Test PASSed. |
|
Bash changes LGTM. ⭐ |
|
The launcher lib in master shouldn't have this problem (see AbstractCommandBuilder::addToClassPath; the classpath is treated as a list, not as a string). |
|
@srowen Please review. |
|
Test build #29129 has started for PR 5156 at commit
|
0656eeb to 5ae214f
|
Test build #29136 has started for PR 5156 at commit
|
|
Test build #29129 has finished for PR 5156 at commit
|
|
Test PASSed. |
|
Test build #29136 has finished for PR 5156 at commit
|
|
Test PASSed. |
|
I like this, especially how much it reduces duplication. Obviously this can go into 1.3 at the latest. If there's little or no conflict, I think it can be merged back to 1.2 too. Let me leave it a short while longer for comments. |
|
Just to confirm my understanding: this makes it so that Spark no longer puts the current working dir on the classpath? Or it changes Spark to now include the current working dir? |
|
This takes the current working dir off the classpath, since it appears to be there only accidentally: weirdly, "java -cp :foo.jar" (note the leading empty entry) puts the current dir on the classpath.
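A quick way to observe this behavior (using a hypothetical `Main` class compiled into the current directory; `foo.jar` does not even need to exist):

```bash
$ cat Main.java
public class Main {
    public static void main(String[] args) { System.out.println("Hello"); }
}
$ javac Main.java
$ java -cp "foo.jar" Main     # Main lives only in the CWD, so this fails
Error: Could not find or load main class Main
$ java -cp ":foo.jar" Main    # the empty first entry resolves to "."
Hello
```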
When running "bin/compute-classpath.sh", the output will be:

:/spark/conf:/spark/assembly/target/scala-2.10/spark-assembly-1.3.0-hadoop2.5.0-cdh5.2.0.jar:/spark/lib_managed/jars/datanucleus-rdbms-3.2.9.jar:/spark/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar:/spark/lib_managed/jars/datanucleus-core-3.2.10.jar

Java will add the current working dir to the CLASSPATH if the first ":" exists, which is not expected by Spark users. For example, if spark-shell is called from the folder /root and a "core-site.xml" exists under /root/, Spark will use that file as the Hadoop conf file, even if HADOOP_CONF_DIR=/etc/hadoop/conf is already set.

Author: guliangliang <[email protected]>

Closes #5156 from marsishandsome/Spark6491 and squashes the following commits:

5ae214f [guliangliang] use appendToClasspath to change CLASSPATH
b21f3b2 [guliangliang] keep the classpath order
5d1f870 [guliangliang] [SPARK-6491] Spark will put the current working dir to the CLASSPATH
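For the Hadoop conf example from the description, the failure mode can be sketched like this (paths as given above; behavior before the fix):

```bash
cd /root                                # a stray core-site.xml lives here
export HADOOP_CONF_DIR=/etc/hadoop/conf
spark-shell                             # CLASSPATH began with ":", so "."
                                        # (/root) was searched first and
                                        # /root/core-site.xml shadowed the
                                        # files under $HADOOP_CONF_DIR
```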
|
@marsishandsome could you close this PR? since it wasn't opened vs |