-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-1403] Move the class loader creation back to where it was in 0.9.0 #322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Can one of the admins verify this patch? |
|
After looking at the code a bit more, I see that the code to setContextClassLoader does not use SecurityManager AFAIS. createClassLoader is creating a File object. addReplClassLoaderIfNeeded is dynamically loading a class file. I dont see any use of SecurityManager in these two methods. I am not sure if it gets used implicitly. |
|
This patch also adds back a line that we earlier removed: Does your fix require that line to work? If so, what is the error if you remove it? |
|
@ueshin removed it in SPARK-1210: I proposed a more conservative change in this comment: But we ultimately went ahead and just removed it because we couldn't see a case where it was necessary... |
|
I am testing out if removing the last line is ok. |
|
I can confirm that the third line is needed. Without that line I see the same failure as earlier. |
|
Which exact failure is it? Could you post the stack trace? |
|
Hi, I mistakenly thought that |
|
@ueshin - ah cool. Do you mind giving a code snippet of what that would look like? Then @manku-timma can see if it fixes the problem. Probably is a better solution... |
|
The actual error is below. Line numbers might be a bit off due to some of my code. |
|
From my limited understanding, Executor is used by the mesos backend and the standalone backend. If mesos backend has its own class loader, will that be enough? Does the standalone backend already have its own class loader? |
|
Put the following line at the beginning of |
|
@manku-timma Yes, the standalone backend |
|
BTW, we might have to restore the context class loader of |
|
@ueshin, your one line change works for me. Let me know if you want me to test any of the restore code. Just for my understanding, the flow after your one-line fix:
|
|
@manku-timma, I just wondered that we should restore the class loader in val cl = Thread.currentThread.getContextClassLoader
try {
Thread.currentThread.setContextClassLoader(getClass.getClassLoader)
logInfo("Registered with Mesos as executor ID " + executorInfo.getExecutorId.getValue)
this.driver = driver
val properties = Utils.deserialize[Array[(String, String)]](executorInfo.getData.toByteArray)
executor = new Executor(
executorInfo.getExecutorId.getValue,
slaveInfo.getHostname,
properties)
} finally {
Thread.currentThread.setContextClassLoader(cl)
}I think this is better than before. The http based class loader for The flow you wrote would be:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need one more line here
Thread.currentThread.setContextClassLoader(getClass.getClassLoader)|
Oops. Added that line. I am facing this error in the current git tree |
|
Let me know if there is any other change I need to make. I have tested after merging from master and things look fine. This is good to be merged from my end. |
|
LGTM about the class loader issue. |
|
I see that PR 334 made the java 6 change. So I reverted mine. |
|
Hey @ueshin @manku-timma - as a simpler fix, would it be possible to just change the way SparkEnv captures the class loader? I think it was probably just an oversight when that was added. If there is not a context class loader, then it should just load the bootstrap class loader: |
|
@pwendell: I tested your fix to SparkEnv.scala (after reverting my earlier change). It does not work. SparkEnv's loader turns out to be My patch to |
|
I guess what I don't understand about the current fix is, where is the context class loader getting changed from the boostrap class loader in the first place? The current approach is to capture the class loader of MesosExecutorDriver and set it to the context class loader and then set it back. Isn't this also just the |
|
Another way of putting my question. Currently there is a line: What is the actual classloader being set here... isn't it just the |
|
@pwendell: You are right. Actually Looks like classes directly loaded by SparkEnv get to use the right classloader since it is passed as an argument. But classes loaded at the second level (like in BroadcastManager above) still use the null classloader and fail. |
|
@ueshin you said before that " But I think in this case we'd expect the bootstrap classloader to know about @manku-timma I looked more and the reason this doesn't work is that it looks like other parts of the code don't directly use the |
|
Ah I see, @ueshin you are right. It's the bootstrap class loader and it won't have any spark definitions. I was mixing this up with the system class loader. We should definitely clean this up. The behavior we want in every case is to either use the context class loader (if present) and if not use the classloader that loads spark classes (e.g. the system classloader). |
|
@pwendell Yes, the bootstrap class loader knows only core Java APIs and the Spark classes (specified by |
|
So the current fix looks fine? |
SPARK-1009 Updated MLlib docs to show how to use it in Python In addition added detailed examples for regression, clustering and recommendation algorithms in a separate Scala section. Fixed a few minor issues with existing documentation.
|
@pwendell: Do you plan to pick this up for 1.0? Is there anything more I need to do? |
|
Jenkins, retest this please. I'm fine to have this is a workaround for 1.0. I'll make a JIRA describing the broader issue and we can fix it for 1.1. |
|
Jenkins, retest this please. |
|
Merged build triggered. |
|
Merged build started. |
|
Merged build finished. All automated tests passed. |
|
All automated tests passed. |
|
Thanks for the fix - merged into master and 1.0 |
….9.0 [SPARK-1403] I investigated why spark 0.9.0 loads fine on mesos while spark 1.0.0 fails. What I found was that in SparkEnv.scala, while creating the SparkEnv object, the current thread's classloader is null. But in 0.9.0, at the same place, it is set to org.apache.spark.repl.ExecutorClassLoader . I saw that 7edbea4 moved it to it current place. I moved it back and saw that 1.0.0 started working fine on mesos. I just created a minimal patch that allows me to run spark on mesos correctly. It seems like SecurityManager's creation needs to be taken into account for a correct fix. Also moving the creation of the serializer out of SparkEnv might be a part of the right solution. PTAL. Author: Bharath Bhushan <[email protected]> Closes #322 from manku-timma/spark-1403 and squashes the following commits: 606c2b9 [Bharath Bhushan] Merge remote-tracking branch 'upstream/master' into spark-1403 ec8f870 [Bharath Bhushan] revert the logger change for java 6 compatibility as PR 334 is doing it 728beca [Bharath Bhushan] Merge remote-tracking branch 'upstream/master' into spark-1403 044027d [Bharath Bhushan] fix compile error 6f260a4 [Bharath Bhushan] Merge remote-tracking branch 'upstream/master' into spark-1403 b3a053f [Bharath Bhushan] Merge remote-tracking branch 'upstream/master' into spark-1403 04b9662 [Bharath Bhushan] add missing line 4803c19 [Bharath Bhushan] Merge remote-tracking branch 'upstream/master' into spark-1403 f3c9a14 [Bharath Bhushan] Merge remote-tracking branch 'upstream/master' into spark-1403 42d3d6a [Bharath Bhushan] used code fragment from @ueshin to fix the problem in a better way 89109d7 [Bharath Bhushan] move the class loader creation back to where it was in 0.9.0 (cherry picked from commit ca11919) Signed-off-by: Patrick Wendell <[email protected]>
….9.0 [SPARK-1403] I investigated why spark 0.9.0 loads fine on mesos while spark 1.0.0 fails. What I found was that in SparkEnv.scala, while creating the SparkEnv object, the current thread's classloader is null. But in 0.9.0, at the same place, it is set to org.apache.spark.repl.ExecutorClassLoader . I saw that apache@7edbea4 moved it to it current place. I moved it back and saw that 1.0.0 started working fine on mesos. I just created a minimal patch that allows me to run spark on mesos correctly. It seems like SecurityManager's creation needs to be taken into account for a correct fix. Also moving the creation of the serializer out of SparkEnv might be a part of the right solution. PTAL. Author: Bharath Bhushan <[email protected]> Closes apache#322 from manku-timma/spark-1403 and squashes the following commits: 606c2b9 [Bharath Bhushan] Merge remote-tracking branch 'upstream/master' into spark-1403 ec8f870 [Bharath Bhushan] revert the logger change for java 6 compatibility as PR 334 is doing it 728beca [Bharath Bhushan] Merge remote-tracking branch 'upstream/master' into spark-1403 044027d [Bharath Bhushan] fix compile error 6f260a4 [Bharath Bhushan] Merge remote-tracking branch 'upstream/master' into spark-1403 b3a053f [Bharath Bhushan] Merge remote-tracking branch 'upstream/master' into spark-1403 04b9662 [Bharath Bhushan] add missing line 4803c19 [Bharath Bhushan] Merge remote-tracking branch 'upstream/master' into spark-1403 f3c9a14 [Bharath Bhushan] Merge remote-tracking branch 'upstream/master' into spark-1403 42d3d6a [Bharath Bhushan] used code fragment from @ueshin to fix the problem in a better way 89109d7 [Bharath Bhushan] move the class loader creation back to where it was in 0.9.0
[SPARK-1403] I investigated why spark 0.9.0 loads fine on mesos while spark 1.0.0 fails. What I found was that in SparkEnv.scala, while creating the SparkEnv object, the current thread's classloader is null. But in 0.9.0, at the same place, it is set to org.apache.spark.repl.ExecutorClassLoader . I saw that 7edbea4 moved it to it current place. I moved it back and saw that 1.0.0 started working fine on mesos.
I just created a minimal patch that allows me to run spark on mesos correctly. It seems like SecurityManager's creation needs to be taken into account for a correct fix. Also moving the creation of the serializer out of SparkEnv might be a part of the right solution. PTAL.