[SPARK-36599][CORE] Fix the http class server to work again #33849
Conversation
Can one of the admins verify this patch?
    val parentLoader = new ParentClassLoader(parent)

    // Allows HTTP connect and read timeouts to be controlled for testing / debugging purposes
    private[repl] var httpUrlConnectionTimeoutMillis: Int = -1
I agree this is unused
      private def getClassFileInputStreamFromFileSystem(fileSystem: FileSystem)(
          pathInDirectory: String): InputStream = {
    -   val path = new Path(directory, pathInDirectory)
    +   val path = new Path(new Path(uri), pathInDirectory)
Can you explain a little more what inputs fail without this change?
The Hadoop HTTP filesystem requires paths to be fully qualified URLs.
It calls path.toUri.toURL, which fails in our case because the Path is not fully qualified.
So the class loader doesn't work if the class URI is http://..../. I raised a PR on Hadoop too,
apache/hadoop#3338
But this is a regression in Spark: it used to work with a very specific implementation for HTTP-based class servers, and now that it uses the FileSystem API for everything, it no longer works.
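Below is a minimal sketch (not code from the PR; the host, port, and class file names are made up) of why the unqualified Path breaks and how qualifying it against the class-server URI fixes it:

```scala
import java.net.URI
import org.apache.hadoop.fs.Path

object PathQualificationSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical class-server URI and class file, for illustration only.
    val uri = "http://driver-host:50404/classes"
    val pathInDirectory = "com/example/Foo.class"

    // Before the fix: the parent is only the path component, so the scheme and
    // authority are lost. toUri yields a relative URI, and calling toURL on it
    // throws because the URI is not absolute -- where the HTTP filesystem fails.
    val unqualified = new Path(new URI(uri).getPath, pathInDirectory)
    println(unqualified.toUri) // /classes/com/example/Foo.class

    // After the fix: wrapping the full URI in a Path keeps scheme and authority,
    // so the resulting path converts to a valid http:// URL.
    val qualified = new Path(new Path(new URI(uri)), pathInDirectory)
    println(qualified.toUri) // http://driver-host:50404/classes/com/example/Foo.class
  }
}
```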
This hasn't been working since 5085739.
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
HTTP-based class servers no longer work because Spark switched to a Hadoop FileSystem-based implementation for fetching classes, and the Hadoop HTTP filesystem only accepts fully qualified paths.
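For context, here is a hedged sketch of what the FileSystem-based fetch looks like with the fix applied. The object and method names are illustrative, not the actual ExecutorClassLoader code:

```scala
import java.io.InputStream
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object HttpClassFetchSketch {
  // Opens a class file served by an HTTP class server through the Hadoop
  // FileSystem API, mirroring the pattern in the diff above.
  def open(classUri: String, pathInDirectory: String): InputStream = {
    val conf = new Configuration()
    // FileSystem.get selects the implementation from the URI scheme, e.g. the
    // Hadoop HTTP filesystem when the class-server URI uses http://.
    val fs = FileSystem.get(new URI(classUri), conf)
    // Qualify the child path against the full URI so the filesystem receives
    // an absolute http:// path it can turn into a URL.
    val path = new Path(new Path(new URI(classUri)), pathInDirectory)
    fs.open(path)
  }
}
```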