[SPARK-22962][K8S] Fail fast if submission client local files are used #20320
jiangxb1987
left a comment
LGTM only some nits.
.map(_.split(","))
.getOrElse(Array.empty[String])

if (existSubmissionLocalFiles(sparkJars) || existSubmissionLocalFiles(sparkFiles)) {
We can add a TODO here if this is planned to be supported in the future.
Done, and also created https://issues.apache.org/jira/browse/SPARK-23153.
if (existSubmissionLocalFiles(sparkJars) || existSubmissionLocalFiles(sparkFiles)) {
  throw new SparkException("The Kubernetes mode does not yet support application " +
    "dependencies local to the submission client. It currently only allows application" +
nit: extra space at the end of the line
Done.
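For readers following the thread, the check under review can be sketched roughly as below. This is a minimal sketch and not the code from the patch: `LocalFileCheck` and `isSubmissionLocal` are hypothetical names, and it assumes that a URI with the `file` scheme, or with no scheme at all, refers to the machine running `spark-submit`.

```scala
import java.net.URI

object LocalFileCheck {
  // Assumption: a well-formed URI string whose scheme is "file", or that has
  // no scheme at all, points at the submission client's file system.
  // java.net.URI#getScheme returns null when no scheme is present.
  def isSubmissionLocal(uri: String): Boolean = {
    val scheme = Option(new URI(uri).getScheme).getOrElse("file")
    scheme == "file"
  }

  // True if at least one entry is local to the submission client.
  def existSubmissionLocalFiles(files: Seq[String]): Boolean =
    files.exists(isSubmissionLocal)
}
```

Under these assumptions, a bare path such as `/tmp/app.jar` is flagged, while `local://`, `https://`, and `hdfs://` URIs pass through. Malformed strings (for example, paths containing spaces) would make `new URI` throw, which this sketch does not handle.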
Kubernetes integration test starting
foxish
left a comment
+1 on docs change for 2.3 modulo minor comments.
Code changes are OK if we also re-validate with manual tests against HTTP (with the caveat that our integration tests do not test this correctly yet), GCS, and HDFS. I'm also OK with dropping the code change for 2.3, since it's a usability improvement rather than a functionality improvement.
docs/running-on-kubernetes.md
Outdated
`SPARK_EXTRA_CLASSPATH` environment variable in your Dockerfiles.
`SPARK_EXTRA_CLASSPATH` environment variable in your Dockerfiles. The `local://` scheme is also required when referring to
dependencies in custom-built Docker images in `spark-submit`. Note that using application dependencies local to the submission
client is currently not yet supported.
"application dependencies from the local file system"?
Done.
by their appropriate remote URIs. Also, application dependencies can be pre-mounted into custom-built Docker images.
Those dependencies can be added to the classpath by referencing them with `local://` URIs and/or setting the
`SPARK_EXTRA_CLASSPATH` environment variable in your Dockerfiles.
`SPARK_EXTRA_CLASSPATH` environment variable in your Dockerfiles. The `local://` scheme is also required when referring to
The point above already covers that local:// is needed with custom-built images.
Yes, but that's about adding to the classpath. I wanted to make it more specific and clearer.
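The distinction being discussed in the docs (remote URIs, `local://` URIs for files baked into the image, and files on the submission client) can be illustrated with a small classifier. This is only a sketch under stated assumptions: the type and object names are hypothetical, the example paths are made up, and treating every scheme other than `local` and `file` as "remote" is a simplification.

```scala
import java.net.URI

// Hypothetical categories for where a dependency lives.
sealed trait DependencyLocation
case object InContainerImage extends DependencyLocation // local:// URIs
case object SubmissionClient extends DependencyLocation // file:// or no scheme
case object Remote extends DependencyLocation           // any other scheme

object DependencyUris {
  // Classify a well-formed URI string by its scheme alone.
  def classify(uri: String): DependencyLocation =
    Option(new URI(uri).getScheme) match {
      case Some("local")       => InContainerImage
      case Some("file") | None => SubmissionClient
      case Some(_)             => Remote
    }
}
```

With this reading, only the `SubmissionClient` category is the case the docs describe as not yet supported.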
Kubernetes integration test status success
Test build #86354 has finished for PR 20320 at commit
Regarding manual tests, our integration tests cover
Kubernetes integration test starting
Test build #86357 has finished for PR 20320 at commit
Kubernetes integration test status success
@foxish Manual tests to verify that using
foxish
left a comment
Question: does referencing a local file as the main app resource (/home/../spark-examples.jar, for example) need to be dealt with separately?
.map(_.split(","))
.getOrElse(Array.empty[String])

// TODO(SPARK-23153): remote once submission client local dependencies are supported.
s/remote/remove/
Done.
// TODO(SPARK-23153): remote once submission client local dependencies are supported.
if (existSubmissionLocalFiles(sparkJars) || existSubmissionLocalFiles(sparkFiles)) {
  throw new SparkException("The Kubernetes mode does not yet support application " +
I'd shorten this to just "Kubernetes mode does not support referencing application dependencies in the local file system".
Done.
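Putting the pieces of this thread together, the fail-fast step with the shortened message might look roughly like the following. This is a hedged sketch rather than the merged code: a plain `Map` stands in for Spark's configuration object, `IllegalArgumentException` stands in for `SparkException` to keep the example self-contained, and `SubmissionValidation`, `entries`, and `isClientLocal` are hypothetical names.

```scala
import java.net.URI

object SubmissionValidation {
  // Split a comma-separated option into its entries, or return an empty
  // array when the option is unset. Blank entries are dropped.
  private def entries(conf: Map[String, String], key: String): Array[String] =
    conf.get(key)
      .map(_.split(",").map(_.trim).filter(_.nonEmpty))
      .getOrElse(Array.empty[String])

  // Assumption: "file" scheme or no scheme means local to the submission client.
  private def isClientLocal(uri: String): Boolean =
    Option(new URI(uri).getScheme).forall(_ == "file")

  // Throws before any submission work starts if a client-local dependency
  // is listed under spark.jars or spark.files.
  def validate(conf: Map[String, String]): Unit = {
    val sparkJars = entries(conf, "spark.jars")
    val sparkFiles = entries(conf, "spark.files")
    if ((sparkJars ++ sparkFiles).exists(isClientLocal)) {
      throw new IllegalArgumentException(
        "Kubernetes mode does not support referencing application " +
          "dependencies in the local file system.")
    }
  }
}
```

Failing at validation time means the user sees the error from `spark-submit` itself instead of discovering a missing file later in the driver pod.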
Actually main application jar gets added to
Suspected that it would use the same code path - might be worth a manual test of that as well.
The manual tests I did actually used a main app jar located on GCS and HTTP. To be specific, and for the record, I did the following tests:
Kubernetes integration test starting
Test build #86359 has finished for PR 20320 at commit
LGTM
Kubernetes integration test status success
Merging to master / 2.3.
## What changes were proposed in this pull request?

In the Kubernetes mode, fails fast in the submission process if any submission client local dependencies are used as the use case is not supported yet.

## How was this patch tested?

Unit tests, integration tests, and manual tests.

vanzin foxish

Author: Yinan Li <[email protected]>

Closes #20320 from liyinan926/master.

(cherry picked from commit 5d7c4ba)
Signed-off-by: Marcelo Vanzin <[email protected]>