[SPARK-24137][K8s] Mount local directories as empty dir volumes. #21238
Mounts Spark's local directories as emptyDir volumes. This dramatically improves performance and keeps Spark applications from failing when they write too much data to the Docker image's own file system. The file system directories that back emptyDir volumes are generally larger and more performant.

@foxish @liyinan926 please take a look, thanks!

Test build #90217 has finished for PR 21238 at commit

Kubernetes integration test starting

Kubernetes integration test status success

Test build #90225 has finished for PR 21238 at commit

Seems like it addresses a similar problem to #21095. It might be worth investigating how to unify both.

@andrusha I don't think it's entirely analogous - for the simple reason that the hostPath volumes PR doesn't take into account

Also, #21260 currently only supports hostPath and PVCs, but you definitely want emptyDir for isolation (though that looks like a trivial enough change).

  if (contains("spark.local.dir")) {
    val msg = "In Spark 1.0 and later spark.local.dir will be overridden by the value set by " +
-     "the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN)."
+     "the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone/kubernetes and LOCAL_DIRS" +
oops, I deleted a comment here accidentally. @rxin said that we could remove this warning about Spark 1.0.
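
The warning text in this hunk describes a precedence rule: a value supplied by the cluster manager through `SPARK_LOCAL_DIRS` overrides `spark.local.dir`. A minimal sketch of that rule follows. It assumes plain maps stand in for the environment and the Spark conf. The `resolveLocalDirs` helper and its `fallback` parameter are hypothetical names for illustration, not the code in this PR.

```scala
object LocalDirResolution {
  // Hypothetical helper: the environment value wins over the conf value,
  // and a caller-supplied fallback is used when neither is set.
  def resolveLocalDirs(
      env: Map[String, String],
      conf: Map[String, String],
      fallback: String): Seq[String] =
    env.get("SPARK_LOCAL_DIRS")
      .orElse(conf.get("spark.local.dir"))
      .getOrElse(fallback)
      .split(",")          // both settings accept a comma-separated list
      .map(_.trim)
      .filter(_.nonEmpty)
      .toSeq
}
```

With both values set, only the directories from `SPARK_LOCAL_DIRS` are returned, which is the behavior the warning tells users to expect.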

I agree with @mccheah that the potential code reuse is small. Keeping this as a separate pod construction step, decoupled from the user-exposed step, is cleaner.

    val localDirVolumes = resolvedLocalDirs
      .zipWithIndex
      .map {
        case (localDir, index) =>
I think the convention is to put `case` on the same line as `map {`.
      .map {
        case (localDir, index) =>
          new VolumeBuilder()
            .withName(s"spark-local-dir-${index + 1}-${Paths.get(localDir).getFileName.toString}")
Do you really need to include the actual path in the volume name? I think `spark-local-dir-${index + 1}` is sufficient.
    val localDirVolumeMounts = localDirVolumes
      .zip(resolvedLocalDirs)
      .map {
        case (localDirVolume, localDirPath) =>
Ditto.
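
Taken together, the review comments in this thread point at a shape like the sketch below: `case` on the same line as `map {`, and volume names built from the index alone. This is only an illustration. `EmptyDirVolume` and `VolumeMount` are hypothetical stand-in case classes so the snippet runs without the fabric8 client, and `LocalDirVolumes` is an invented wrapper object. The merged code uses fabric8 builders instead.

```scala
// Stand-in types (assumptions) replacing the fabric8 Volume and VolumeMount models.
final case class EmptyDirVolume(name: String)
final case class VolumeMount(name: String, mountPath: String)

object LocalDirVolumes {
  // One emptyDir-style volume per local directory, named by position only.
  def volumesFor(resolvedLocalDirs: Seq[String]): Seq[EmptyDirVolume] =
    resolvedLocalDirs.zipWithIndex.map { case (_, index) =>
      EmptyDirVolume(s"spark-local-dir-${index + 1}")
    }

  // Each volume is mounted at the local directory path it was created for.
  def mountsFor(
      volumes: Seq[EmptyDirVolume],
      resolvedLocalDirs: Seq[String]): Seq[VolumeMount] =
    volumes.zip(resolvedLocalDirs).map { case (volume, localDirPath) =>
      VolumeMount(volume.name, localDirPath)
    }
}
```

Dropping the path from the name keeps volume names short and predictable, while the mount path still carries the actual directory.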

@rxin @liyinan926 @foxish addressed comments.

Kubernetes integration test starting

Kubernetes integration test status success

liyinan926
left a comment
LGTM.

Test build #90431 has finished for PR 21238 at commit

Test build #90432 has finished for PR 21238 at commit

Requesting sign off and merge from @foxish

LGTM. Merging to master.

Maintenance releases most often have fixes for stability. We could maybe backport this since it's not a new feature but an omission from before. If it is going to be some effort, thanks to all the refactors that went in so far, we should think twice about whether we need to.

@mccheah, wdyt? I just haven't heard from any users here of 2.3 - if you think it's useful for 2.3.1 and low risk, then please feel free to propose a cherrypick.

I think we can afford to hold off here.

What would make this difficult to backport is the fact that this patch was built on top of the big refactor PR that only went in after 2.3. So we'd need to rewrite this with the old architecture which is a non-trivial effort.

SG. @liyinan926, let's revisit this if we hear from 2.3 users.

Makes sense to me.

Drastically improves performance and won't cause Spark applications to fail because they write too much data to the Docker image's specific file system. The file system's directories that back emptydir volumes are generally larger and more performant.

Has been in use via the prototype version of Kubernetes support, but lost in the transition to here.

Author: mcheah <[email protected]>

Closes apache#21238 from mccheah/mount-local-dirs.

This PR continues #21095 and intersects with #21238. I've added volume mounts as a separate step and added PersistentVolumeClaim support.

There is a fundamental problem with how we pass the options through the Spark conf to fabric8. For each volume type and every possible volume option, we would have to implement custom code to map config values to fabric8 calls. This would result in a large body of code to maintain and means that Spark would always be somewhat out of sync with k8s.

I think there needs to be a discussion on how to proceed correctly (e.g. use PodPreset instead).

----

Due to the complications of provisioning and managing actual resources, this PR addresses only volume mounting of already present resources.

----

- [x] emptyDir support
- [x] Testing
- [x] Documentation
- [x] KubernetesVolumeUtils tests

Author: Andrew Korzhuev <[email protected]>
Author: madanadit <[email protected]>

Closes #21260 from andrusha/k8s-vol.

What changes were proposed in this pull request?
Mounts Spark's local directories as emptyDir volumes. This drastically improves performance and keeps Spark applications from failing when they write too much data to the Docker image's own file system. The file system directories that back emptyDir volumes are generally larger and more performant.
How was this patch tested?
This has been in use in the prototype version of Kubernetes support, but was lost in the transition to this repository.