Conversation

@chenjunjiedada
Contributor

What changes were proposed in this pull request?

This PR supports using hostPath/PV volume mounts as local storage. In KubernetesExecutorBuilder.scala, the LocalDirsFeatureStep is built before MountVolumesFeatureStep, which means volume mounts added by later steps cannot be used. This PR adjusts the order of the feature build steps, moving LocalDirsFeatureStep to the end, so that we can check whether the directories in SPARK_LOCAL_DIRS are set to mounted volumes such as hostPath, PV, or others.
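Conceptually, the reordering can be sketched like this (an abbreviated illustration using the step names from this PR, not the full builder code):

```scala
// Abbreviated sketch: LocalDirsFeatureStep is moved after
// MountVolumesFeatureStep and PodTemplateConfigMapStep, so by the time it
// runs it can see volumes already mounted for SPARK_LOCAL_DIRS entries
// (hostPath, PV, ...) and reuse them instead of creating emptyDir volumes.
val features = Seq(
  new MountVolumesFeatureStep(conf),
  new KerberosConfDriverFeatureStep(conf),
  new PodTemplateConfigMapStep(conf),
  new LocalDirsFeatureStep(conf)) // previously built earlier; now last

// Each step transforms the pod produced by the previous one.
val configuredPod = features.foldLeft(SparkPod.initialPod()) { (pod, step) =>
  step.configurePod(pod)
}
```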

How was this patch tested?

Unit tests

@dongjoon-hyun
Member

Thank you for making a PR, @chenjunjiedada !

@dongjoon-hyun
Member

ok to test

@SparkQA

SparkQA commented Jun 15, 2019

Test build #106542 has finished for PR 24879 at commit 6fca505.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


Member

@felixcheung felixcheung left a comment

could we add some documentation?

new KerberosConfDriverFeatureStep(conf),
new PodTemplateConfigMapStep(conf))
new PodTemplateConfigMapStep(conf),
new LocalDirsFeatureStep(conf))
Member

why not move this right after new MountVolumesFeatureStep(conf)?

Contributor Author

I think some volume mounts may be set up in PodTemplateConfigMapStep, so I moved it to last to ensure that.

Member

Yep, since the intent is for users to provide custom local dir volumes via either mount volumes or pod templates it needs to appear after both

Member

Either is fine, just wanted to point out that the pod template is initialized before any step is executed (i.e. PodTemplateConfigMapStep is not where the template is loaded).

hasVolumeMount(pod, localDirVolume.getName, localDirPath) match {
case true =>
pod.container.getVolumeMounts().asScala
.find(m => m.getName.equals(localDirVolume.getName)
Member

please see other lines for indentation

Contributor Author

ok, will check and run scalafmt.

.endEmptyDir()
.build()
val name = s"spark-local-dir-${index + 1}"
hasVolume(pod, name) match {
Member

not a super big deal, but hasVolume (pod.pod.getSpec().getVolumes().asScala.exists) runs basically the same logic as pod.pod.getSpec().getVolumes().asScala.find

I think you can refactor this to avoid scanning multiple times.
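One way to apply the suggestion (a sketch; `findVolume` is a single-scan helper, and `emptyDirVolume` stands in for the emptyDir volume built just above):

```scala
// Scan the volume list once with find and reuse the resulting Option,
// instead of an exists-style hasVolume check followed by a second lookup.
def findVolume(pod: SparkPod, name: String): Option[Volume] =
  pod.pod.getSpec.getVolumes.asScala.find(_.getName == name)

val volume = findVolume(pod, name) match {
  case Some(existing) => existing   // user-provided volume: reuse it
  case None => emptyDirVolume       // fall back to the emptyDir built above
}
```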

Contributor Author

Updated in the latest commit.

@SparkQA

SparkQA commented Jun 16, 2019

Test build #106543 has finished for PR 24879 at commit ef66c87.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 16, 2019

Test build #106544 has finished for PR 24879 at commit 5610fe4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@chenjunjiedada
Contributor Author

Hi @dongjoon-hyun and @felixcheung,

I updated the code in the latest commits, could you please take a look? Thanks.

@felixcheung
Member

@mccheah

Contributor

@liyinan926 liyinan926 left a comment

LGTM

.build()
val name = s"spark-local-dir-${index + 1}"
findVolume(pod, name) match {
case Some(volume) => volume
Contributor

integration tests should exercise both cases (found the volume or created it)

Contributor Author

I agree with you. There is an existing unit test for creating a local directory via emptyDir, and this PR adds a unit test for the found-volume case, so both cases should already be covered. Is that ok?

@mccheah
Contributor

mccheah commented Jun 25, 2019

So my impression was that hostPath volumes are unsafe because they break the boundary of isolation between the container and the underlying machine, and the isolation between containers. I also was under the impression that containers have to be run as root to support hostPath volumes. Because of these factors I would be hesitant to advertise supporting this as a first-class feature in the Spark on Kubernetes integration. If a user so desired they can use the existing pod template feature to break glass, and this seems more like a break glass feature in my estimation.

@chenjunjiedada
Contributor Author

Good point @mccheah,

First, I'd like to explain that this patch is not only for hostPath volumes; it also suits other volume types such as PV. It adjusts the feature build order to adopt volumes that are already supported as first-class features.

Second, the hostPath volume can be written by a non-root user if we change the file permissions according to the description here.

Lastly, I think users who care about performance will want this: even though they can define volumes inside a pod template, those volumes are not utilized. I have run spark-sql-perf on Spark on Kubernetes at 1T scale; with this patch, most of the shuffle-bound queries improve a lot, especially q17, q25, and q29. When using 4 disks as local storage instead of emptyDir, the query time improves by more than 10X.

In summary, this patch just helps users easily leverage what they have to improve performance.

@chenjunjiedada
Contributor Author

@vanzin , could you please help to have a look?

@francoisfernando

francoisfernando commented Jul 2, 2019

@mccheah
I have hit a roadblock due to this issue while trying to set spark.local.dir to something other than emptyDir volumes. For example, the following configuration throws an exception when executor pods are provisioned.

--conf spark.local.dir=/spark_tmp \
--conf spark.kubernetes.executor.volumes.hostPath.spaklocal.mount.path=/spark_tmp \
--conf spark.kubernetes.executor.volumes.hostPath.spaklocal.mount.readOnly=false \
--conf spark.kubernetes.executor.volumes.hostPath.spaklocal.options.path=/tmp \

@chenjunjiedada
Contributor Author

Hi @felixcheung @mccheah , Do you think it is ready? thanks.

@dongjoon-hyun
Member

Retest this please.

@shaneknapp
Contributor

test this please

@SparkQA

SparkQA commented Jul 24, 2019

Test build #108131 has finished for PR 24879 at commit 9392dad.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 25, 2019

Test build #108149 has finished for PR 24879 at commit 6e5fcf6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

If none volume is set as local storage, Spark uses temporary scratch space to spill data to disk during shuffles and other operations. When using Kubernetes as the resource manager the pods will be created with an [emptyDir](https://kubernetes.io/docs/concepts/storage/volumes/#emptydir) volume mounted for each directory listed in `SPARK_LOCAL_DIRS`. If no directories are explicitly specified then a default directory is created and configured appropriately.
Contributor

"If no volume is set..."

Should mention the config (spark.local.dir) which is preferred over the env variable.
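For instance, the documentation could show the config form the reviewer suggests mentioning (paths here are purely illustrative):

```shell
# The spark.local.dir config, which the review suggests documenting
# in preference to the SPARK_LOCAL_DIRS env variable.
spark-submit \
  --conf spark.local.dir=/data1/spark-local,/data2/spark-local \
  ...
```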

import java.util.UUID

import io.fabric8.kubernetes.api.model.{ContainerBuilder, HasMetadata, PodBuilder, VolumeBuilder, VolumeMountBuilder}
import collection.JavaConverters._
Contributor

import scala....


var localDirVolumeMounts : Seq[VolumeMount] = Seq()

if (localDirs.isEmpty) {
localDirs = resolvedLocalDirs.toSeq
Contributor

Get rid of resolvedLocalDirs and move that statement here instead.


def findLocalDirVolumeMount(pod: SparkPod): Seq[String] = {
val localDirVolumes = pod.pod.getSpec.getVolumes.asScala
.filter(v => v.getName.startsWith("spark-local-dir-"))
Contributor

nit: either:

.filter { v => ... }

or

.filter(_.getName()...)

Also in a few places below.

val localDirVolumes = pod.pod.getSpec.getVolumes.asScala
.filter(v => v.getName.startsWith("spark-local-dir-"))

localDirVolumes.map { volume => pod.container.getVolumeMounts.asScala
Contributor

Move pod.container... to the next line.

.filter(v => v.getName.startsWith("spark-local-dir-"))

localDirVolumes.map { volume => pod.container.getVolumeMounts.asScala
.find(m => m.getName.equals(volume.getName)) match {
Contributor

You can use == for strings in scala.

case Some(m) => m.getMountPath
case _ => ""
}
}.filter(s => s.length > 0)
Contributor

You can use localDirVolumes.flatMap and avoid this filter.
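With flatMap, using the names from the snippet above, the sketch becomes:

```scala
// flatMap drops the None cases directly, so the empty-string sentinel
// and the trailing filter are no longer needed.
val mountPaths = localDirVolumes.flatMap { volume =>
  pod.container.getVolumeMounts.asScala
    .find(_.getName == volume.getName)
    .map(_.getMountPath)
}
```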

val configuredPod = mountVolumeStep.configurePod(SparkPod.initialPod())

val sparkConf = new SparkConfWithEnv(Map())
val localDirConf = KubernetesTestConf.createDriverConf(sparkConf)
Contributor

You can use the same KubernetesConf for all steps right? That's how it works when spark-submit is run.

Contributor

That is the idea, to keep building out one conf. (There were some early proposals around making these stages "more functional" but those were considered less broadly familiar and less spark idiomatic)


import org.apache.spark.{SparkConf, SparkFunSuite}
import org.apache.spark.deploy.k8s.{KubernetesTestConf, SparkPod}
import org.apache.spark.deploy.k8s.{KubernetesHostPathVolumeConf, KubernetesTestConf, KubernetesVolumeSpec, SparkPod}
Contributor

Use the wildcard when the import line gets too long.

@erikerlandson
Contributor

Just to make sure I understand the rationale - we want the option to automatically create local directories "spark-local-dir-xxx" as a simplified UX (why mess around with a second channel of config when a working dir can be created automatically from the information in the first?)

@vanzin
Contributor

vanzin commented Jul 25, 2019

we want the option to automatically create local directories "spark-local-dir-xxx" as a simplified UX

Right. That's also more similar to other backends, where the local dirs are defined by the cluster manager themselves and users don't have to mess with the Spark configuration.

(Except here they have to, but it can be encapsulated in the pod template.)

@erikerlandson
Contributor

@vanzin thanks, I agree that makes sense

@SparkQA

SparkQA commented Jul 26, 2019

Test build #108190 has finished for PR 24879 at commit 2abb8e9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@vanzin vanzin left a comment

Just some small nits.


import io.fabric8.kubernetes.api.model.{ContainerBuilder, HasMetadata, PodBuilder, VolumeBuilder, VolumeMountBuilder}
import io.fabric8.kubernetes.api.model._
import scala.collection.JavaConverters._
Contributor

Hmm, style checker should have complained about this (scala imports should be separate from others).

}

def findLocalDirVolumeMount(pod: SparkPod): Seq[String] = {
val localDirVolumes = pod.pod.getSpec.getVolumes.asScala
Contributor

Hmm... do you need to list the volumes at all? Can't you just look at the mounts (since they already have the volume name)?
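That is, something along these lines (a sketch; the mount names already carry the `spark-local-dir-` prefix, so the volume list need not be consulted):

```scala
// Filter the container's volume mounts by name prefix and return their
// mount paths, without touching pod.pod.getSpec.getVolumes at all.
def findLocalDirVolumeMounts(pod: SparkPod): Seq[String] =
  pod.container.getVolumeMounts.asScala
    .filter(_.getName.startsWith("spark-local-dir-"))
    .map(_.getMountPath)
    .toSeq
```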

SparkPod(podWithLocalDirVolumes, containerWithLocalDirVolumeMounts)
}

def findLocalDirVolumeMount(pod: SparkPod): Seq[String] = {
Contributor

Mount => Mounts

@SparkQA

SparkQA commented Jul 28, 2019

Test build #108273 has finished for PR 24879 at commit c29dd7a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor

vanzin commented Jul 29, 2019

Merging to master.

@vanzin vanzin closed this in 780d176 Jul 29, 2019
Jeffwan pushed a commit to Jeffwan/spark that referenced this pull request Feb 28, 2020

Closes apache#24879 from chenjunjiedada/SPARK-28042.

Lead-authored-by: Junjie Chen <[email protected]>
Co-authored-by: Junjie Chen <[email protected]>
Signed-off-by: Marcelo Vanzin <[email protected]>