Conversation

@zheniantoushipashi (Contributor) commented Sep 5, 2022

What changes were proposed in this pull request?

Use Spark configuration to set the parameters of the Volcano PodGroup.

Why are the changes needed?

If we can use Spark configuration to specify the PodGroup parameters, it is much more convenient; we don't need a ConfigMap to mount static files.

In our scenario we need to specify the Volcano queue dynamically, and creating a static PodGroup configuration file to mount is not convenient.

Does this PR introduce any user-facing change?

Yes. It adds the following Spark configurations:

spark.kubernetes.scheduler.volcano.podGroup.spec.queue
spark.kubernetes.scheduler.volcano.podGroup.spec.minResources.cpu
spark.kubernetes.scheduler.volcano.podGroup.spec.minResources.memory
spark.kubernetes.scheduler.volcano.podGroup.spec.minMember
spark.kubernetes.scheduler.volcano.podGroup.spec.PriorityClassName
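As an illustration, with this PR these options would be set like any other Spark configuration, e.g. in spark-defaults.conf or via --conf (the queue name and resource values below are hypothetical):

```properties
spark.kubernetes.scheduler.volcano.podGroup.spec.queue=default
spark.kubernetes.scheduler.volcano.podGroup.spec.minResources.cpu=2
spark.kubernetes.scheduler.volcano.podGroup.spec.minResources.memory=4Gi
spark.kubernetes.scheduler.volcano.podGroup.spec.minMember=1
```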

@AmplabJenkins

Can one of the admins verify this patch?

@Yikun (Member) left a comment

Thanks for your contributions!

Actually, we had some discussion on the configuration to support Volcano in the feature step. We made a trade-off between supporting parameters and generality, and finally chose the PodGroup template: #35783

@zheniantoushipashi Or do you have more input here?

also cc @dongjoon-hyun @holdenk

if (minResourceMemory.isDefined) {
minResources.put("memory", new Quantity(minResourceMemory.get))
}
spec.setMinResources(minResources)
@Yikun (Member) commented Sep 6, 2022

This forces an overwrite of the existing minResources and ignores the existing template as soon as any one of these configurations is set.
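To make the concern concrete, a merge that prefers configured values but falls back to the template would avoid that overwrite. This is only an illustrative sketch in plain Scala; the object and parameter names are invented here, not taken from the PR:

```scala
// Illustrative sketch, not the PR's code: merge config-provided minResources
// over the template's values, so setting one key (e.g. memory) no longer
// discards the other keys loaded from the PodGroup template.
object MinResourcesMerge {
  def merged(
      template: Map[String, String],  // values loaded from the template, if any
      cpuConf: Option[String],        // ...minResources.cpu from Spark conf
      memConf: Option[String]         // ...minResources.memory from Spark conf
  ): Map[String, String] = {
    // Start from the template and overwrite only the keys the user configured.
    val withCpu = cpuConf.fold(template)(v => template + ("cpu" -> v))
    memConf.fold(withCpu)(v => withCpu + ("memory" -> v))
  }
}
```

For example, merging a template of cpu=2, memory=4Gi with only a configured memory of 8Gi keeps the template's cpu while overriding memory.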


var spec = pg.getSpec
if (spec == null) spec = new PodGroupSpec
val queue = kubernetesConf.getOption(POD_GROUP_SPEC_QUEUE)
@Yikun (Member) commented

val pg = template.map(client.podGroups.load(_).get).getOrElse(new PodGroup())
val builder = new PodGroupBuilder(pg)
  .editOrNewMetadata()
    .withName(podGroupName)
    .withNamespace(namespace)
  .endMetadata()
  .editOrNewSpec()
  .endSpec()

if (xxx) {
  builder.editOrNewSpec().endSpec()
}

return Seq(builder.build())

If we decide to add parameter support, the builder style might be more stable and easier to maintain.
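The editOrNewSpec idea above only touches fields that are actually configured. A self-contained stand-in for that builder behavior, using simplified hypothetical types (the real code would use fabric8's generated PodGroupBuilder and the Volcano model classes):

```scala
// Simplified, hypothetical stand-in for the fabric8 builder pattern sketched
// above; SpecSketch/PodGroupSketch are NOT the real Volcano model classes.
case class SpecSketch(queue: Option[String] = None, minMember: Option[Int] = None)
case class PodGroupSketch(name: String = "", spec: SpecSketch = SpecSketch())

class PodGroupBuilderSketch(initial: PodGroupSketch) {
  private var pg = initial
  def withName(n: String): this.type = { pg = pg.copy(name = n); this }
  // Conditional setters: leave the template-provided field untouched when the
  // corresponding Spark configuration is absent (mirrors editOrNewSpec).
  def withQueue(q: Option[String]): this.type = {
    q.foreach(v => pg = pg.copy(spec = pg.spec.copy(queue = Some(v)))); this
  }
  def withMinMember(m: Option[Int]): this.type = {
    m.foreach(v => pg = pg.copy(spec = pg.spec.copy(minMember = Some(v)))); this
  }
  def build(): PodGroupSketch = pg
}
```

Starting from a template-derived PodGroup, each setter only changes its own field when the matching option is set, so an unset configuration never erases what the template provided.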

@zheniantoushipashi (Contributor Author) replied

I can refactor this if my proposal is accepted.

@zheniantoushipashi (Contributor Author) commented

I studied your previous discussions, but I still think the change in this pull request is useful. I read this document: https://docs.google.com/document/d/1xgQGRpaHQX6-QH_J9YV2C2Dh6RpXefUpLM7KGkzL6Fg/edit#heading=h.1quyr1r2kr5n . I know Spark will support more schedulers in the future, but the configurations added in this PR are only for the Volcano scheduler, the change is limited to VolcanoFeatureStep.scala, and the configurations are prefixed with spark.kubernetes.scheduler.volcano.podGroup; they are not global Spark configurations for all schedulers.

We can create queues in Volcano arbitrarily, and in our scenario we need to specify the Volcano queue dynamically so that our users can use different queues on demand. If we must create a template file just to use a queue dynamically, it is very inconvenient: I must add code to create a file, or use a ConfigMap to mount a new template file and restart the Kubernetes application. That is not great.

I think the template file and the configuration do not conflict.


Spark driver and executor pods can already use a template to define the pod resources, with Spark configuration overriding the template file; I think the same usage is reasonable here.

@Yikun (Member) commented Sep 7, 2022

@zheniantoushipashi Many thanks for your feedback. Most of the Spark + Volcano users I have been in contact with also configure each job through configuration items.

There is no doubt that the PodGroupTemplate mechanism greatly improves generality. Another earlier consideration was to avoid frequently adding configuration items.

But it is worth noting that PodGroupTemplate differs from PodTemplate: the Pod CRD has a lot of options, while the PodGroup CRD, especially its commonly used fields, is relatively fixed (so configuration items would not be added frequently). The common fields are queue, minResources, and priority. BTW, please note that minMember is a bit special: it is set to 1 so that the driver/job life cycle can be monitored (see more in volcano-sh/volcano#2038) and scheduling stays accurate. Therefore, this parameter should not be set arbitrarily in the Spark + Volcano case.

Therefore, I personally think it is reasonable to use PodGroupTemplate to cover the whole scenario and then increase flexibility through configuration items. But we still need more voices from the Spark community before doing that.

@dongjoon-hyun (Member) left a comment

I understand your requirement, but Apache Spark avoids duplicating even official K8s entities like PriorityClass, @zheniantoushipashi. The custom scheduler configurations are less important than K8s entities, aren't they? Sorry, but I'd recommend closing this PR.

we need to dynamically specify the volcano queue, but it is not convenient to create a static podgroup configuration file to mount

@dongjoon-hyun (Member) commented

Let me close this PR for now. We can continue our discussion on this PR.

@zheniantoushipashi (Contributor Author) commented

@Yikun @dongjoon-hyun OK. If in principle we can't wrap Volcano's configuration in Spark configuration, then I can only implement this feature in our own version, because in our scenario, if the queue used by the scheduler cannot be configured dynamically, the scheduler is basically useless. Compared with YARN, you can't imagine a resource manager that can't specify queues dynamically when starting an application.

@dongjoon-hyun (Member) commented

Compared with YARN, you can't imagine a resource manager that can't specify queues dynamically when starting an application.

This is wrong, because Apache YuniKorn is able to specify queues dynamically, as you can see in #37877, via the existing Apache Spark labels and annotations without any modification, @zheniantoushipashi.
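For reference, the YuniKorn approach relies on Spark's existing pod label configuration rather than new settings. A sketch, with a hypothetical queue name, assuming the `queue` pod label that YuniKorn's documentation describes:

```properties
spark.kubernetes.driver.label.queue=root.default
spark.kubernetes.executor.label.queue=root.default
```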

Because in our scenario, if the queue used by the scheduler cannot be dynamically configured, the scheduler is basically useless.

That doesn't sound like a common, valid scenario. If you need to create a queue per single Spark job, what is the point of scheduling?

@dongjoon-hyun (Member) commented

Let's distinguish creating queues from specifying queues. They are not the same.

@zheniantoushipashi (Contributor Author) commented

Sorry, actually I meant specifying queues. Is there a way to specify a queue dynamically when submitting a Spark application with Volcano? Do you mean that we should follow the YuniKorn way to implement dynamically specified queues for Volcano?

@zheniantoushipashi (Contributor Author) commented

@dongjoon-hyun

@Yikun (Member) commented Sep 15, 2022

Is there a way to specify a queue dynamically when using Volcano to submit a Spark application?

@zheniantoushipashi Currently, as you know, we can only specify the Volcano queue by using a PodGroupTemplate. @dongjoon-hyun meant that Spark + YuniKorn can set the queue by setting a label, but Volcano can't.

Actually, there are mainly two styles for specifying scheduler hints in common schedulers:

  • Using annotation/label
  • Using a separate CRD

In K8s scheduling, the queue is only one of the capabilities and works together with the other PodGroup information. Placing all of it in labels or annotations would be very complicated, and that is not how K8s annotations are meant to be used.

The queue and the other parts of the PodGroup are usually considered together, so the queue is a field of the PodGroup in Volcano. Users can configure all batch scheduling through the PodGroup CRD.
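For context, the separate-CRD style looks like this for Volcano: a minimal PodGroup manifest sketch, where the name, queue, and resource values are placeholders:

```yaml
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
metadata:
  name: spark-job-podgroup    # placeholder name
spec:
  queue: default              # the queue this job is submitted to
  minMember: 1                # 1 for Spark, per the minMember note above
  minResources:
    cpu: "2"
    memory: "4Gi"
  priorityClassName: normal   # optional; must reference an existing PriorityClass
```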

Therefore, for now you might want to use PodGroupTemplate to cover the whole scenario, and add flexibility through configuration items in your own downstream version.

Also cc @william-wang
