-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-47730][K8S] Support APP_ID and EXECUTOR_ID placeholders in labels
#46149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-47730][K8S] Support APP_ID and EXECUTOR_ID placeholders in labels
#46149
Conversation
|
cc @dongjoon-hyun |
APP_ID and EXECUTOR_ID placeholders in labels
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I understand the intention, let's isolate yunikorn related feature inside YuniKornSuite.scala.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, we mixed that and it turns that it was not good in a long term perspective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood. Actually the label key used here can be any string. I've updated it to use a general label key as well as a general annotation key in this test. Also fixed the typo APPID -> APP_ID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you try to add this test coverage in YuniKornSuite.scala?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better if we can move all yunikorn stuff into YuniKornSuite .
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for making a PR, @jshmchenxi .
I'm interested in your proposal and use case. Could you address my review comments?
|
Gentle ping, @jshmchenxi . |
|
It's been a busy week, sorry for the delay. I'll address your comments today, thanks! @dongjoon-hyun |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dongjoon-hyun Thanks for the suggestions! I've updated the label key in the unit test to use a general one instead of the app id key of YuniKorn.
And in our use case, we could use any custom label key for the app id. We are reusing yunikorn.apache.org/app-id for simplicity and it is in the new standards YuniKorn labels - YUNIKORN-1351 - but it is not yet released. So there is no need to put the labels in the YuniKornSuite yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood. Actually the label key used here can be any string. I've updated it to use a general label key as well as a general annotation key in this test. Also fixed the typo APPID -> APP_ID.
391609f to
0231b0d
Compare
Currently, only the pod annotations supports `APP_ID` and `EXECUTOR_ID` placeholders. This commit aims to add the same function to pod labels. The use case is to support using customized labels for availability zone based topology pod affinity. We want to use the Spark application ID as the customized label value, to allow Spark executor pods to run in the same availability zone as Spark driver pod. Although we can use the Spark internal label `spark-app-selector` directly, this is not a good practice when using it along with YuniKorn Gang Scheduling. When Gang Scheduling is enabled, the YuniKorn placeholder pods should use the same affinity as real Spark pods. In this way, we have to add the internal `spark-app-selector` label to the placeholder pods. This is not good because the placeholder pods could be recognized as Spark pods in the monitoring system. Thus we propose supporting the `APP_ID` and `EXECUTOR_ID` placeholders in Spark pod labels as well for flexibility.
Also fix the typo value `APPID` -> `APP_ID` in `BasicDriverFeatureStepSuite`.
0231b0d to
528539c
Compare
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. Thank you for updating, @jshmchenxi .
Merged to master for Apache Spark 4.0.0.
|
I added you (Xi Chen) to Apache Spark contributor group and assigned SPARK-47730 to you. |
|
Glad to join the community. Thank you! @dongjoon-hyun |
What changes were proposed in this pull request?
Currently, only the pod annotations supports
APP_IDandEXECUTOR_IDplaceholders. This commit aims to add the same function to pod labels.Why are the changes needed?
The use case is to support using customized labels for availability zone based topology pod affinity. We want to use the Spark application ID as the customized label value, to allow Spark executor pods to run in the same availability zone as Spark driver pod.
Although we can use the Spark internal label
spark-app-selectordirectly, this is not a good practice when using it along with YuniKorn Gang Scheduling. When Gang Scheduling is enabled, the YuniKorn placeholder pods should use the same affinity as real Spark pods. In this way, we have to add the internalspark-app-selectorlabel to the placeholder pods. This is not good because the placeholder pods could be recognized as Spark pods in the monitoring system.Thus we propose supporting the
APP_IDandEXECUTOR_IDplaceholders in Spark pod labels as well for flexibility.Does this PR introduce any user-facing change?
No because the pattern strings are very specific.
How was this patch tested?
Unit tests.
Was this patch authored or co-authored using generative AI tooling?
No