[SPARK-42769][K8S] Add SPARK_DRIVER_POD_IP env variable to executor pods #40392
dongjoon-hyun wants to merge 2 commits into apache:master from dongjoon-hyun:SPARK-42769
Hi, @HyukjinKwon. Could you review this PR when you have some time?
Title changed from "ENV_DRIVER_POD_IP env variable to executor pods" to "SPARK_DRIVER_POD_IP env variable to executor pods"
Hi @dongjoon-hyun, I think it's quite useful, but in #39160 (review) you raised a concern.
Do you still have that concern now?
@pan3793. The goal of this PR is different from your PR's goal.
In addition, this propagates information from the driver pod to the executor pods instead of exposing the executor pods' internal information.
Hi, @viirya. Could you review this PR when you have some time?
val UI_PORT_NAME = "spark-ui"

// Environment Variables
val ENV_DRIVER_POD_IP = "SPARK_DRIVER_POD_IP"
Is it different from DRIVER_HOST_ADDRESS? I see that K8s uses DRIVER_HOST_ADDRESS to derive the driver URL for the env var ENV_DRIVER_URL.
Thank you for the review, @viirya.
In K8s, DRIVER_HOST_ADDRESS is protected by DriverServiceFeatureStep here.
It's because we inject it like this systematically.
However, when DNS doesn't work, we need the driver's IP, which has been unknown to the executor pods so far.
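As a rough illustration of the DNS-failure case described above, an executor-side script could fall back to the injected IP when the driver hostname does not resolve. This is only a sketch and not part of this PR: `resolve_driver_address` is a hypothetical helper, and relying on `getent hosts` as the DNS check is an assumption about what the container image provides.

```bash
# Hypothetical helper (not from this PR): choose an address for reaching the driver.
# Prefers the given hostname if it resolves; otherwise falls back to the
# SPARK_DRIVER_POD_IP environment variable injected into executor pods.
resolve_driver_address() {
  local host="${1:-}"
  # Assumption: `getent` is available in the image for a DNS lookup.
  if [ -n "$host" ] && getent hosts "$host" >/dev/null 2>&1; then
    echo "$host"
  elif [ -n "${SPARK_DRIVER_POD_IP:-}" ]; then
    echo "$SPARK_DRIVER_POD_IP"
  else
    echo "no reachable driver address" >&2
    return 1
  fi
}
```

Calling `resolve_driver_address "$some_driver_hostname"` inside an executor pod would print either the hostname or the pod IP, and return non-zero if neither is available.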
viirya left a comment
It is not used in this PR. Will you make other changes to the K8s executor pods to use this env var?
Yes, correct, @viirya! Thank you for the approval.
Merged to master for Apache Spark 3.5.0.
… pods
### What changes were proposed in this pull request?
Like `SPARK_EXECUTOR_POD_IP`, this PR aims to add a new environment variable `SPARK_DRIVER_POD_IP` (defined as the constant `ENV_DRIVER_POD_IP`) to all executor pods.
```bash
$ kubectl get pod pi-exec-1 -oyaml | grep -C1 SPARK_DRIVER_POD_IP
      value: "0"
    - name: SPARK_DRIVER_POD_IP
      value: 10.1.0.99
```
### Why are the changes needed?
This is helpful for executor pods that need to connect to the driver pod via IP.
### Does this PR introduce _any_ user-facing change?
No, this is a new environment variable.
### How was this patch tested?
Pass the CIs with the newly added test case.
Closes apache#40392 from dongjoon-hyun/SPARK-42769.
Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
…ARK_DRIVER_POD_IP}}:{{SPARK_UI_PORT}}`
### Why are the changes needed?
We are using [virtual-kubelet](https://github.com/virtual-kubelet/virtual-kubelet) for Spark on Kubernetes, so Spark pods may be allocated across Kubernetes clusters.
We use the driver pod IP as the driver host (see apache/spark#40392), which is supported since Spark 3.5.
The Kubernetes context and namespace are virtual, so we cannot build the app URL from the Spark driver service.
The Spark driver pod IP is accessible in our use case, so this PR builds the Spark app URL from the Spark driver pod IP and the Spark UI port.
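To make the idea concrete, the following sketch fills a URL pattern with the driver pod IP and UI port. It is not Kyuubi's implementation: `render_app_url` is a hypothetical helper, the placeholder names mirror the ones visible in the commit title, and the `http://` prefix and port `4040` (Spark's default UI port) in the usage note are assumed example values.

```bash
# Hypothetical helper (not Kyuubi code): substitute the driver pod IP and
# UI port into a URL pattern containing {{...}} placeholders.
render_app_url() {
  local pattern="$1" pod_ip="$2" ui_port="$3"
  # Keep the placeholders in variables so their braces are matched literally.
  local ip_key='{{SPARK_DRIVER_POD_IP}}'
  local port_key='{{SPARK_UI_PORT}}'
  local url
  url=${pattern//"$ip_key"/$pod_ip}
  url=${url//"$port_key"/$ui_port}
  echo "$url"
}
```

For example, `render_app_url 'http://{{SPARK_DRIVER_POD_IP}}:{{SPARK_UI_PORT}}' 10.1.0.99 4040` prints `http://10.1.0.99:4040`.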
### How was this patch tested?
UT.
<img width="1532" height="626" alt="image" src="https://github.com/user-attachments/assets/5cb54602-9e79-40b7-b51c-0b873c17560b" />
<img width="710" height="170" alt="image" src="https://github.com/user-attachments/assets/6d1c9580-62d6-423a-a04f-dc6cdcee940a" />
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #7141 from turboFei/app_url_v2.
Closes #7141
1277952 [Wang, Fei] VAR
d15e6be [Cheng Pan] Update kyuubi-server/src/main/scala/org/apache/kyuubi/engine/KubernetesApplicationOperation.scala
1535e00 [Wang, Fei] spark driver pod ip
Lead-authored-by: Wang, Fei <fwang12@ebay.com>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>