@dongjoon-hyun commented Mar 2, 2022

What changes were proposed in this pull request?

This PR aims to support the APP_ID and EXECUTOR_ID placeholders in K8s annotations in the same way we did for EXECUTOR_JAVA_OPTIONS.
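
For example (a minimal sketch of the new behavior; the yunikorn.apache.org/app-id key is the one exercised by the integration test below), a user can now set:

sparkConf
  .set("spark.kubernetes.driver.annotation.yunikorn.apache.org/app-id", "{{APP_ID}}")
  .set("spark.kubernetes.executor.annotation.yunikorn.apache.org/app-id", "{{APP_ID}}")

and the {{APP_ID}} placeholder is replaced with the actual Spark application ID when the driver and executor pods are built; {{EXECUTOR_ID}} is likewise replaced with the executor ID on the executor side.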

Why are the changes needed?

Although Apache Spark already provides spark-app-id, some custom schedulers are not able to recognize it.

Does this PR introduce any user-facing change?

No, because the pattern strings are very specific.

How was this patch tested?

Pass the CIs and K8s IT.

The K8s IT passed as follows on Docker Desktop K8s:

$ build/sbt -Psparkr -Pkubernetes -Pkubernetes-integration-tests -Dtest.exclude.tags=minikube -Dspark.kubernetes.test.deployMode=docker-for-desktop "kubernetes-integration-tests/test"
[info] KubernetesSuite:
[info] - Run SparkPi with no resources (8 seconds, 789 milliseconds)
[info] - Run SparkPi with no resources & statefulset allocation (8 seconds, 903 milliseconds)
[info] - Run SparkPi with a very long application name. (8 seconds, 586 milliseconds)
[info] - Use SparkLauncher.NO_RESOURCE (8 seconds, 409 milliseconds)
[info] - Run SparkPi with a master URL without a scheme. (8 seconds, 586 milliseconds)
[info] - Run SparkPi with an argument. (8 seconds, 708 milliseconds)
[info] - Run SparkPi with custom labels, annotations, and environment variables. (8 seconds, 626 milliseconds)
[info] - All pods have the same service account by default (8 seconds, 595 milliseconds)
[info] - Run extraJVMOptions check on driver (4 seconds, 324 milliseconds)
[info] - Run SparkRemoteFileTest using a remote data file (8 seconds, 424 milliseconds)
[info] - Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j2.properties (13 seconds, 42 milliseconds)
[info] - Run SparkPi with env and mount secrets. (16 seconds, 600 milliseconds)
[info] - Run PySpark on simple pi.py example (11 seconds, 479 milliseconds)
[info] - Run PySpark to test a pyfiles example (10 seconds, 669 milliseconds)
[info] - Run PySpark with memory customization (8 seconds, 604 milliseconds)
[info] - Run in client mode. (7 seconds, 349 milliseconds)
[info] - Start pod creation from template (8 seconds, 779 milliseconds)
[info] - Test basic decommissioning (42 seconds, 970 milliseconds)
[info] - Test basic decommissioning with shuffle cleanup (42 seconds, 650 milliseconds)
[info] - Test decommissioning with dynamic allocation & shuffle cleanups (2 minutes, 41 seconds)
[info] - Test decommissioning timeouts (43 seconds, 340 milliseconds)
[info] - SPARK-37576: Rolling decommissioning (1 minute, 6 seconds)
[info] - Run SparkR on simple dataframe.R example (11 seconds, 645 milliseconds)

@dongjoon-hyun
cc @yangwwei

@dongjoon-hyun changed the title to [SPARK-38383][K8S] Support APP_ID and EXECUTOR_ID placeholder in annotations Mar 2, 2022
@dongjoon-hyun

Could you review this when you have some time, please, @viirya?

.set("spark.kubernetes.executor.label.label2", "label2-value")
.set("spark.kubernetes.executor.annotation.annotation1", "annotation1-value")
.set("spark.kubernetes.executor.annotation.annotation2", "annotation2-value")
.set("spark.kubernetes.executor.annotation.yunikorn.apache.org/app-id", "{{APP_ID}}")
@dongjoon-hyun
This integration test covers driver and executor cases together.
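
For reference, the substitution itself amounts to a plain string replacement over the annotation values. A minimal sketch in Scala, assuming a hypothetical helper (substitutePlaceholders is not the actual Spark method name):

// Hypothetical sketch, not the actual Spark code: resolve the
// {{APP_ID}} and {{EXECUTOR_ID}} placeholders in annotation values.
def substitutePlaceholders(
    annotations: Map[String, String],
    appId: String,
    execId: Option[String]): Map[String, String] = {
  annotations.map { case (key, value) =>
    val resolved = value.replace("{{APP_ID}}", appId)
    // The executor ID only exists on the executor side, so it is optional here.
    key -> execId.fold(resolved)(id => resolved.replace("{{EXECUTOR_ID}}", id))
  }
}

For example, Map("yunikorn.apache.org/app-id" -> "{{APP_ID}}") with appId = "spark-pi-1234" resolves to Map("yunikorn.apache.org/app-id" -> "spark-pi-1234").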

@dongjoon-hyun

Thank you so much, @viirya.
Merged to master for Apache Spark 3.3.
