[SPARK-52830][K8S] Support spark.kubernetes.(driver|executor).pod.excludedFeatureSteps
#51522
cc @peter-toth

Could you review this PR too when you have some time, @viirya?
    val KUBERNETES_DRIVER_POD_EXCLUDED_FEATURE_STEPS =
      ConfigBuilder("spark.kubernetes.driver.pod.excludedFeatureSteps")
        .doc("Class names to exclude from driver pod feature steps. Comma separated.")
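The doc string above says the value is a comma-separated list of class names. As a minimal sketch of how such a value could be turned into a list, the helper below splits on commas, trims whitespace, and drops empty entries. `ExcludedStepsParsing` and `parseClassNames` are hypothetical names used only for illustration; they are not part of Spark, and Spark's own parsing may differ in detail.

```scala
object ExcludedStepsParsing {
  // Hypothetical helper (not a Spark API): converts a comma-separated
  // configuration value into a sequence of class names.
  def parseClassNames(raw: String): Seq[String] =
    raw
      .split(",")          // split on the separator named in the doc string
      .map(_.trim)         // tolerate spaces around each entry
      .filter(_.nonEmpty)  // ignore empty entries such as a trailing comma
      .toSeq
}
```

Under these assumptions, an unset or empty value yields an empty list, which would mean that nothing is excluded.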
Can this be used to exclude Spark internal feature steps too?
A few questions.
Is there any feature step that is optional and that users would probably want to exclude?
This is different from #30206, which is used to add extra feature steps on top of the Spark internal feature steps. If we allow users to exclude any feature step, won't that be a possible (security) issue?
Thank you for the review, @viirya.

> Can this be used to exclude Spark internal feature steps too?

Yes, this is designed to do that, as I wrote in the PR description.

> Is there any feature step that are optional and users probably want to exclude?

There are two common cases.
- For built-in steps, a user can exclude `KerberosConfDriverFeatureStep` when they don't use a `Kerberos` system. This is also useful
- For user-provided steps, when a production system provides many steps via `spark.kubernetes.(driver|executor).pod.featureSteps`, a user can selectively exclude one or two with this new configuration.

> If we allow users to exclude any feature steps, won't it be a possible (security) issue?

The steps provide credential-like information, so a job that excludes them will simply be unable to access the backend system. That is the user's own risk.
> For built-in steps, a user can exclude `KerberosConfDriverFeatureStep` when they don't use a `Kerberos` system. This is also useful
> For user-provided steps, when a production system provides many steps via `spark.kubernetes.(driver|executor).pod.featureSteps`, a user can selectively exclude one or two with this new configuration.

I think it may only be useful for internal steps. For user-provided steps, users can simply remove unnecessary steps from `spark.kubernetes.(driver|executor).pod.featureSteps`. There is no need to add steps and then exclude them using two lists.
Thank you. You are right; that is logical. For the user-defined ones, the main case is when the platform team provides `spark.kubernetes.(driver|executor).pod.featureSteps` by default and a user wants to customize their own jobs.
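The platform-default scenario mentioned above can be sketched as follows. This is only an illustration under stated assumptions: configuration is modeled as a plain `Map[String, String]`, job-level settings are assumed to override platform defaults key by key, and `PlatformDefaultsExample`, `effectiveUserSteps`, and the `com.example.*` class names are hypothetical. Only the two configuration keys come from the discussion.

```scala
object PlatformDefaultsExample {
  val FeatureStepsKey = "spark.kubernetes.driver.pod.featureSteps"
  val ExcludedKey = "spark.kubernetes.driver.pod.excludedFeatureSteps"

  // Reads a comma-separated value for `key`, treating a missing key as empty.
  private def csv(conf: Map[String, String], key: String): Seq[String] =
    conf.getOrElse(key, "").split(",").map(_.trim).filter(_.nonEmpty).toSeq

  // Hypothetical model: job settings are layered over platform defaults,
  // then any user-provided step named in the exclusion list is dropped.
  def effectiveUserSteps(
      platformDefaults: Map[String, String],
      jobOverrides: Map[String, String]): Seq[String] = {
    val merged = platformDefaults ++ jobOverrides
    val excluded = csv(merged, ExcludedKey).toSet
    csv(merged, FeatureStepsKey).filterNot(excluded.contains)
  }
}
```

In this sketch the job never has to restate the platform's `featureSteps` list; it only names the one or two steps it wants to skip.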
### What changes were proposed in this pull request?

This PR aims to upgrade Spark to `4.1.0-preview2` for `4.0.1`.

### Why are the changes needed?

Since Apache Spark 4.1.0 is planned next month, we had better prepare to use new features via using `4.1.0-preview2` (September) and `4.1.0-preview2 (October)` gradually.

- apache/spark#51678
- apache/spark#51522
- apache/spark#50925

### Does this PR introduce _any_ user-facing change?

No behavior change.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #364 from dongjoon-hyun/SPARK-53787.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
…usage example

### What changes were proposed in this pull request?

This PR aims to add `spark.kubernetes.driver.pod.excludedFeatureSteps` usage example.

### Why are the changes needed?

Since `Apache Spark K8s Operator v0.6.0`, we can use new features from `Apache Spark 4.1.0-preview2`.

- #364

For example, `spark.kubernetes.driver.pod.excludedFeatureSteps` allows us to skip the specific feature steps. If we set the following, we can skip `KerberosConfDriverFeatureStep` which fails always on Java 25 environment currently.

- apache/spark#51522

```
spark.kubernetes.driver.pod.excludedFeatureSteps: "org.apache.spark.deploy.k8s.features.KerberosConfDriverFeatureStep"
```

**BEFORE**

```
25/10/09 04:22:23 INFO pi-preview default o.a.s.i.Logging You have not specified a krb5.conf file locally or via a ConfigMap. Make sure that you have the krb5.conf locally on the driver image.
WARNING: A restricted method in java.lang.System has been called
WARNING: java.lang.System::loadLibrary has been called by org.apache.hadoop.util.NativeCodeLoader in an unnamed module (file:/opt/spark-operator/operator/spark-kubernetes-operator.jar)
WARNING: Use --enable-native-access=ALL-UNNAMED to avoid a warning for callers in this module
WARNING: Restricted methods will be blocked in a future release unless native access is enabled
25/10/09 04:22:23 WARN pi-preview default o.a.h.u.NativeCodeLoader Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
25/10/09 04:22:23 WARN pi-preview default o.a.s.i.Logging Fail to get credentials
java.lang.UnsupportedOperationException: getSubject is not supported
	at java.base/javax.security.auth.Subject.getSubject(Subject.java:277)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:588)
	at org.apache.spark.deploy.k8s.features.KerberosConfDriverFeatureStep.liftedTree1$1(KerberosConfDriverFeatureStep.scala:95)
```

**AFTER**

```
25/10/09 04:20:34 INFO pi-preview default o.a.s.i.Logging You have not specified a krb5.conf file locally or via a ConfigMap. Make sure that you have the krb5.conf locally on the driver image.
```

### Does this PR introduce _any_ user-facing change?

No. This is an example update.

### How was this patch tested?

Manual review.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #379 from dongjoon-hyun/SPARK-53852.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
### What changes were proposed in this pull request?

This PR aims to document newly added K8s configurations as a part of Apache Spark 4.1.0 preparation.

### Why are the changes needed?

To sync the document with K8s `Config.scala`. For now, three PRs added four configurations.

- #51522
- #51811
- #52615

### Does this PR introduce _any_ user-facing change?

No behavior change. This is only adding new configuration documents.

### How was this patch tested?

Manual review.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #52618 from dongjoon-hyun/SPARK-53913.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
What changes were proposed in this pull request?
This PR aims to support the `spark.kubernetes.(driver|executor).pod.excludedFeatureSteps` configuration.
Why are the changes needed?
Since Apache Spark 3.2, we have been providing `spark.kubernetes.(driver|executor).pod.featureSteps`.
This PR aims to allow users to exclude feature steps selectively via configuration. Please note that this is designed to allow excluding any step, including both built-in and user-provided steps.
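As a rough illustration of the selection behavior described here (not Spark's actual implementation), the sketch below models each feature step by its class name only. `FeatureStepSelection`, `Step`, `selectSteps`, and `com.example.AuditStep` are hypothetical names. Matching on the fully qualified class name is an assumption, chosen to be consistent with the doc string "Class names to exclude".

```scala
object FeatureStepSelection {
  // Simplified stand-in for a feature step: only the class name is modeled.
  final case class Step(className: String)

  // Hypothetical selection logic: built-in and user-provided steps are
  // combined, and any step whose class name is in `excluded` is dropped.
  // Exclusion applies uniformly to both kinds of steps.
  def selectSteps(
      builtIn: Seq[Step],
      userProvided: Seq[Step],
      excluded: Set[String]): Seq[Step] =
    (builtIn ++ userProvided).filterNot(step => excluded.contains(step.className))
}
```

With an empty exclusion set the full list is returned unchanged, which matches the statement that existing behavior is unaffected unless the new configuration is set.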
Does this PR introduce any user-facing change?
No, because this is a new feature.
How was this patch tested?
Pass the CIs with newly added test cases.
Was this patch authored or co-authored using generative AI tooling?
No.