[SPARK-39360][K8S] Remove deprecation of spark.kubernetes.memoryOverheadFactor and recover doc
#36744
Conversation
<td>
  This sets the Memory Overhead Factor that will allocate memory to non-JVM memory, which includes off-heap memory allocations, non-JVM tasks, various systems processes, and <code>tmpfs</code>-based local directories when <code>spark.kubernetes.local.dirs.tmpfs</code> is <code>true</code>. For JVM-based jobs this value will default to 0.10 and 0.40 for non-JVM jobs.
  This is done as non-JVM tasks need more non-JVM heap space and such tasks commonly fail with "Memory Overhead Exceeded" errors. This preempts this error with a higher default.
  This will be overridden by the value set by <code>spark.driver.memoryOverheadFactor</code> and <code>spark.executor.memoryOverheadFactor</code> explicitly.
This last sentence is newly added.
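To make that precedence concrete, here is a minimal sketch of the lookup order the new sentence describes, shown for the driver role. The helper `resolveOverheadFactor` and the plain `Map[String, String]` stand-in for the conf are illustrative assumptions, not Spark's actual internals:

```scala
// Minimal sketch (driver role): an explicitly set role-specific factor wins,
// then the K8s-wide factor, then the default (0.10 for JVM, 0.40 for non-JVM jobs).
def resolveOverheadFactor(conf: Map[String, String], isJvmJob: Boolean): Double = {
  val default = if (isJvmJob) 0.10 else 0.40
  conf.get("spark.driver.memoryOverheadFactor")                 // explicit new config first
    .orElse(conf.get("spark.kubernetes.memoryOverheadFactor")) // then the K8s-specific key
    .map(_.toDouble)
    .getOrElse(default)
}

// resolveOverheadFactor(Map(
//   "spark.kubernetes.memoryOverheadFactor" -> "0.3",
//   "spark.driver.memoryOverheadFactor"     -> "0.2"), isJvmJob = true)  // => 0.2
```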
This looks fine to me. If the configuration is set by default, we need a follow-up issue. I can look more tomorrow.
Thank you for the review, @tgravescs. Yes, I guess we can do a clean deprecation during the Apache Spark 3.4 timeframe. For Spark 3.3.0, it will be enough to deliver the new generalized configurations first.
Thank you, @tgravescs and @huaxingao. Merged to master/3.3.
[SPARK-39360][K8S] Remove deprecation of `spark.kubernetes.memoryOverheadFactor` and recover doc

### What changes were proposed in this pull request?

This PR aims to avoid the deprecation of `spark.kubernetes.memoryOverheadFactor` in Apache Spark 3.3. In addition, it recovers the documentation that was mistakenly removed at the `deprecation`. `Deprecation` is not a removal.

### Why are the changes needed?

- Apache Spark 3.3.0 RC always complains about `spark.kubernetes.memoryOverheadFactor` because the configuration has a default value (which is not given by the users). There is no way to remove the warnings, which means the directional message is not helpful and confuses the users in a wrong way. In other words, we still get warnings even if we use only the new configurations or no configuration at all.

```
22/06/01 23:53:49 WARN SparkConf: The configuration key 'spark.kubernetes.memoryOverheadFactor' has been deprecated as of Spark 3.3.0 and may be removed in the future. Please use spark.driver.memoryOverheadFactor and spark.executor.memoryOverheadFactor
22/06/01 23:53:49 WARN SparkConf: The configuration key 'spark.kubernetes.memoryOverheadFactor' has been deprecated as of Spark 3.3.0 and may be removed in the future. Please use spark.driver.memoryOverheadFactor and spark.executor.memoryOverheadFactor
22/06/01 23:53:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/06/01 23:53:50 WARN SparkConf: The configuration key 'spark.kubernetes.memoryOverheadFactor' has been deprecated as of Spark 3.3.0 and may be removed in the future. Please use spark.driver.memoryOverheadFactor and spark.executor.memoryOverheadFactor
```

- The minimum constraint is slightly different because `spark.kubernetes.memoryOverheadFactor` has allowed `0` since Apache Spark 2.4, while the new configurations disallow `0`.

- The documentation removal might be too early because the deprecation is not the removal of the configuration. This PR recovers the removed doc and adds the following:

```
This will be overridden by the value set by <code>spark.driver.memoryOverheadFactor</code> and <code>spark.executor.memoryOverheadFactor</code> explicitly.
```

### Does this PR introduce _any_ user-facing change?

No. This is consistent with the existing behavior.

### How was this patch tested?

Pass the CIs.

Closes #36744 from dongjoon-hyun/SPARK-39360.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 6d43556)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
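For users who want the warning-free path on Spark 3.3+, a short example of setting the new generalized keys via the standard `SparkSession` builder API; the application name and factor values here are arbitrary placeholders:

```scala
import org.apache.spark.sql.SparkSession

// Set the resource-manager-agnostic keys introduced in Spark 3.3.
// spark.kubernetes.memoryOverheadFactor keeps working on K8s, but these
// take precedence when set explicitly.
val spark = SparkSession.builder()
  .appName("memory-overhead-example") // placeholder name
  .config("spark.driver.memoryOverheadFactor", "0.2")
  .config("spark.executor.memoryOverheadFactor", "0.2")
  .getOrCreate()
```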
Filed https://issues.apache.org/jira/browse/SPARK-39363 as a follow-up.
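As background for that follow-up, a hedged sketch of why a key with a built-in default trips a presence-based deprecation check; the types and logic below are illustrative, not SparkConf's actual code:

```scala
// Illustrative only: a deprecation warning keyed on the key being present.
case class DeprecatedKey(key: String, version: String, alternatives: String)

val deprecated = Seq(DeprecatedKey(
  "spark.kubernetes.memoryOverheadFactor", "3.3.0",
  "spark.driver.memoryOverheadFactor and spark.executor.memoryOverheadFactor"))

def warnIfDeprecated(conf: Map[String, String]): Unit =
  for (d <- deprecated if conf.contains(d.key))
    println(s"WARN SparkConf: The configuration key '${d.key}' has been deprecated " +
      s"as of Spark ${d.version} and may be removed in the future. " +
      s"Please use ${d.alternatives}")

// Because a default value for the key ends up in the conf, the check above
// fires even when the user never set it. That is the confusing behavior this
// PR removes by dropping the deprecation.
```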