Backport patches / PRs #22323 and #24879 to Spark 2.4 #28982
dongjoon-hyun
left a comment
Thank you for your first contribution, @hopper-signifyd.
However, the Apache Spark community has a policy of backporting only bug fixes. In other words, we cannot backport a New Feature or Improvement. Since both SPARK-25262 and SPARK-28042 are new features in 3.0.0, could you close this PR?
Hi @dongjoon-hyun, I realize that this does technically backport a feature. However, the feature is backported in order to fix a bug that prevents pods / Spark jobs from launching if they try to mount volumes and assign them as local dirs (even using spark.local.dir). See SPARK-31666 and also kubeflow/spark-operator#828.
Also, @dongjoon-hyun, I wanted to say I really enjoyed your talk on Prometheus at the Spark Summit. I'm looking forward to setting up Prometheus for metrics gathering for my Spark jobs.
Can one of the admins verify this patch?
Thank you for saying that.
BTW,
@dongjoon-hyun Thank you for the explanation. I will close this and open separate PRs.
Thank you, @hopper-signifyd. I'll review #28985.
What changes were proposed in this pull request?
We should backport the fixes for local dirs on Kubernetes to Spark 2.4.
Why are the changes needed?
Running Spark on Kubernetes without being able to use mounted NVMe drives as local storage causes issues on services such as AWS EKS. Upgrading to 3.0 just to fix this bug is more hassle than it's worth for some organizations.
Does this PR introduce any user-facing change?
Technically, yes. This adds the spark.kubernetes.local.dirs.tmpfs setting back to Spark 2.4 from Spark 3. However, there are no "breaking changes" per se.
How was this patch tested?
The tests were backported. Also, we've been running our own custom Spark 2.4.5 build with this patch applied at my org for the past few months.
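For readers unfamiliar with the two settings under discussion, here is a minimal sketch of how they are usually expressed as Spark configuration. It assumes the key names documented for Spark 3.0: volumes whose name starts with spark-local-dir- are picked up as local storage (SPARK-28042), and spark.kubernetes.local.dirs.tmpfs switches the default emptyDir local dirs to RAM-backed storage (SPARK-25262). Whether a 2.4 backport exposes exactly the same keys is an assumption. The helper names and the example paths are hypothetical.

```python
def hostpath_local_dir_conf(suffix, host_path, mount_path):
    """Conf entries that mount a host directory into executors as local storage.

    Assumption: the volume name must start with "spark-local-dir-" for Spark
    to treat the mount as a local dir (behaviour documented for Spark 3.0).
    """
    volume_name = f"spark-local-dir-{suffix}"
    prefix = f"spark.kubernetes.executor.volumes.hostPath.{volume_name}"
    return {
        f"{prefix}.mount.path": mount_path,    # path inside the executor pod
        f"{prefix}.options.path": host_path,   # path on the Kubernetes node
    }


def tmpfs_conf():
    """Conf entry asking Spark to back its emptyDir local dirs with tmpfs (RAM)."""
    return {"spark.kubernetes.local.dirs.tmpfs": "true"}


def to_submit_args(conf):
    """Flatten a conf dict into repeated `--conf key=value` arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    # Hypothetical paths: an NVMe drive mounted on the node at /mnt/nvme0.
    conf = hostpath_local_dir_conf("1", "/mnt/nvme0", "/var/data/spark-local")
    print(" ".join(to_submit_args(conf)))
```

The two helpers are kept separate because the tmpfs setting affects the default emptyDir-backed local dirs, while the hostPath volume is an alternative to them.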