
Conversation


@pmackles pmackles commented Apr 9, 2018

When running the Spark driver in a container, such as when using the Mesos dispatcher service, we need to apply the same rules as for executors in order to keep the JVM from exceeding its allotted memory limit and being killed.

Tested manually on the Spark 2.3 branch.
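For context, the executor-side rule the description refers to amounts to requesting the configured memory plus an overhead of max(10% of memory, a 384 MiB floor), unless an explicit overhead is configured. A minimal sketch of that rule; the object, constant, and method names here are illustrative, not Spark's actual internals:

```scala
// Illustrative sketch of the memory-overhead rule described above.
// The names and constants are assumptions mirroring the executor-side
// defaults, not the actual Spark implementation.
object MemoryOverhead {
  val MinOverheadMib = 384      // assumed floor, as for executors
  val OverheadFraction = 0.10   // 10% of the configured memory

  // Total MiB to request from Mesos for a container with `memoryMib`
  // of configured memory and an optional explicit overhead.
  def containerMemory(memoryMib: Int, configuredOverhead: Option[Int]): Int =
    memoryMib + configuredOverhead.getOrElse(
      math.max((memoryMib * OverheadFraction).toInt, MinOverheadMib))
}
```

Under this rule, a 1000 MiB driver gets the 384 MiB floor (1384 MiB total), while a 10240 MiB driver gets the 10% overhead (11264 MiB total).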

@felixcheung
Member

Jenkins, ok to test

@felixcheung
Member

@susanxhuynh

@SparkQA

SparkQA commented Apr 10, 2018

Test build #89091 has finished for PR 21006 at commit 1197c0b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Member

ok to test

@felixcheung
Member

@tnachen

@SparkQA

SparkQA commented Jul 10, 2018

Test build #92795 has finished for PR 21006 at commit 1197c0b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@susanxhuynh
Contributor

cc @skonto @samvantran

@tnachen
Contributor

tnachen commented Aug 1, 2018

jenkins ok to test

```scala
.filter(_.getName.equals("mem"))
.map(_.getScalar.getValue)
.head
assert(1384.0 === taskMem)
```
Contributor


Can we also test the 10% case as well?
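A self-contained sketch of what the requested 10% check would verify; the formula mirrors the executor-side default (the larger of 10% of memory and a 384 MiB floor) and is an assumption here, not the suite's actual helpers:

```scala
// Assumed default-overhead rule, mirroring the executor-side behavior.
def defaultOverheadMib(memMib: Double): Double = math.max(0.10 * memMib, 384.0)

// 10% case: driver memory large enough that 10% (1024) exceeds the floor.
val largeDriverMem = 10240.0
assert(largeDriverMem + defaultOverheadMib(largeDriverMem) == 11264.0)

// Floor case: 10% of 1000 is below 384, so the floor applies,
// matching the 1384.0 assertion in the snippet above.
val smallDriverMem = 1000.0
assert(smallDriverMem + defaultOverheadMib(smallDriverMem) == 1384.0)
```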

@tnachen
Contributor

tnachen commented Aug 1, 2018

Besides the test comment, everything else LGTM

@SparkQA

SparkQA commented Aug 1, 2018

Test build #93914 has finished for PR 21006 at commit 1197c0b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Member

@pmackles ?

@AmplabJenkins

Can one of the admins verify this patch?

@dmcwhorter
Contributor

We just had to build a custom version of 2.4.4 to pull in this change. Are there any plans to merge it into an upcoming release?

@dmcwhorter
Contributor

I created pull request pmackles#1 to update this pull request's branch with the changes on https://github.com/dmcwhorter/spark/tree/dmcwhorter-SPARK-22256.

The updates do the following:

  1. Rebase pmackles/paul-SPARK-22256 onto the latest apache/spark master branch
  2. Add a test case for when the default value of spark.mesos.driver.memoryOverhead is 10% of driver memory as requested in [SPARK-22256][MESOS] - Introduce spark.mesos.driver.memoryOverhead #21006

My hope is that this update makes this pull request ready to merge and include in the next Spark release.

@github-actions

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Jan 17, 2020
@github-actions github-actions bot closed this Jan 18, 2020
dongjoon-hyun pushed a commit that referenced this pull request Dec 15, 2020
### What changes were proposed in this pull request?
This is a simple change to support allocating a specified amount of overhead memory for the driver's Mesos container.  This is already supported for executors.

### Why are the changes needed?
This is needed to keep the driver process from exceeding its memory limit and being killed when running on Mesos.

### Does this PR introduce _any_ user-facing change?
Yes, it adds a `spark.mesos.driver.memoryOverhead` configuration option.  Documentation changes for this option are included in the PR.
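With this option in place, a submission through the Mesos dispatcher might look like the following. This is a hedged sketch: the master URL and memory sizes are placeholders, and the trailing application arguments are elided.

```shell
# Hypothetical invocation: a 4 GiB driver heap plus an explicit 1 GiB
# of container overhead, submitted in cluster mode via the dispatcher.
spark-submit \
  --master mesos://dispatcher-host:7077 \
  --deploy-mode cluster \
  --conf spark.driver.memory=4g \
  --conf spark.mesos.driver.memoryOverhead=1024 \
  ...
```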

### How was this patch tested?
Test cases covering allocation of driver memory overhead are included in the changes.

### Other notes
This is a second attempt to get this change reviewed, accepted and merged.  The original pull request was closed as stale back in January: #21006.
For this pull request, I took the original change by pmackles, rebased it onto the current master branch, and added a test case that was requested in the original code review.
I'm happy to make any further edits or do anything needed so that this can be included in a future Spark release.  I keep having to build custom Spark distributions so that we can use Spark within our Mesos clusters.

Closes #30739 from dmcwhorter/dmcwhorter-SPARK-22256.

Lead-authored-by: David McWhorter <[email protected]>
Co-authored-by: Paul Mackles <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>