@dongjoon-hyun
Member

@dongjoon-hyun dongjoon-hyun commented Nov 15, 2023

What changes were proposed in this pull request?

This PR fixes Spark Standalone documentation table layout.

Why are the changes needed?

BEFORE

  • https://spark.apache.org/docs/3.5.0/spark-standalone.html

AFTER

  • Spark Standalone
[Image: Screenshot 2023-11-15 at 2 40 59 AM]

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Manual review.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the DOCS label Nov 15, 2023
@dongjoon-hyun
Member Author

Could you review this when you have some time, @yaooqinn ?

<table class="table table-striped">
<thead><tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr></thead>
<tr>
<td><code>spark.kubernetes.context</code></td>
Member

This might break the doc search?

Member Author

@dongjoon-hyun dongjoon-hyun Nov 15, 2023

That's true. Let me spin off the Kubernetes part from this PR, @yaooqinn.

@dongjoon-hyun dongjoon-hyun changed the title from "[SPARK-45934][DOCS] Fix spark-standalone.md and running-on-kubernetes.md table layout" to "[SPARK-45934][DOCS] Fix Spark Standalone documentation table layout" Nov 15, 2023
@dongjoon-hyun
Member Author

Could you review this PR, @bjornjorgensen ?

This sets the Memory Overhead Factor that will allocate memory to non-JVM memory, which includes off-heap memory allocations, non-JVM tasks, various systems processes, and <code>tmpfs</code>-based local directories when <code>local.dirs.tmpfs</code> is <code>true</code>. For JVM-based jobs this value will default to 0.10 and 0.40 for non-JVM jobs.
This sets the Memory Overhead Factor that will allocate memory to non-JVM memory, which includes off-heap memory allocations, non-JVM tasks, various systems processes, and <code>tmpfs</code>-based local directories when <code>spark.kubernetes.local.dirs.tmpfs</code> is <code>true</code>. For JVM-based jobs this value will default to 0.10 and 0.40 for non-JVM jobs.
This is done as non-JVM tasks need more non-JVM heap space and such tasks commonly fail with "Memory Overhead Exceeded" errors. This preempts this error with a higher default.
This will be overridden by the value set by <code>spark.driver.memoryOverheadFactor</code> and <code>spark.executor.memoryOverheadFactor</code> explicitly.
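
For readers less familiar with the setting quoted above, here is a minimal sketch of how an overhead factor translates into extra non-JVM memory. It is an illustration only, not Spark's implementation: the function name `memory_overhead_mib` is hypothetical, and the 384 MiB floor and the truncation to whole MiB are assumptions made for the example.

```python
# Illustrative sketch only; not Spark's actual code.
# Assumption: the overhead is the requested memory times the factor,
# truncated to whole MiB, with a floor of 384 MiB.
ASSUMED_MIN_OVERHEAD_MIB = 384


def memory_overhead_mib(memory_mib: int, overhead_factor: float = 0.10) -> int:
    """Return the extra non-JVM memory (in MiB) reserved on top of memory_mib."""
    scaled = int(memory_mib * overhead_factor)
    return max(scaled, ASSUMED_MIN_OVERHEAD_MIB)


if __name__ == "__main__":
    # JVM-based job with the 0.10 default factor from the quoted text
    print(memory_overhead_mib(4096, 0.10))
    # Non-JVM job with the 0.40 default factor from the quoted text
    print(memory_overhead_mib(4096, 0.40))
    # Small container: the assumed floor applies instead of the scaled value
    print(memory_overhead_mib(1024, 0.10))
```

Under these assumptions, a non-JVM job gets roughly four times the overhead of a JVM job with the same memory request, which is the "higher default" the quoted text describes.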
Contributor

Should spark.executor.memoryOverheadFactor be spark.kubernetes.executor.memoryOverheadFactor here?

Member Author

@dongjoon-hyun dongjoon-hyun Nov 15, 2023

This is only for Spark Standalone documentation, @bjornjorgensen 😄

Member Author

It seems that you are looking at the first commit. I removed the K8s part from this PR completely in the latest commit.

Member Author

Here.

$ git diff HEAD~2 --stat
 docs/spark-standalone.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Contributor

Yes, I did read the K8s part.

@bjornjorgensen
Contributor

Thank you for fixing the documentation for K8S and Standalone :)

@dongjoon-hyun
Member Author

> Thank you for fixing the documentation for K8S and Standalone :)

Thanks, but I'm going to handle the K8s part in a new JIRA because of the previous comment.

@dongjoon-hyun
Member Author

Could you review this Spark Standalone documentation PR when you have some time, @huaxingao ?

@huaxingao
Contributor

LGTM Thanks @dongjoon-hyun

@dongjoon-hyun
Member Author

Thank you so much, @huaxingao . Merged to master.

@dongjoon-hyun dongjoon-hyun deleted the SPARK-45934 branch November 15, 2023 22:13
@dongjoon-hyun
Member Author

Also, thank you, @yaooqinn and @bjornjorgensen , too.

dongjoon-hyun added a commit that referenced this pull request Nov 15, 2023
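What changes were proposed in this pull request?

This PR fixes `Spark Standalone` documentation table layout.

Why are the changes needed?

**BEFORE**
- https://spark.apache.org/docs/3.5.0/spark-standalone.html

**AFTER**
- Spark Standalone
<img width="965" alt="Screenshot 2023-11-15 at 2 40 59 AM" src="https://github.com/apache/spark/assets/9700541/281ca898-f252-47c2-8cf3-0504bcdcbfb3">

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Manual review.

Was this patch authored or co-authored using generative AI tooling?

No.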

Closes #43814 from dongjoon-hyun/SPARK-45934.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit e8c2a59)
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun
Member Author

I also cherry-picked this to branch-3.5.
