[SPARK-40187][DOCS] Add Apache YuniKorn scheduler docs
#37622
docs/running-on-kubernetes.md
[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
We cannot use the version number alias `next` here because it will be fragile in the future.
- We need to use a concrete version `v1.0.0` link instead.
- However, Apache YuniKorn doesn't provide `1.0.0` yet.
  - https://yunikorn.apache.org/docs/1.0.0/ is broken.
  - The only latest version seems to be https://yunikorn.apache.org/docs/0.12.2/ .
We need Apache YuniKorn community's help here, @yangwwei .
Hi @dongjoon-hyun. This is how the doc site works: `next` is the current under-development version, so we shouldn't use it; that's a good point. But I think we can use the latest stable version, which is what https://yunikorn.apache.org/docs/ points to. Only past versions are accessible via https://yunikorn.apache.org/docs/{VERSION_NUM}; that's why you did not see 1.0.0 there, since 1.0.0 is the current stable version.
If we use a hard-coded version here, e.g. 1.0.0, we will need to come back to update the doc quite often, and I don't feel that is good. So my question is: is it better to use the latest stable version here, or a hard-coded version that will need updates over time? Please let me know, thanks!
That's what I asked here. Apache YuniKorn community should provide 1.0.0 like Apache Spark did.
That is mandatory in order to guarantee when we support something. Please see Volcano example.
> we will need to come back to update the doc quite often,
docs/running-on-kubernetes.md
helm install yunikorn yunikorn/yunikorn --namespace yunikorn
```

the above steps will install the latest version of YuniKorn on an existing Kubernetes cluster.
Please use a specific version instead of recommending the latest version in the doc.
the above -> The above
Apache YuniKorn scheduler docs
dongjoon-hyun
left a comment
Thank you for making a PR, @yangwwei . I reviewed and will hold on this PR to align with other validation.
Can one of the admins verify this patch?
It seems that I wasn't clear enough. We need a specific version number, @yangwwei. -1 for adding a doc without a specific version.
You are absolutely clear. Sure, let me find out how to get this supported by our documentation framework. Thanks.
@dongjoon-hyun could you please take a look at the updated version? Here are the changes:
There is one more thing about our versioned docs. I implemented a temporary workaround in https://issues.apache.org/jira/browse/YUNIKORN-1293, but YuniKorn community folks made a very good point that the workaround wasn't sustainable. Since there are now only the following links in the doc: I think it should be fine not to bind to a specific version. What do you think? YuniKorn folks also pointed out that there are similar docs like the following already: and please let me know your thoughts on this, thanks!
In that case, why don't you add
Specifically, I'm suggesting the following.
Apache Spark does that for the Arrow project's PyArrow support.
BTW, let me test YuniKorn further according to this doc tomorrow, @yangwwei.
According to apache/yunikorn-site#180 (comment), I understand the context (including the rollback), so let's hold off on this until next week (YuniKorn v1.1).
Please see my thoughts w.r.t. this comment:
I don't think that will be necessary. Apache YuniKorn is a scheduler, a replacement for the default scheduler, and it isn't very sensitive to Spark versions. Any Spark version can run on YuniKorn with some necessary configs. In recent versions, with the support of https://issues.apache.org/jira/browse/SPARK-38383, submitting jobs to YuniKorn is even easier (this is what this doc introduces). This is like how we do not need a Spark support matrix for YARN or Kubernetes. Adding such a matrix on the YuniKorn side is cumbersome: we would probably need to list all Spark versions, which isn't useful for end users.
Please take a look at the updated version. I do not think we depend on the 1.0.0 doc link anymore. We just make sure we explicitly set the version in the installation example. For general docs, we can point to https://yunikorn.apache.org.
Got it. Let me think about that again from the Apache Spark user's perspective. I believe we can find a sweet spot that satisfies both communities, @yangwwei. Thank you again for all your contributions and collaboration.
@dongjoon-hyun circling back on this. Does the latest version look good to you?
dongjoon-hyun
left a comment
+1, LGTM. Thank you, @yangwwei . I also tested all existing tests with YuniKorn scheduler.
Merged to master/3.3.
Sorry for being late, @yangwwei. We can move forward more based on this.
### What changes were proposed in this pull request?
Add a section under [customized-kubernetes-schedulers-for-spark-on-kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html#customized-kubernetes-schedulers-for-spark-on-kubernetes) to explain how to run Spark with Apache YuniKorn. This is based on the review comments from #35663.

### Why are the changes needed?
Explain how to run Spark with Apache YuniKorn

### Does this PR introduce _any_ user-facing change?
No

Closes #37622 from yangwwei/SPARK-40187.

Authored-by: Weiwei Yang <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 4b18773)
Signed-off-by: Dongjoon Hyun <[email protected]>
I created a test suite PR, @yangwwei.
Hi @dongjoon-hyun, thanks a lot for helping on this.
Thank YOU, @yangwwei.
