@yangwwei
Contributor

What changes were proposed in this pull request?

Add a section under customized-kubernetes-schedulers-for-spark-on-kubernetes to explain how to run Spark with Apache YuniKorn. This is based on the review comments from #35663.

Why are the changes needed?

Explain how to run Spark with Apache YuniKorn

Does this PR introduce any user-facing change?

No

@dongjoon-hyun changed the title from "[SPARK-40187] Add doc for using Apache YuniKorn as a customized scheduler" to "[SPARK-40187][DOCS] Add Apache YuniKorn scheduler docs" on Aug 23, 2022
[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
Member

We cannot use the version alias `next` here because the link will be fragile in the future.

We need Apache YuniKorn community's help here, @yangwwei .

Contributor Author

Hi @dongjoon-hyun, this is how the doc site works: `next` is the current under-development version, so we shouldn't use it; that's a good point. But I think we can use the latest stable version, which is served at https://yunikorn.apache.org/docs/. Only past versions are accessible via https://yunikorn.apache.org/docs/{VERSION_NUM}; that's why you did not see 1.0.0 there, since 1.0.0 is the current stable version.

If we use a hard-coded version, e.g. 1.0.0, here, we will need to come back to update the doc quite often, and I don't feel that is good. So my question is: is it better to use the latest stable version here, or a hard-coded version that will need updates over time? Please let me know, thanks!

Member

That's what I asked here. Apache YuniKorn community should provide 1.0.0 like Apache Spark did.

That is mandatory in order to guarantee when we support something. Please see Volcano example.

we will need to come back to update the doc quite often,

```
helm install yunikorn yunikorn/yunikorn --namespace yunikorn
```

the above steps will install the latest version of YuniKorn on an existing Kubernetes cluster.
Member

Please use a specific version instead of recommending the latest version in the doc.

Member

the above -> The above

@dongjoon-hyun changed the title from "[SPARK-40187][DOCS] Add Apache YuniKorn scheduler docs" to "[SPARK-40187][DOCS] Add Apache YuniKorn scheduler docs" on Aug 23, 2022
Member

@dongjoon-hyun left a comment

Thank you for making a PR, @yangwwei . I reviewed and will hold on this PR to align with other validation.

@AmplabJenkins

Can one of the admins verify this patch?

Member

@dongjoon-hyun left a comment

It seems that I wasn't clear enough to you. We need a specific version number, @yangwwei . -1 for adding a doc without a version.

https://yunikorn.apache.org/docs/get_started/core_features.

@yangwwei
Contributor Author

It seems that I wasn't clear enough to you. We need a specific version number, @yangwwei . -1 for adding a doc without a version.

https://yunikorn.apache.org/docs/get_started/core_features.

You are absolutely clear. Sure, let me find out how to get this supported in our documentation framework. Thanks.

@yangwwei
Contributor Author

yangwwei commented Aug 25, 2022

@dongjoon-hyun could you please take a look at the updated version? Here are the changes:

  1. Used 1.0.0 in the installation example to comply with Spark community requirements
  2. Removed the "Work with YuniKorn queues" as that adds complexity to the doc, and for Spark users, that's not something that has to be done
  3. Addressed other review comments

There is one more thing about our versioned docs. I implemented a temporary workaround in https://issues.apache.org/jira/browse/YUNIKORN-1293, but YuniKorn community folks made a very good point that the workaround isn't sustainable. Now there are only the following links in the doc:

I think it should be fine not to bind to a specific version; what do you think? YuniKorn folks also pointed out that there are already similar docs, like the following:

Volcano feature steps help users to create a Volcano PodGroup and set driver/executor pod annotation to link with this [PodGroup](https://volcano.sh/en/docs/podgroup/).

and

Volcano defines PodGroup spec using [CRD yaml](https://volcano.sh/en/docs/podgroup/#example). 

Please let me know your thoughts on this, thanks!

@dongjoon-hyun
Member

dongjoon-hyun commented Aug 25, 2022

In that case, why don't you add an Apache Spark Support Matrix to the Apache YuniKorn page?

I think it should be fine not to bind to a specific version; what do you think?

Specifically, I'm suggesting the following.

  • Apache Spark simply adds a forward link to Apache YuniKorn's Support Matrix page.
  • The Apache YuniKorn community tests and maintains that page (for v1.0.0 or a future release) on its side.

Apache Spark does that for Arrow project's PyArrow support.

@dongjoon-hyun
Member

BTW, let me test YuniKorn further according to this doc tomorrow, @yangwwei .

@dongjoon-hyun
Member

According to apache/yunikorn-site#180 (comment) , I understand the context (including rollback) and let's hold on this until next week (YuniKorn v1.1).

@yangwwei
Contributor Author

yangwwei commented Aug 25, 2022

Hi @dongjoon-hyun

Please see my thoughts w.r.t this comment

In that case, why don't you add Apache Spark Support Matrix to Apache YuniKorn page?

I don't think that will be necessary. Apache YuniKorn is a scheduler, a replacement for the default scheduler, so it isn't very sensitive to Spark versions. Any Spark version can run on YuniKorn with some necessary configs. In recent versions, with the support of https://issues.apache.org/jira/browse/SPARK-38383, submitting jobs to YuniKorn is even easier (this is what this doc introduces). In the same way, we do not need a Spark support matrix in YARN or Kubernetes.

Adding such a matrix on the YuniKorn side is cumbersome: we would probably need to list all Spark versions, which isn't useful for end users.
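
For illustration only, the "necessary configs" mentioned above might look like the sketch below. The `spark.kubernetes.scheduler.name` setting and the queue label and `app-id` annotation keys are written as I understand the Spark on Kubernetes docs to describe them, which is an assumption to verify against the released docs. The API server address, queue name, and example jar path are hypothetical placeholders. The script only prints the command rather than submitting anything.

```shell
#!/bin/sh
# Sketch: build a spark-submit command that targets the YuniKorn scheduler.
# All values marked "hypothetical" are placeholders, not taken from this thread.
K8S_MASTER="k8s://https://example-apiserver:6443"   # hypothetical API server
QUEUE="root.default"                                # assumed queue name

# Collect the command as positional parameters (POSIX substitute for an array).
set -- spark-submit \
  --master "$K8S_MASTER" \
  --deploy-mode cluster \
  --conf spark.kubernetes.scheduler.name=yunikorn \
  --conf "spark.kubernetes.driver.label.queue=$QUEUE" \
  --conf "spark.kubernetes.executor.label.queue=$QUEUE" \
  --conf 'spark.kubernetes.driver.annotation.yunikorn.apache.org/app-id={{APP_ID}}' \
  --conf 'spark.kubernetes.executor.annotation.yunikorn.apache.org/app-id={{APP_ID}}' \
  --class org.apache.spark.examples.SparkPi \
  local:///opt/spark/examples/jars/spark-examples.jar   # hypothetical jar path

# Keep the flattened command for inspection and print it instead of running it.
SUBMIT_LINE="$*"
echo "$SUBMIT_LINE"
```

To actually submit, one would run the collected command (`"$@"`) instead of printing it, after adding a container image setting suited to the cluster.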

@yangwwei
Contributor Author

According to apache/yunikorn-site#180 (comment) , I understand the context (including rollback) and let's hold on this until next week (YuniKorn v1.1).

Please take a look at the updated version. I do not think we depend on the 1.0.0 doc link anymore. We just make sure we explicitly set the version in the installation example. For general docs, we can point to https://yunikorn.apache.org.
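
As a rough sketch of what "explicitly set the version in the installation example" can mean, the script below pins the chart to 1.0.0 using Helm's `--version` flag. The Helm repository URL and the `--create-namespace` flag are assumptions that are not quoted in this thread, and the `run` helper with its `DRY_RUN` switch is a hypothetical convenience so the commands can be reviewed before they touch a cluster.

```shell
#!/bin/sh
# Sketch: install a pinned YuniKorn release instead of whatever is latest.
YUNIKORN_VERSION="${YUNIKORN_VERSION:-1.0.0}"   # version named in this thread
DRY_RUN="${DRY_RUN:-1}"                         # 1 = print commands only

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Repository URL is an assumption; check the YuniKorn release docs.
run helm repo add yunikorn https://apache.github.io/yunikorn-release
run helm repo update
run helm install yunikorn yunikorn/yunikorn \
  --namespace yunikorn --create-namespace \
  --version "$YUNIKORN_VERSION"
```

Running it with `DRY_RUN=0` would execute the commands against the current Kubernetes context; bumping the supported release then means changing one variable.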

@dongjoon-hyun
Member

Got it. Let me think about that again from an Apache Spark user's perspective. I believe we can find a sweet spot that satisfies both communities, @yangwwei . Thank you again for all your contribution and collaboration.

@yangwwei
Contributor Author

@dongjoon-hyun circling back on this. Does the latest version look good to you?
Is there anything else you want me to address in this doc? Please let me know, thanks!

Member

@dongjoon-hyun left a comment

+1, LGTM. Thank you, @yangwwei . I also tested all existing tests with YuniKorn scheduler.
Merged to master/3.3.

@dongjoon-hyun
Member

Sorry for being late, @yangwwei . We can move forward more based on this.

dongjoon-hyun pushed a commit that referenced this pull request Sep 1, 2022
### What changes were proposed in this pull request?
Add a section under [customized-kubernetes-schedulers-for-spark-on-kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html#customized-kubernetes-schedulers-for-spark-on-kubernetes) to explain how to run Spark with Apache YuniKorn. This is based on the review comments from #35663.

### Why are the changes needed?
Explain how to run Spark with Apache YuniKorn

### Does this PR introduce _any_ user-facing change?
No

Closes #37622 from yangwwei/SPARK-40187.

Authored-by: Weiwei Yang <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 4b18773)
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun
Member

I created a test suite PR, @yangwwei .

@yangwwei
Contributor Author

yangwwei commented Sep 1, 2022

Hi @dongjoon-hyun, thanks a lot for helping with this.
This is a great community collaboration between YuniKorn and Spark, thank you so much!

@dongjoon-hyun
Member

Thank YOU, @yangwwei .

3 participants