Update AKS gpu cluster setup #49992
Merged
@kevin85421 can you help take a look at this simple doc update?
@jcotant1 can you help take a look?
csivanich reviewed Jan 30, 2025
doc/source/cluster/kubernetes/user-guides/azure-aks-gpu-cluster.md (Outdated)
csivanich reviewed Jan 30, 2025
doc/source/cluster/kubernetes/user-guides/azure-aks-gpu-cluster.md (Outdated)
Thanks a lot for the contribution @anson627, can you fix the following high-level points before we merge the PR?
7c95d06 to 1ae9647
3966a1a to 4f13f02
Signed-off-by: Anson Qian <[email protected]>
4f13f02 to 7260d35
Signed-off-by: Anson Qian <[email protected]>
@pcmoritz @csivanich thanks for getting back to me! All comments addressed.
pcmoritz approved these changes Jan 31, 2025
n30111 pushed a commit to minds-ai/ray that referenced this pull request Jan 31, 2025
<!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. -->

This pull request adds a new user guide for setting up an Azure AKS cluster with GPU nodes specifically for KubeRay, and updates references in the Kubernetes cluster setup documentation.

New Azure AKS GPU cluster setup guide:

* [`doc/source/cluster/kubernetes/user-guides/azure-aks-gpu-cluster.md`](diffhunk://#diff-0b5f6ba4d8b02475f9b0eef738c62ac56e9a782d524b8e38eb4fdf453d283630R1-R51): Added a detailed guide on creating an Azure AKS cluster with GPU nodes for KubeRay, including steps for creating a resource group, creating an AKS cluster, adding a GPU node group, and obtaining kubeconfig.

Documentation updates:

* [`doc/source/cluster/kubernetes/user-guides/k8s-cluster-setup.md`](diffhunk://#diff-4b96da3370400e06b8f96f19d13bfdeb122d56f179b4404ffbf501b17781cb48R11): Added a reference to the new Azure AKS GPU cluster setup guide in the list of available cluster setup guides.
* [`doc/source/cluster/kubernetes/user-guides/k8s-cluster-setup.md`](diffhunk://#diff-4b96da3370400e06b8f96f19d13bfdeb122d56f179b4404ffbf501b17781cb48L29-R31): Updated the section for setting up an AKS cluster to include a reference to the new detailed setup guide.

<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?

Updated based on latest AKS public docs

## Related issue number

## Checks

- [x] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - make sure steps in [quickstart](https://docs.ray.io/en/latest/cluster/kubernetes/getting-started/raycluster-quick-start.html#kuberay-raycluster-quickstart) run without issue on AKS cluster

---------

Signed-off-by: Anson Qian <[email protected]>
Signed-off-by: n3011 <[email protected]>
This pull request adds a new user guide for setting up an Azure AKS cluster with GPU nodes specifically for KubeRay, and updates references in the Kubernetes cluster setup documentation.

New Azure AKS GPU cluster setup guide:

* `doc/source/cluster/kubernetes/user-guides/azure-aks-gpu-cluster.md`: Added a detailed guide on creating an Azure AKS cluster with GPU nodes for KubeRay, including steps for creating a resource group, creating an AKS cluster, adding a GPU node group, and obtaining kubeconfig.

Documentation updates:

* `doc/source/cluster/kubernetes/user-guides/k8s-cluster-setup.md`: Added a reference to the new Azure AKS GPU cluster setup guide in the list of available cluster setup guides.
* `doc/source/cluster/kubernetes/user-guides/k8s-cluster-setup.md`: Updated the section for setting up an AKS cluster to include a reference to the new detailed setup guide.

## Why are these changes needed?

Updated based on latest AKS public docs

## Related issue number

## Checks

- [x] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [x] I've made sure the tests are passing.
- Testing Strategy
   - make sure steps in quickstart run without issue on AKS cluster
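
For orientation, the four steps the description lists (resource group, AKS cluster, GPU node group, kubeconfig) might look roughly like the sketch below. This is not the content of the merged guide: the resource group, region, cluster name, node pool name, and VM size are all placeholder assumptions, and the flags shown are only a minimal subset of what the Azure CLI accepts. The script prints the commands by default and only executes them when `DRY_RUN=0` is set.

```shell
# Sketch only: every name, region, and VM size here is a placeholder
# assumption, not a value taken from the merged guide.
set -eu

RESOURCE_GROUP="${RESOURCE_GROUP:-kuberay-rg}"
LOCATION="${LOCATION:-eastus}"
CLUSTER_NAME="${CLUSTER_NAME:-kuberay-gpu-cluster}"
GPU_POOL="${GPU_POOL:-gpupool}"
GPU_VM_SIZE="${GPU_VM_SIZE:-Standard_NC4as_T4_v3}"

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

setup_aks_gpu_cluster() {
  # 1. Create a resource group to hold the cluster.
  run az group create --name "$RESOURCE_GROUP" --location "$LOCATION"

  # 2. Create the AKS cluster with a single default (non-GPU) node.
  run az aks create --resource-group "$RESOURCE_GROUP" --name "$CLUSTER_NAME" \
    --node-count 1 --generate-ssh-keys

  # 3. Add a node pool backed by a GPU VM size.
  run az aks nodepool add --resource-group "$RESOURCE_GROUP" \
    --cluster-name "$CLUSTER_NAME" --name "$GPU_POOL" \
    --node-count 1 --node-vm-size "$GPU_VM_SIZE"

  # 4. Merge the cluster credentials into the local kubeconfig.
  run az aks get-credentials --resource-group "$RESOURCE_GROUP" --name "$CLUSTER_NAME"
}

setup_aks_gpu_cluster
```

Running it as-is only echoes the four `az` commands, which makes it easy to review them before running again with `DRY_RUN=0` against a real subscription.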