Make the main Dockerfile to work on aarch64 #7721
Conversation
Summary of Changes
Hello @kyleliang-nv, I'm Gemini Code Assist[^1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the Dockerfile's compatibility by making it platform-agnostic, specifically enabling builds for both x86 and aarch64 architectures. My changes primarily involve replacing hardcoded x86_64 references with dynamic $(uname -m) commands or conditional logic. This ensures that architecture-specific dependencies, symlinks, and package sources are correctly resolved during the Docker build process, greatly improving the portability of the build environment.
Highlights
- **Platform Agnostic Symlinks**: I've modified the `libmlx5.so` symlink creation in the Dockerfile to dynamically use `$(uname -m)` instead of a hardcoded `x86_64` path. This ensures the symlink is correctly created regardless of the underlying architecture (a sketch of this pattern follows the list).
- **Dynamic SGLang Kernel Wheel Installation**: The SGLang kernel wheel download URL has been updated to incorporate `$(uname -m)` for the architecture suffix. This allows the Docker build to fetch the correct pre-built wheel for either x86 or aarch64 targets.
- **Multi-Architecture NVIDIA Repositories**: I've adjusted the `apt` source list and `apt-key` commands for NVIDIA developer tools and CUDA repositories. These now conditionally select `amd64` or `arm64` based on the output of `uname -m`, ensuring the correct repository is used for package management on different architectures.
- **CMake Installation for Multiple Architectures**: The CMake download URL and extraction paths have been changed to dynamically include `$(uname -m)`. This enables the Dockerfile to download and install the appropriate CMake binary for the detected build architecture, enhancing cross-platform compatibility.
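A minimal sketch of the pattern these changes follow; the library path, package-architecture mapping, and final `echo` are illustrative placeholders, not the exact lines changed in this PR:

```dockerfile
# Illustrative sketch only; paths and names are placeholders.
# Resolve the machine architecture once, create the libmlx5.so symlink under
# the matching multiarch directory, and map uname output to the Debian arch
# name used by apt sources.
RUN ARCH="$(uname -m)" && \
    ln -sf "/usr/lib/${ARCH}-linux-gnu/libmlx5.so.1" "/usr/lib/${ARCH}-linux-gnu/libmlx5.so" && \
    if [ "${ARCH}" = "aarch64" ]; then DEB_ARCH="arm64"; else DEB_ARCH="amd64"; fi && \
    echo "Building for ${ARCH} (Debian arch: ${DEB_ARCH})"
```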
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
[^1]: Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.
Code Review
This pull request updates the Dockerfile to support aarch64 builds by replacing hardcoded x86_64 values with dynamic architecture detection. A critical issue was identified where the architecture check for the NVIDIA CUDA repository key uses arm64 instead of sbsa for aarch64. Suggestions were provided to improve the maintainability of the Dockerfile by using variables to reduce repetitive code.
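For reference, a hedged sketch of what the suggested fix might look like; the Ubuntu release and keyring package version in the URL are assumptions, not lines from this PR:

```dockerfile
# Hedged sketch of the review's suggestion: NVIDIA's CUDA apt repos use
# "sbsa" (server-class Arm), not "arm64", for aarch64 hosts. The Ubuntu
# version and keyring version below are illustrative assumptions.
RUN if [ "$(uname -m)" = "aarch64" ]; then CUDA_REPO_ARCH="sbsa"; else CUDA_REPO_ARCH="x86_64"; fi && \
    curl -fsSL "https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/${CUDA_REPO_ARCH}/cuda-keyring_1.1-1_all.deb" \
        -o /tmp/cuda-keyring.deb && \
    dpkg -i /tmp/cuda-keyring.deb && rm /tmp/cuda-keyring.deb
```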
ishandhanani left a comment:
We should add this into CI and allow users to pull either platform from the lmsys docker repository. WDYT @zhyncs?
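For illustration, a multi-platform CI build could look roughly like the hypothetical invocation below; the image name, tag, and pipeline setup are placeholders, not the project's actual workflow:

```bash
# Hypothetical CI invocation; image name and tag are placeholders.
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -f docker/Dockerfile \
  -t lmsys/sglang:latest \
  --push .
```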
Force-pushed 9e7dee6 to 4af8600
Force-pushed 4ebda8c to f817e6e
docker/Dockerfile (outdated)
"Blackwell" encompasses more than sm_100. (Same comment below for TORCH_CUDA_ARCH_LIST.) Maybe we should be more expansive in the list here?
Would `CMAKE_CUDA_ARCHITECTURES=100;120` be sufficient?
And for `TORCH_CUDA_ARCH_LIST`, would `TORCH_CUDA_ARCH_LIST="10.0 12.0"` be sufficient?
Since this is a Blackwell build, I'm wondering whether I should include sm90.
FP8 or FP4 kernels add complexity to the explanation because of arch- and/or family-conditional compilation.
For the kernels that aren't FP8 or FP4, 10.0 + 12.0 together cover a lot of ground for the Blackwell families, yes. Thor is the other one; its number had been 10.1, but it will be renumbered as SM 11.0 in CUDA 13 shortly. (We could perhaps skip Thor for the moment and expect to add it later.)
As to "what about 9.0": my best advice is to have one build that serves all the primary targets, so at least here that's Hopper and Blackwell together, rather than having the "blackwell" build not be the default.
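A hedged sketch of what a combined Hopper + Blackwell configuration could look like in the Dockerfile; the ARG names and defaults are assumptions drawn from this discussion, not the values the PR settles on:

```dockerfile
# Illustrative only; ARG names and defaults are assumptions.
# 9.0 = Hopper; 10.0 and 12.0 = the Blackwell families discussed above.
ARG CMAKE_CUDA_ARCHITECTURES="90;100;120"
ARG TORCH_CUDA_ARCH_LIST="9.0 10.0 12.0"
ENV CMAKE_CUDA_ARCHITECTURES="${CMAKE_CUDA_ARCHITECTURES}" \
    TORCH_CUDA_ARCH_LIST="${TORCH_CUDA_ARCH_LIST}"
```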
Thanks for the explanation, Cliff. I'll work on these changes.
Claiming it to be Hopper and Blackwell is a bit challenging right now because of the DeepEP library, which explicitly checks that TORCH_CUDA_ARCH_LIST is exactly "9.0" and otherwise disables some features:
https://github.com/deepseek-ai/DeepEP/blob/main/setup.py#L70-L73
Not sure how much SGLang depends on this specific feature of the DeepEP package.
Force-pushed 390bf94 to bc6ffbe
Force-pushed bf6d118 to 8dcbbd1
Force-pushed 3427207 to 63c339c
Force-pushed 29f5ee8 to 6c51a23
Should we call this `blackwell-cu128`?
I think that's more consistent, but it will render the `blackwell` tag obsolete. Or, we can tag it as `blackwell-cu128` and instead add an additional `blackwell` tag for `blackwell-cu129`, so that `blackwell` is the one that has multi-platform support (with cu129).
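For illustration, the aliasing could be expressed roughly as below once the per-CUDA-version images are pushed; image names and tags are placeholders:

```bash
# Hypothetical naming scheme; image names and tags are placeholders.
# Alias the multi-platform cu129 image as the plain "blackwell" tag.
docker buildx imagetools create -t lmsys/sglang:blackwell lmsys/sglang:blackwell-cu129
```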
ishandhanani left a comment:
Small comment on tag but LGTM
Force-pushed e3f56bc to 1af3696
Force-pushed b90f009 to c846af6
Force-pushed c846af6 to 63072e8
Force-pushed b18d7ed to ac12dc3
Force-pushed 46744b3 to 3e85f4b
Closing this PR since #10705 is already merged in.
Motivation
The current Dockerfile does not build for aarch64. This PR makes the main Dockerfile platform agnostic, which allows people to build it for both x86 and aarch64 targets.
Modifications
Accuracy Tests
Checklist