docs: update installation #7366
```diff
@@ -16,7 +16,7 @@ uv pip install "sglang[all]>=0.4.7.post1"
 
 **Quick Fixes to Common Problems**
 
-- SGLang currently uses torch 2.6, so you need to install flashinfer for torch 2.6. If you want to install flashinfer separately, please refer to [FlashInfer installation doc](https://docs.flashinfer.ai/installation.html). Please note that the FlashInfer pypi package is called `flashinfer-python` instead of `flashinfer`.
+- SGLang currently uses torch 2.7.1, so you need to install flashinfer for torch 2.7.1. If you want to install flashinfer separately, please refer to [FlashInfer installation doc](https://docs.flashinfer.ai/installation.html). Please note that the FlashInfer pypi package is called `flashinfer-python` instead of `flashinfer`.
 
 - If you encounter `OSError: CUDA_HOME environment variable is not set`. Please set it to your CUDA install root with either of the following solutions:
```
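For readers hitting the `CUDA_HOME` error mentioned in the context line above, a minimal sketch of the usual fix (the `/usr/local/cuda` path is a typical default and may differ on your system):

```bash
# Point CUDA_HOME at the CUDA toolkit root for the current shell only
export CUDA_HOME=/usr/local/cuda

# Or persist it across sessions (assumes a bash login shell)
echo 'export CUDA_HOME=/usr/local/cuda' >> ~/.bashrc
source ~/.bashrc
```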
````diff
@@ -34,7 +34,7 @@ pip install --upgrade pip
 pip install -e "python[all]"
 ```
 
-Note: SGLang currently uses torch 2.6, so you need to install flashinfer for torch 2.6. If you want to install flashinfer separately, please refer to [FlashInfer installation doc](https://docs.flashinfer.ai/installation.html).
+Note: SGLang currently uses torch 2.7.1, so you need to install flashinfer for torch 2.7.1. If you want to install flashinfer separately, please refer to [FlashInfer installation doc](https://docs.flashinfer.ai/installation.html).
 
 If you want to develop SGLang, it is recommended to use docker. Please refer to [setup docker container](https://github.com/sgl-project/sglang/blob/main/docs/references/development_guide_using_docker.md#setup-docker-container) for guidance. The docker image is `lmsysorg/sglang:dev`.
````
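As a quick sanity check after the torch bump, something like the following confirms which torch and FlashInfer builds are actually installed (plain `pip`/`python` commands, not part of the diff itself):

```bash
# Verify the installed torch version matches what SGLang now expects (2.7.1)
python -c "import torch; print(torch.__version__)"

# Inspect the installed FlashInfer package
# (note: the PyPI package is named flashinfer-python, not flashinfer)
pip show flashinfer-python
```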
```diff
@@ -162,4 +162,4 @@ sky status --endpoint 30000 sglang
 - [FlashInfer](https://github.com/flashinfer-ai/flashinfer) is the default attention kernel backend. It only supports sm75 and above. If you encounter any FlashInfer-related issues on sm75+ devices (e.g., T4, A10, A100, L4, L40S, H100), please switch to other kernels by adding `--attention-backend triton --sampling-backend pytorch` and open an issue on GitHub.
 - If you only need to use OpenAI models with the frontend language, you can avoid installing other dependencies by using `pip install "sglang[openai]"`.
 - The language frontend operates independently of the backend runtime. You can install the frontend locally without needing a GPU, while the backend can be set up on a GPU-enabled machine. To install the frontend, run `pip install sglang`, and for the backend, use `pip install sglang[srt]`. `srt` is the abbreviation of SGLang runtime.
-- To reinstall flashinfer locally, use the following command: `pip install "flashinfer-python==0.2.5" -i https://flashinfer.ai/whl/cu124/torch2.6 --force-reinstall --no-deps` and then delete the cache with `rm -rf ~/.cache/flashinfer`.
+- To reinstall flashinfer locally, use the following command: `pip3 install --upgrade flashinfer-python --force-reinstall --no-deps` and then delete the cache with `rm -rf ~/.cache/flashinfer`.
```
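For the fallback described in the first bullet of the hunk above, the flags are passed at server launch; a hypothetical invocation (the model path is a placeholder, not taken from this PR) could look like:

```bash
# Switch off FlashInfer in favor of the Triton attention kernel and the
# PyTorch sampling backend (model path is illustrative only)
python -m sglang.launch_server \
  --model-path meta-llama/Llama-3.1-8B-Instruct \
  --attention-backend triton \
  --sampling-backend pytorch
```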
Contributor

This line introduces `pip3`, while the rest of this document consistently uses `pip`. Although the two usually resolve to the same installer, mixing them is confusing. To improve clarity and consistency for users: could you clarify the reasoning for using `pip3` here, or switch back to `pip`?

Suggested change

`pip3 install --upgrade flashinfer-python --force-reinstall --no-deps` → `pip install --upgrade flashinfer-python --force-reinstall --no-deps`
The previous command for reinstalling `flashinfer-python` specified a particular version (0.2.5) and a direct index URL (https://flashinfer.ai/whl/cu124/torch2.6), which helped ensure that a specific pre-built wheel compatible with a given CUDA/Torch combination was used.

The new command, `pip3 install --upgrade flashinfer-python --force-reinstall --no-deps`, is more generic and relies on `flashinfer-python` being available on PyPI with wheels that `pip` can correctly resolve for the user's environment (e.g., Torch 2.7.1 and their specific CUDA version).

Could you confirm if `flashinfer-python`'s distribution on PyPI now robustly provides appropriate pre-built wheels for common CUDA and Torch 2.7.1 combinations? If not, removing the specific index URL might lead to users unintentionally building `flashinfer-python` from source if a suitable wheel isn't found on PyPI.

Ensuring users can easily reinstall the correct, optimized `flashinfer-python` wheel is important for performance and troubleshooting.
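To probe the reviewer's concern locally, one way to see which artifact pip would resolve from PyPI without installing anything (a diagnostic sketch, not part of this PR; the download directory is arbitrary):

```bash
# Download (but do not install) whatever pip resolves for this environment;
# a .whl file means a pre-built wheel was found, while a .tar.gz means pip
# would fall back to building flashinfer-python from source.
pip download flashinfer-python --no-deps -d /tmp/flashinfer-check
ls /tmp/flashinfer-check
```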