docs/start/install.md (6 changes: 3 additions & 3 deletions)

@@ -16,7 +16,7 @@ uv pip install "sglang[all]>=0.4.7.post1"

**Quick Fixes to Common Problems**

-- SGLang currently uses torch 2.6, so you need to install flashinfer for torch 2.6. If you want to install flashinfer separately, please refer to [FlashInfer installation doc](https://docs.flashinfer.ai/installation.html). Please note that the FlashInfer pypi package is called `flashinfer-python` instead of `flashinfer`.
+- SGLang currently uses torch 2.7.1, so you need to install flashinfer for torch 2.7.1. If you want to install flashinfer separately, please refer to [FlashInfer installation doc](https://docs.flashinfer.ai/installation.html). Please note that the FlashInfer pypi package is called `flashinfer-python` instead of `flashinfer`.

- If you encounter `OSError: CUDA_HOME environment variable is not set`, please set it to your CUDA install root with either of the following solutions:

@@ -34,7 +34,7 @@ pip install --upgrade pip
pip install -e "python[all]"
```

-Note: SGLang currently uses torch 2.6, so you need to install flashinfer for torch 2.6. If you want to install flashinfer separately, please refer to [FlashInfer installation doc](https://docs.flashinfer.ai/installation.html).
+Note: SGLang currently uses torch 2.7.1, so you need to install flashinfer for torch 2.7.1. If you want to install flashinfer separately, please refer to [FlashInfer installation doc](https://docs.flashinfer.ai/installation.html).

If you want to develop SGLang, it is recommended to use docker. Please refer to [setup docker container](https://github.com/sgl-project/sglang/blob/main/docs/references/development_guide_using_docker.md#setup-docker-container) for guidance. The docker image is `lmsysorg/sglang:dev`.

@@ -162,4 +162,4 @@ sky status --endpoint 30000 sglang
- [FlashInfer](https://github.com/flashinfer-ai/flashinfer) is the default attention kernel backend. It only supports sm75 and above. If you encounter any FlashInfer-related issues on sm75+ devices (e.g., T4, A10, A100, L4, L40S, H100), please switch to other kernels by adding `--attention-backend triton --sampling-backend pytorch` and open an issue on GitHub.
- If you only need to use OpenAI models with the frontend language, you can avoid installing other dependencies by using `pip install "sglang[openai]"`.
- The language frontend operates independently of the backend runtime. You can install the frontend locally without needing a GPU, while the backend can be set up on a GPU-enabled machine. To install the frontend, run `pip install sglang`, and for the backend, use `pip install sglang[srt]`. `srt` is the abbreviation of SGLang runtime.
-- To reinstall flashinfer locally, use the following command: `pip install "flashinfer-python==0.2.5" -i https://flashinfer.ai/whl/cu124/torch2.6 --force-reinstall --no-deps` and then delete the cache with `rm -rf ~/.cache/flashinfer`.
+- To reinstall flashinfer locally, use the following command: `pip3 install --upgrade flashinfer-python --force-reinstall --no-deps` and then delete the cache with `rm -rf ~/.cache/flashinfer`.
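
For context on the FlashInfer fallback advice in the hunk above (`--attention-backend triton --sampling-backend pytorch`): applied to a server launch, the flags look roughly like the sketch below. The model path and port are illustrative placeholders, not taken from the diff:

```bash
# Fall back from FlashInfer to the Triton attention kernel and the PyTorch
# sampling kernel (model path and port are placeholders).
python -m sglang.launch_server \
  --model-path meta-llama/Llama-3.1-8B-Instruct \
  --port 30000 \
  --attention-backend triton \
  --sampling-backend pytorch
```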
Contributor comment (severity: medium):

The previous command for reinstalling `flashinfer-python` specified a particular version (0.2.5) and a direct index URL (https://flashinfer.ai/whl/cu124/torch2.6), which helped ensure a specific pre-built wheel compatible with a given CUDA/Torch combination was used.

The new command, `pip3 install --upgrade flashinfer-python --force-reinstall --no-deps`, is more generic and relies on `flashinfer-python` being available on PyPI with wheels that pip can correctly resolve for the user's environment (e.g., Torch 2.7.1 and their specific CUDA version).

Could you confirm if `flashinfer-python`'s distribution on PyPI now robustly provides appropriate pre-built wheels for various common CUDA and Torch 2.7.1 combinations? If not, removing the specific index URL might lead to users unintentionally:

  1. Failing to install `flashinfer-python` if a suitable wheel isn't found on PyPI.
  2. Building from source, which can be slow and error-prone if the environment isn't fully set up for compilation.
  3. Installing a generic or less optimized version.

Ensuring users can easily reinstall the correct, optimized `flashinfer-python` wheel is important for performance and troubleshooting.
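
For illustration, a pinned reinstall following the removed line's pattern would look like the sketch below. The version pin and index URL are copied verbatim from the old command and would need updating for torch 2.7.1 per the FlashInfer installation doc:

```bash
# Pattern of the old pinned reinstall; the ==0.2.5 pin and the
# cu124/torch2.6 index URL come from the removed line and must be
# updated for torch 2.7.1 per https://docs.flashinfer.ai/installation.html.
pip install "flashinfer-python==0.2.5" \
  -i https://flashinfer.ai/whl/cu124/torch2.6 \
  --force-reinstall --no-deps
rm -rf ~/.cache/flashinfer
```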

Contributor comment (severity: medium):

This line introduces `pip3`, while many other commands in this document for installing Python packages use `pip` (e.g., lines 12, 33, 34, 48, 52, 163, 164).

While `pip3` is generally more explicit and safer on systems with multiple Python versions (ensuring Python 3's pip is used, which SGLang almost certainly requires), this change makes the usage inconsistent within this document.

To improve clarity and consistency for users:

  1. If `pip3` is preferred for robustness, consider updating all other `pip` invocations in this document to `pip3` (or `python3 -m pip`).
  2. If `pip` is the intended general command throughout this document (assuming it typically resolves to Python 3's pip in target environments), then this specific instance should also use `pip` to maintain consistency.

Could you clarify the reasoning for using `pip3` specifically here, or consider making the usage consistent throughout the installation guide?

Suggested change:

-- To reinstall flashinfer locally, use the following command: `pip3 install --upgrade flashinfer-python --force-reinstall --no-deps` and then delete the cache with `rm -rf ~/.cache/flashinfer`.
+- To reinstall flashinfer locally, use the following command: `pip install --upgrade flashinfer-python --force-reinstall --no-deps` and then delete the cache with `rm -rf ~/.cache/flashinfer`.
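
A third option, sketched below, sidesteps the `pip` vs `pip3` ambiguity entirely by invoking pip through the interpreter (not part of the suggestion above):

```bash
# Use the pip module of whichever python3 is on PATH, avoiding any
# ambiguity between `pip` and `pip3` aliases.
python3 -m pip install --upgrade flashinfer-python --force-reinstall --no-deps
rm -rf ~/.cache/flashinfer
```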
