chore: vllm 0.10.1.1 #2641
Conversation
/ok to test 43d9217
Walkthrough
Bumps vLLM from 0.10.1 to 0.10.1.1 across the container build, install script, and Python optional dependency. Updates the VLLM_REF commit hash and the corresponding precompiled wheel URL. No logic or control-flow changes.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
container/deps/vllm/install_vllm.sh (1)
89-96: Bash usage help prints a literal "f" due to Python-style f-strings.
The stray f prefix is printed verbatim and makes the CLI help look unpolished. Replace these lines with standard bash echo.
Apply this diff:
```diff
-    echo f"  --vllm-ref REF            Git reference to checkout (default: ${VLLM_REF})"
-    echo f"  --max-jobs NUM            Maximum number of parallel jobs (default: ${MAX_JOBS})"
+    echo "  --vllm-ref REF            Git reference to checkout (default: ${VLLM_REF})"
+    echo "  --max-jobs NUM            Maximum number of parallel jobs (default: ${MAX_JOBS})"
     echo "  --arch ARCH               Architecture (amd64|arm64, default: auto-detect)"
-    echo f"  --installation-dir DIR    Directory to install vllm (default: ${INSTALLATION_DIR})"
-    echo f"  --deepgemm-ref REF        Git reference for DeepGEMM (default: ${DEEPGEMM_REF})"
-    echo f"  --flashinf-ref REF        Git reference for Flash Infer (default: ${FLASHINF_REF})"
-    echo f"  --torch-backend BACKEND   Torch backend to use (default: ${TORCH_BACKEND})"
+    echo "  --installation-dir DIR    Directory to install vllm (default: ${INSTALLATION_DIR})"
+    echo "  --deepgemm-ref REF        Git reference for DeepGEMM (default: ${DEEPGEMM_REF})"
+    echo "  --flashinf-ref REF        Git reference for Flash Infer (default: ${FLASHINF_REF})"
+    echo "  --torch-backend BACKEND   Torch backend to use (default: ${TORCH_BACKEND})"
```
🧹 Nitpick comments (2)
container/deps/vllm/install_vllm.sh (2)
23-27: Add a guard for precompiled vLLM wheel availability
We've verified that the current VLLM_PRECOMPILED_WHEEL_LOCATION resolves successfully (HTTP 200), but to avoid lengthy source builds if a future wheel URL goes missing, it's still worthwhile to add a fast HEAD check with a fallback.
• File: container/deps/vllm/install_vllm.sh
• Insert immediately after the VLLM_PRECOMPILED_WHEEL_LOCATION=... line:
```diff
 VLLM_PRECOMPILED_WHEEL_LOCATION="https://vllm-wheels.s3.us-west-2.amazonaws.com/${VLLM_REF}/vllm-0.10.1.1-cp38-abi3-manylinux1_x86_64.whl"
 VLLM_GIT_URL="https://github.com/vllm-project/vllm.git"
+
+# Validate the precompiled wheel URL early to avoid long source builds if it's missing
+if command -v curl >/dev/null 2>&1; then
+    if ! curl -fsI "${VLLM_PRECOMPILED_WHEEL_LOCATION}" >/dev/null; then
+        echo "Warning: Precompiled vLLM wheel not found at ${VLLM_PRECOMPILED_WHEEL_LOCATION}. Falling back to build-from-source."
+        unset VLLM_PRECOMPILED_WHEEL_LOCATION
+    fi
+fi
```
137-141: ARM64 compatibility – build vLLM from source
On aarch64, vLLM's prebuilt wheels target x86_64 and won't work with torch==2.7.1+cu128/torchvision==0.22.1. To ensure the pinned versions function:
• Detect the ARM64 architecture in container/deps/vllm/install_vllm.sh (around lines 137-141).
• After installing torch==2.7.1+cu128 and torchvision==0.22.1, invoke vLLM's source-build flow (e.g. use_existing_torch.py or pip install -e .) so vLLM compiles against the installed PyTorch.
• Add a clear comment or conditional branch that points users to the vLLM ARM64 build docs: https://docs.vllm.ai/en/stable/getting_started/installation.html#build-from-source
Example snippet update:
```diff
 if ! uv pip install torch==2.7.1+cu128 torchaudio==2.7.1 torchvision==0.22.1 --index-url https://download.pytorch.org/whl; then
     echo "Pinned PyTorch install failed"
     exit 1
 fi
+# On aarch64, compile vLLM from source to link against this torch install:
+# python use_existing_torch.py  # see vLLM ARM64 build docs
```
This ensures anyone running on ARM64 knows to rebuild vLLM for compatibility rather than relying on unavailable prebuilt wheels.
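Building on that suggestion, here is a minimal sketch of such a conditional branch, assuming uname-based architecture detection and a checked-out vllm source tree; the exact variable handling and install flags are illustrative, not the script's actual ones:
```bash
# Illustrative only: on aarch64, rebuild vLLM against the already-installed PyTorch
# instead of relying on the x86_64-only precompiled wheel.
if [ "$(uname -m)" = "aarch64" ]; then
    echo "aarch64 detected: prebuilt vLLM wheels target x86_64; building from source."
    echo "See https://docs.vllm.ai/en/stable/getting_started/installation.html#build-from-source"
    unset VLLM_PRECOMPILED_WHEEL_LOCATION   # make sure the x86_64 wheel is not picked up
    python use_existing_torch.py            # reuse the pinned torch==2.7.1+cu128 install
    uv pip install -e . --no-build-isolation
fi
```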
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
- container/Dockerfile.vllm (1 hunks)
- container/deps/vllm/install_vllm.sh (1 hunks)
- pyproject.toml (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-18T16:52:15.659Z
Learnt from: nnshah1
PR: ai-dynamo/dynamo#2489
File: container/deps/vllm/install_vllm.sh:151-152
Timestamp: 2025-08-18T16:52:15.659Z
Learning: The VLLM_PRECOMPILED_WHEEL_LOCATION environment variable, when exported, automatically triggers vLLM's build system to use the precompiled wheel instead of building from source, even when using standard `uv pip install .` commands in container/deps/vllm/install_vllm.sh.
Applied to files:
container/deps/vllm/install_vllm.sh
📚 Learning: 2025-08-18T16:52:15.659Z
Learnt from: nnshah1
PR: ai-dynamo/dynamo#2489
File: container/deps/vllm/install_vllm.sh:151-152
Timestamp: 2025-08-18T16:52:15.659Z
Learning: The VLLM_PRECOMPILED_WHEEL_LOCATION environment variable is an official vLLM environment variable that, when exported, automatically triggers vLLM's build system to use the specified precompiled wheel instead of building from source. This works even with standard `uv pip install .` commands without requiring explicit reference to the variable in the install command. The vLLM build system internally detects and uses this environment variable.
Applied to files:
container/deps/vllm/install_vllm.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (5)
pyproject.toml (1)
55-59: Consistency check passed – all vLLM references updated to 0.10.1.1
No stale vllm==0.10.1 entries remain; all references point to 0.10.1.1:
- pyproject.toml (line 58)
- container/deps/vllm/install_vllm.sh (line 26)
- container/Dockerfile.vllm (line 20)
No further action required.
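One way to re-run this consistency check locally (a sketch; assumes a POSIX shell at the repository root):
```bash
# Any line printed here would be a stale 0.10.1 pin that still needs updating.
grep -rn "0\.10\.1" pyproject.toml container/Dockerfile.vllm container/deps/vllm/install_vllm.sh \
  | grep -v "0\.10\.1\.1"
```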
container/Dockerfile.vllm (2)
19-21: Release notes comment updated — LGTM. The comment and URL now reference v0.10.1.1 and align with the ARG. No functional impact.
16-16: Confirmed: VLLM_REF correctly pins v0.10.1.1
The ARG VLLM_REF="1da94e673c257373280026f75ceb4effac80e892" on line 16 of container/Dockerfile.vllm exactly matches the v0.10.1.1 tag (pointing to commit 1da94e673c257373280026f75ceb4effac80e892, tagged Aug 20, 2025). No changes needed.
• container/Dockerfile.vllm: line 16 — ARG VLLM_REF="1da94e673c257373280026f75ceb4effac80e892"
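For anyone re-verifying the pin, a quick check (a sketch; requires network access to github.com):
```bash
# Resolve the v0.10.1.1 tag on the vLLM remote; the dereferenced ^{} entry is the
# commit the tag points at and should equal the VLLM_REF value in the Dockerfile.
git ls-remote https://github.com/vllm-project/vllm.git "refs/tags/v0.10.1.1*"
```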
container/deps/vllm/install_vllm.sh (2)
160-166: Relying on VLLM_PRECOMPILED_WHEEL_LOCATION is correct per vLLM behavior. Exporting the env var before uv pip install is sufficient; the vLLM build picks it up automatically. Thanks for keeping this aligned with the documented behavior.
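A minimal sketch of that pattern, assuming VLLM_REF is already set and the vllm source tree is checked out (the exact invocation in install_vllm.sh may differ):
```bash
# vLLM's build system detects this official env var and installs the precompiled
# wheel instead of compiling from source, even with a plain `uv pip install .`.
export VLLM_PRECOMPILED_WHEEL_LOCATION="https://vllm-wheels.s3.us-west-2.amazonaws.com/${VLLM_REF}/vllm-0.10.1.1-cp38-abi3-manylinux1_x86_64.whl"
cd vllm
uv pip install .
```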
154-159: Confirmed: openai==1.99.9 is available on PyPI
openai version 1.99.9 was published on PyPI on August 12, 2025 and can be installed directly, so pinning this exact version in container/deps/vllm/install_vllm.sh is safe. (pypi.org)
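A quick way to re-check the pin without installing anything (a sketch; `pip index` requires a reasonably recent pip, and the PyPI JSON API can be queried with plain curl):
```bash
# List the versions PyPI currently offers for openai; 1.99.9 should be present.
pip index versions openai
# Or hit the release's JSON endpoint directly; a 200 response means the exact version exists.
curl -fsS https://pypi.org/pypi/openai/1.99.9/json >/dev/null && echo "openai 1.99.9 is on PyPI"
```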
/ok to test 30a70d0
Signed-off-by: Hannah Zhang <[email protected]>
Signed-off-by: Jason Zhou <[email protected]>
Signed-off-by: Krishnan Prashanth <[email protected]>
Signed-off-by: nnshah1 <[email protected]>
Overview:
Details:
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit