Fix vllm-qwen35: switch to cu130-nightly for SM 12.1 compat#10
Conversation
The previous qwen3_5-cu130 image (02-23 build) crashed with a Triton kernel error on GB10 (SM 12.1). The cu130-nightly image (03-01 build, commit afd089f2) includes fixes for both the Triton issue and the RMSNormGated bug (vllm-project/vllm#35423). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 WalkthroughWalkthroughvLLMのQwen3.5-35B-A3B-FP8イメージリファレンスをvllm/vllm-openai:cu130-nightly に更新しました。NG C 26.01互換性、cu130ビルド問題、TRITON Fp8 MoEバックエンド自動選択、ツール呼び出し機能、メモリ利用率設定などに関する説明を追加しました。 Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~5 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review please |
|
🧠 Learnings used✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backends/vllm/compose.yml`:
- Line 68: Replace the mutable image tag "vllm/vllm-openai:cu130-nightly" with
the immutable digest form to ensure reproducibility; update the image reference
in compose.yml from "vllm/vllm-openai:cu130-nightly" to
"vllm/vllm-openai@sha256:44486041d89ec04a2184e6136e9472dd2d8f19794f47a98e0ea7aed71d90a6c3"
so Docker Compose pulls the exact image digest rather than a moving tag.
Summary
vllm/vllm-openai:qwen3_5-cu130tovllm/vllm-openai:cu130-nightlyqwen3_5-cu130image (02-23 build) crashed with a Triton kernel error on GB10 (SM 12.1)cu130-nightlyimage (03-01 build, commit afd089f2) includes fixes for both the Triton issue and the RMSNormGated bug (vllm-project/vllm#35423)Verification
docker compose --profile qwen35 up— started without crash/v1/chat/completionsPOST — correct response with reasoning separation🤖 Generated with Claude Code