docs(DeepSeek-V4): verify H200 Pro max-throughput recipe by yhyang201 · Pull Request #23726 · sgl-project/sglang

yhyang201 · 2026-04-25T17:39:38Z

Summary

Update H200 Pro (1.6T) max-throughput recipe parameters to match verified 2-node (16 GPU) deployment
DISPATCH_TOKENS: 256 → 128
--max-running-requests: 256 → 64
--mem-fraction-static: 0.82 → 0.875
Remove --cuda-graph-max-bs 128 (not needed for H200 big)
Mark h200|big|max-throughput as verified
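
The parameter changes listed above can be captured as a before/after comparison. The following is a minimal sketch, not code from the PR: the `before` and `after` objects and the `describeChanges` helper are hypothetical names introduced for illustration, and `DISPATCH_TOKENS` is used exactly as abbreviated in the summary. Only the values come from the PR description.

```javascript
// Hypothetical snapshot of the h200|big|max-throughput parameters before and
// after this PR. Keys mirror the names used in the summary; the object layout
// is an illustration, not the recipe generator's real data structure.
const before = {
  DISPATCH_TOKENS: 256,
  "--max-running-requests": 256,
  "--mem-fraction-static": 0.82,
  "--cuda-graph-max-bs": 128,
};

const after = {
  DISPATCH_TOKENS: 128,
  "--max-running-requests": 64,
  "--mem-fraction-static": 0.875,
  // --cuda-graph-max-bs is intentionally absent: the PR removes it.
};

// Return one human-readable line per parameter that differs between the two
// snapshots, including parameters that were added or removed.
function describeChanges(oldParams, newParams) {
  const keys = new Set([...Object.keys(oldParams), ...Object.keys(newParams)]);
  const lines = [];
  for (const key of keys) {
    const was = oldParams[key];
    const now = newParams[key];
    if (was === now) continue;
    if (now === undefined) lines.push(`${key}: ${was} → (removed)`);
    else if (was === undefined) lines.push(`${key}: (added) → ${now}`);
    else lines.push(`${key}: ${was} → ${now}`);
  }
  return lines;
}

const changes = describeChanges(before, after);
console.log(changes.join("\n"));
```

Keeping the two snapshots side by side makes it easy to confirm that a recipe edit touches only the intended parameters.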

Test plan

Verified on 2-node H200 cluster (Ion-5 + Ion-6, 16 GPU total)
Existing H200 small recipes unaffected (separate code path)
Other platform recipes unaffected

🤖 Generated with Claude Code

Update H200 big (Pro 1.6T) max-throughput parameters to match verified 2-node deployment:
- DISPATCH_TOKENS: 256 → 128
- --max-running-requests: 256 → 64
- --mem-fraction-static: 0.82 → 0.875
- Remove --cuda-graph-max-bs 128 (not needed)

Mark h200|big|max-throughput as verified.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

gemini-code-assist

Code Review

This pull request adds the 'h200|big|max-throughput' configuration to the DeepSeek-V4 deployment documentation, updating environment variables and CLI flags for H200 hardware. Specifically, it adjusts memory fraction and request limits for the 'big' variant. A review comment suggests consolidating the conditional logic for H200-specific overrides to enhance code readability and maintainability.

gemini-code-assist · 2026-04-25T17:44:48Z

+      if (isBig && hardware === "h200") {
+        flags.push("  --mem-fraction-static 0.875");
+      } else if (isBig) {
+        flags.push("  --mem-fraction-static 0.82");
+      }
+      if (hardware === "h200" && isBig) {
+        flags.push("  --max-running-requests 64");
+      } else if (hardware === "h200") {
        flags.push("  --cuda-graph-max-bs 128");
        flags.push("  --max-running-requests 256");


The logic for adding flags in the max-throughput recipe is slightly fragmented across multiple if blocks. While functional, consolidating the hardware === "h200" && isBig check would improve readability and maintainability, especially as more hardware-specific overrides are added.

if (hardware === "h200" && isBig) {
  flags.push("  --mem-fraction-static 0.875");
  flags.push("  --max-running-requests 64");
} else {
  if (isBig) flags.push("  --mem-fraction-static 0.82");
  if (hardware === "h200") {
    flags.push("  --cuda-graph-max-bs 128");
    flags.push("  --max-running-requests 256");
  } else if (isBig && hardware === "b200") {
    flags.push("  --cuda-graph-max-bs 64");
    flags.push("  --max-running-requests 256");
  } else if (isBig && hardware === "gb300") {
    flags.push("  --cuda-graph-max-bs 128");
    flags.push("  --max-running-requests 256");
  }
}
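
Since the review comment anticipates more hardware-specific overrides, one alternative worth considering is a lookup table instead of nested branches. The sketch below is not what the PR or the reviewer proposes: `MAX_THROUGHPUT_OVERRIDES`, `DEFAULT_BIG_FLAGS`, and `maxThroughputFlags` are hypothetical names, and the `"small"` key suffix is an assumption based on the PR's mention of "H200 small recipes". The flag values and their order are taken from the suggestion above.

```javascript
// Hypothetical table-driven variant of the reviewer's suggestion.
// Flag strings and their order follow the suggested branches exactly.
const DEFAULT_BIG_FLAGS = ["  --mem-fraction-static 0.82"];

// Keys are "<hardware>|<size>"; the "small" suffix is an assumed name for the
// non-big variant.
const MAX_THROUGHPUT_OVERRIDES = {
  "h200|big": [
    "  --mem-fraction-static 0.875",
    "  --max-running-requests 64",
  ],
  "h200|small": [
    "  --cuda-graph-max-bs 128",
    "  --max-running-requests 256",
  ],
  "b200|big": [
    ...DEFAULT_BIG_FLAGS,
    "  --cuda-graph-max-bs 64",
    "  --max-running-requests 256",
  ],
  "gb300|big": [
    ...DEFAULT_BIG_FLAGS,
    "  --cuda-graph-max-bs 128",
    "  --max-running-requests 256",
  ],
};

// Return the extra CLI flag lines for a hardware/size combination.
// Combinations without an entry fall back to the generic behavior:
// big variants get only the default memory fraction, others get nothing.
function maxThroughputFlags(hardware, isBig) {
  const key = `${hardware}|${isBig ? "big" : "small"}`;
  const override = MAX_THROUGHPUT_OVERRIDES[key];
  if (override) return [...override]; // copy so callers cannot mutate the table
  return isBig ? [...DEFAULT_BIG_FLAGS] : [];
}

console.log(maxThroughputFlags("h200", true));
console.log(maxThroughputFlags("h200", false));
```

With this layout, adding a new hardware override means adding one table entry rather than another `else if` branch. The trade-off is that shared defaults such as the 0.82 memory fraction have to be spread into each entry explicitly.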


yhyang201 · 2026-04-26T04:03:43Z

Move to #23742

yhyang201 requested a review from wisclmy0611 as a code owner April 25, 2026 17:39

github-actions Bot added the deepseek label Apr 25, 2026

gemini-code-assist Bot reviewed Apr 25, 2026


docs(DeepSeek-V4): add H200 multinode env var hints

c4ece72

Add commented-out hints for machine-specific env vars (NVSHMEM, GLOO, NCCL) on H200 big (2-node) deployments, matching the GB200 pattern. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

yhyang201 closed this Apr 26, 2026

yhyang201 wants to merge 2 commits into sgl-project:main from yhyang201:dpskv4-h200-big-mt
