Closed
27 commits
8242762
gb300 1k1k sglang
Oseltamivir Apr 26, 2026
ba062c0
route gb300 sglang to cw cluster
Oseltamivir Apr 26, 2026
4f7d3bc
Merge branch 'main' into gb300-1k1k-sglang
Oseltamivir Apr 26, 2026
c21afd3
Merge branch 'main' into gb300-1k1k-sglang
Oseltamivir Apr 26, 2026
7903970
connector
Oseltamivir Apr 26, 2026
26943f7
path
Oseltamivir Apr 26, 2026
e7b58f7
drop forced dynamo 0.8.1 install — use container-bundled dynamo for D…
Oseltamivir Apr 26, 2026
74d8307
Merge branch 'main' into gb300-1k1k-sglang
Oseltamivir Apr 26, 2026
7f38f8c
Merge remote-tracking branch 'origin/main' into gb300-1k1k-sglang
Oseltamivir Apr 26, 2026
fa52ab0
match upstream PR #75 tunings + skip srtctl dynamo install
Oseltamivir Apr 26, 2026
bc80a16
add flags
hnyls2002 Apr 26, 2026
7f43185
add more selection space
hnyls2002 Apr 26, 2026
afca046
use _arm64 image tag + squash_dupe dir for gb300-cw
Oseltamivir Apr 27, 2026
3882a55
pin dynamo to 1.2.0.dev20260426 — first arm64 wheel with DSv4 formatter
Oseltamivir Apr 27, 2026
77bbcb8
step back to dynamo dev20260425 — earlier wheel may align with contai…
Oseltamivir Apr 27, 2026
d7dc646
prebuild dynamo wheel from hash 6a159fed on /mnt/vast — mirror PR #11…
Oseltamivir Apr 27, 2026
56b64e8
Merge branch 'main' into gb300-1k1k-sglang
Oseltamivir Apr 27, 2026
5e3340c
switch disagg transport nixl → mooncake
Oseltamivir Apr 27, 2026
83867ea
strip return_routed_experts kwarg from dynamo call sites — sglang 0.5…
Oseltamivir Apr 27, 2026
3efc208
fix dynamo regex: only match whole-line kwarg passes, leave assignmen…
Oseltamivir Apr 27, 2026
9a4018c
Merge branch 'main' into gb300-1k1k-sglang
Oseltamivir Apr 27, 2026
173bd41
PR85
Oseltamivir Apr 28, 2026
5dc00ed
Import recipes
Oseltamivir Apr 28, 2026
5b88465
Merge branch 'main' into gb300-1k1k-sglang
Oseltamivir Apr 29, 2026
93cc3c3
Update perf-changelog.yaml
Oseltamivir Apr 29, 2026
81bba88
config syntext
Oseltamivir Apr 29, 2026
628f45b
Merge main into gb300 SGLang PR
Oseltamivir Apr 29, 2026
33 changes: 33 additions & 0 deletions .github/configs/nvidia-master.yaml
@@ -7666,3 +7666,36 @@ dsv4-fp4-gb200-dynamo-vllm:
        tp: 16
        ep: 16
        dp-attn: true

dsv4-fp4-gb300-dynamo-sglang:
  image: lmsysorg/sglang:deepseek-v4-grace-blackwell
  model: deepseek-ai/DeepSeek-V4-Pro
  model-prefix: dsv4
  runner: gb300
  precision: fp4
  framework: dynamo-sglang
  multinode: true
  disagg: true
  # Ported from NVIDIA/srt-slurm PR #75 — 1P + 1D, both TP=4 on a single
  # GB300 (4 GPUs / node), MXFP4 MoE kernels, NIXL KV transfer. Recipe
  # staged at benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4/
  # 1k1k/ and overlaid into the srt-slurm checkout by launch_gb300-nv.sh.
  # DEP/TEP variants are upstream follow-ups; mirror that and ship 1P1D
  # only here.
  seq-len-configs:
  - isl: 1024
    osl: 1024
    search-space:
    - conc-list: [1, 4, 16, 64, 256]
      prefill:
        num-worker: 1
        tp: 4
        ep: 1
        dp-attn: false
        additional-settings:
        - "CONFIG_FILE=recipes/sglang/deepseek-v4/1k1k/disagg-gb300-1p1d-tp4.yaml"
      decode:
        num-worker: 1
        tp: 4
        ep: 1
        dp-attn: false
99 changes: 99 additions & 0 deletions benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4/1k1k/disagg-gb300-1p1d-tp4.yaml
@@ -0,0 +1,99 @@
name: "dsv4-sglang-disagg-gb300-1p1d-tp4"

# DeepSeek-V4-Pro disaggregated on GB300 (1P1D, TP=4, MXFP4) — sglang +
# dynamo frontend. Ported from NVIDIA/srt-slurm PR #75
# (recipes/gb300-fp4/1k1k-dsv4/disagg-1p1d-tp4-mxfp4.yaml). GB300 sibling of
# the dsv4-sglang-disagg-gb200-1p1d-dep8-tep8 recipe in this directory tree.
#
# Topology: 1 prefill node + 1 decode node, each TP=4 on a single GB300
# (4 GPUs / node). KV transfer over NIXL. Targets steady decode TPOT under
# moderate-to-high concurrency.
#
# Local deltas vs upstream PR #75:
# * benchmark.type = sa-bench (upstream uses "manual" because they pair
# with a separate sa-bench launcher; our sweep harness drives sa-bench
# in-recipe).
# * Disagg timeout triple + NCCL_MNNVL/CUMEM env vars copied from the
# GB200 sglang sibling — same handshake-stability rationale.

Check warning on line 17 in benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4/1k1k/disagg-gb300-1p1d-tp4.yaml

Claude / Claude Code Review

Recipe header references non-existent GB200 sglang sibling

🟡 The new recipe header at lines 5-6 and 16-17 of disagg-gb300-1p1d-tp4.yaml refers to a 'GB200 sglang sibling' (dsv4-sglang-disagg-gb200-1p1d-dep8-tep8) that does not exist in this repository — the only DSv4 GB200 recipes live under srt-slurm-recipes/vllm/deepseek-v4/, and launch_gb200-nv.sh routes dsv4-fp4 exclusively through the dynamo-vllm branch. This is a comment-only inconsistency with no runtime impact, but the PR description's claim that this 'mirrors the gates the GB200 launcher already uses for the SGLang sibling' is also inaccurate. Suggest editing the header to drop the sibling references or point at the actual upstream PR #75 source instead.

Extended reasoning...

What the bug is

The new file benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4/1k1k/disagg-gb300-1p1d-tp4.yaml carries a header comment (lines 5-6) and a 'Local deltas' block (lines 16-17) that twice reference a sibling recipe, dsv4-sglang-disagg-gb200-1p1d-dep8-tep8, said to live 'in this directory tree'. It also claims that the disagg-timeout triple and the NCCL_MNNVL_ENABLE/NCCL_CUMEM_ENABLE env vars were 'copied from the GB200 sglang sibling'. No such sibling exists in the repo today.

The specific code path / proof

Step-by-step:

  1. The directory benchmarks/multi_node/srt-slurm-recipes/sglang/ contains only the new GB300 file added by this PR — no GB200 sglang DSv4 file is present.
  2. A repo-wide search for dsv4-sglang-disagg-gb200 returns only the self-reference inside this new recipe's header comment.
  3. DSv4 GB200 recipes do exist, but under the vllm subtree (benchmarks/multi_node/srt-slurm-recipes/vllm/deepseek-v4/), not sglang.
  4. runners/launch_gb200-nv.sh routes dsv4-fp4 only through the dynamo-vllm branch (the elif on FRAMEWORK==dynamo-vllm) and the overlay block copies from srt-slurm-recipes/vllm/deepseek-v4. There is no dynamo-sglang branch for dsv4 on GB200, so the PR description's claim 'Mirrors the gates the GB200 launcher already uses for the SGLang sibling' is also off.
  5. perf-changelog.yaml has dsv4-fp4-gb200-dynamo-vllm but no GB200 sglang DSv4 entry.

Why existing code doesn't prevent it

YAML comments are free-form text; nothing in the launcher, srtctl, or sweep harness validates them.

Impact

Zero runtime impact — the field values themselves are valid. The cost is purely documentation: a future reader trying to compare this recipe against the alleged sibling, or wanting to update env-var rationale in lockstep, will hit a dead-end.

How to fix it

Either:

  • Drop the sibling references entirely; or
  • Repoint them at the actual provenance, upstream NVIDIA/srt-slurm PR #75 (already cited elsewhere in the same header), for the timeout/env-var rationale; or
  • If the intent was to track parity with the existing GB200 dynamo-vllm recipe under srt-slurm-recipes/vllm/deepseek-v4/, name that file instead and adjust the wording (it isn't an sglang sibling).


model:
  path: "deepseek-v4-pro"
  container: "lmsysorg/sglang:deepseek-v4-grace-blackwell"
  precision: "fp4"

dynamo:
  version: 0.8.1

slurm:
  time_limit: "8:00:00"

health_check:
  max_attempts: 1440
  interval_seconds: 10

resources:
  gpu_type: "gb300"
  gpus_per_node: 4
  prefill_nodes: 1
  decode_nodes: 1
  prefill_workers: 1
  decode_workers: 1
  gpus_per_prefill: 4
  gpus_per_decode: 4

frontend:
  type: dynamo
  enable_multiple_frontends: false

backend:
  type: sglang
  connector: null

  prefill_environment:
    PYTHONUNBUFFERED: "1"
    SGLANG_JIT_DEEPGEMM_PRECOMPILE: "0"
    NCCL_MNNVL_ENABLE: "1"
    NCCL_CUMEM_ENABLE: "1"
    SGLANG_DISAGGREGATION_HEARTBEAT_MAX_FAILURE: "100000"
    SGLANG_DISAGGREGATION_BOOTSTRAP_TIMEOUT: "100000"
    SGLANG_DISAGGREGATION_WAITING_TIMEOUT: "100000"

  decode_environment:
    PYTHONUNBUFFERED: "1"
    SGLANG_JIT_DEEPGEMM_PRECOMPILE: "0"
    NCCL_MNNVL_ENABLE: "1"
    NCCL_CUMEM_ENABLE: "1"
    SGLANG_DISAGGREGATION_HEARTBEAT_MAX_FAILURE: "100000"
    SGLANG_DISAGGREGATION_BOOTSTRAP_TIMEOUT: "100000"
    SGLANG_DISAGGREGATION_WAITING_TIMEOUT: "100000"

  sglang_config:
    prefill:
      served-model-name: "deepseek-ai/DeepSeek-V4-Pro"
      model-path: "/model/"
      trust-remote-code: true
      tensor-parallel-size: 4
      disaggregation-mode: "prefill"
      disaggregation-transfer-backend: nixl
      moe-runner-backend: "flashinfer_mxfp4"
      chunked-prefill-size: 4096
      disable-flashinfer-autotune: true

    decode:
      served-model-name: "deepseek-ai/DeepSeek-V4-Pro"
      model-path: "/model/"
      trust-remote-code: true
      tensor-parallel-size: 4
      disaggregation-mode: "decode"
      disaggregation-transfer-backend: nixl
      moe-runner-backend: "flashinfer_mxfp4"
      chunked-prefill-size: 4096
      disable-flashinfer-autotune: true

benchmark:
  type: "sa-bench"
  isl: 1024
  osl: 1024
  concurrencies: "1x4x16x64x256"
  req_rate: "inf"
  use_chat_template: false
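The sweep is declared twice: as `conc-list: [1, 4, 16, 64, 256]` in nvidia-master.yaml and as `concurrencies: "1x4x16x64x256"` in the recipe. A reviewer may want to confirm the two stay in sync. The following is a minimal sketch under the assumption that the recipe string is a plain `x`-separated list of integers; `parse_concurrencies` and `sweep_matches` are hypothetical helpers, not functions from srtctl or the sweep harness.

```python
def parse_concurrencies(spec: str) -> list[int]:
    """Split an 'x'-separated concurrency string such as "1x4x16x64x256"."""
    return [int(part) for part in spec.split("x")]


def sweep_matches(conc_list: list[int], recipe_spec: str) -> bool:
    """True when the master config's conc-list equals the recipe's concurrencies."""
    return conc_list == parse_concurrencies(recipe_spec)


# Values copied from the two files in this diff.
master_conc_list = [1, 4, 16, 64, 256]
recipe_concurrencies = "1x4x16x64x256"
in_sync = sweep_matches(master_conc_list, recipe_concurrencies)
```

The comparison is order-sensitive on purpose, so a reordered sweep is flagged as well as a missing value.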
9 changes: 9 additions & 0 deletions perf-changelog.yaml
@@ -1833,3 +1833,12 @@
    - "Bump --chunked-prefill-size from 4096 to 8192"
    - "Retrigger dsv4-fp8-mi355x-sglang"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1160

- config-keys:
    - dsv4-fp4-gb300-dynamo-sglang
  description:
    - "Add DeepSeek-V4-Pro FP4 GB300 Dynamo SGLang disaggregated multinode configuration"
    - "Image: lmsysorg/sglang:deepseek-v4-grace-blackwell"
    - "Topology: 1P + 1D, both TP=4 on a single GB300; MXFP4 MoE kernels, NIXL KV transfer"
    - "Recipe ported from NVIDIA/srt-slurm PR #75 (recipes/gb300-fp4/1k1k-dsv4/disagg-1p1d-tp4-mxfp4.yaml)"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX

Check warning on line 1844 in perf-changelog.yaml

Claude / Claude Code Review

perf-changelog pr-link is a literal XXX placeholder

🟡 The new perf-changelog.yaml entry for 'dsv4-fp4-gb300-dynamo-sglang' (line 1844) has 'pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX' — a literal 'XXX' placeholder rather than the actual PR number. Every other entry in this file references a real PR number, so this will produce a 404 for any tooling that follows the link. Fix by replacing 'XXX' with '1169' (this PR's number) before merge.

Extended reasoning...

What the bug is

In perf-changelog.yaml at line 1844, the new changelog entry for dsv4-fp4-gb300-dynamo-sglang ends with:

  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX

The XXX is a literal placeholder string, not a real PR number. The PR description acknowledges this explicitly: "perf-changelog.yaml entry triggers the new sweep (PR-link placeholder to update post-merge)".

Why this is a bug

A quick scan of the rest of perf-changelog.yaml shows ~150+ other entries, every one of which uses a real numeric PR number (e.g., /pull/1160, /pull/1144, /pull/1129). This is the only pull/XXX literal in the file. Any consumer that iterates entries to render a changelog or validate links will get a 404 on this single entry.

Why the existing review process doesn't prevent it

There is no automated post-merge backfill that rewrites XXX → the real PR number — the author plans to update it manually post-merge, but there is nothing enforcing that. If it gets forgotten (which is the typical failure mode for "I'll fix it post-merge" promises), the placeholder ships permanently.

Why it's trivially fixable now

The PR number is already known: this is PR #1169. The author can replace XXX with 1169 in a one-line edit before merge; there is no reason to wait until post-merge. The URL pull/1169 already resolves to the open PR and will keep resolving after merge.

Step-by-step proof

  1. Open perf-changelog.yaml and grep for pull/XXX:
    • Exactly one match, at line 1844, in the new dsv4-fp4-gb300-dynamo-sglang entry added by this PR.
  2. Compare with the immediately preceding entry (line 1836): pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1160 — a real, resolvable URL.
  3. Visit https://github.com/SemiAnalysisAI/InferenceX/pull/XXX in a browser → GitHub returns "Page not found" (404), since XXX is not a valid pull request number.
  4. Visit https://github.com/SemiAnalysisAI/InferenceX/pull/1169 (this PR) → resolves correctly.

Impact

  • Severity: nit. This is documentation/changelog hygiene, not a runtime bug. It does not affect benchmark execution, sweep generation (pr-link is metadata only), or any CI gate.
  • However, it leaves a permanently dead link in the changelog for any consumer that follows pr-link URLs (e.g., changelog rendering, link validation, audit tooling).

Fix

Replace line 1844:

  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX

with:

  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1169

Note on duplicate refutations

Two verifiers refuted this as a duplicate of the other report describing the same issue. They are correct that bug_002 and bug_005 describe the same underlying issue — the synthesis agent merged them, and this single comment now represents both. The underlying issue is real and confirmed by 6 verifiers across the two original bugs; only the duplication concern was raised, not the validity.
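Since nothing currently enforces the post-merge backfill, a lint step could catch the placeholder before merge. This is a minimal sketch, assuming each entry carries a single-line `pr-link:` field as in the hunk above; `find_placeholder_pr_links` is a hypothetical helper, not an existing check in the repository.

```python
import re

# A pr-link line and the URL it carries.
PR_LINK_RE = re.compile(r"^\s*pr-link:\s*(\S+)\s*$")
# A GitHub pull-request URL that ends in a numeric PR id.
VALID_RE = re.compile(r"^https://github\.com/[^/]+/[^/]+/pull/\d+$")


def find_placeholder_pr_links(changelog_text: str) -> list[tuple[int, str]]:
    """Return (line_number, url) for every pr-link that lacks a numeric PR id."""
    bad = []
    for lineno, line in enumerate(changelog_text.splitlines(), start=1):
        match = PR_LINK_RE.match(line)
        if match and not VALID_RE.match(match.group(1)):
            bad.append((lineno, match.group(1)))
    return bad
```

Called on the contents of perf-changelog.yaml, it should return an empty list once `XXX` is replaced with `1169`; a CI job could fail on any non-empty result.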

18 changes: 17 additions & 1 deletion runners/launch_gb300-nv.sh
@@ -18,8 +18,15 @@
    export SERVED_MODEL_NAME="deepseek-r1-fp8"
    export MODEL_PATH=/raid/shared/models/deepseek-r1-0528
    export SRT_SLURM_MODEL_PREFIX="dsr1-fp8"
elif [[ $MODEL_PREFIX == "dsv4" && $PRECISION == "fp4" ]]; then
    # SRT_SLURM_MODEL_PREFIX matches the model.path alias in our DSv4
    # sglang recipes (benchmarks/multi_node/srt-slurm-recipes/sglang/
    # deepseek-v4/1k1k/disagg-gb300-1p1d-tp4.yaml).
    export SERVED_MODEL_NAME="deepseek-v4-pro"
    export MODEL_PATH=/raid/shared/models/deepseek-v4-pro
    export SRT_SLURM_MODEL_PREFIX="deepseek-v4-pro"

Check failure on line 27 in runners/launch_gb300-nv.sh

Claude / Claude Code Review

GB300 launcher missing container alias for new DSv4 SGLang recipe

🔴 The new GB300 sglang DSv4 recipe declares container: "lmsysorg/sglang:deepseek-v4-grace-blackwell" (a literal image name), but the containers: map emitted by runners/launch_gb300-nv.sh only declares the aliases dynamo-trtllm, dynamo-sglang, and nginx-sqsh — it has no entry for the literal IMAGE. launch_gb200-nv.sh already includes "${IMAGE}": ${SQUASH_FILE} precisely for locally-shipped recipes that use literal image names; without the same line here, srtctl cannot resolve the container for the new dsv4-fp4-gb300-dynamo-sglang sweep and will fall through to a docker pull on compute nodes rather than mount the imported squashfs. Fix by adding "${IMAGE}": ${SQUASH_FILE} to the containers block (or change the recipe to container: dynamo-sglang to use the existing alias).

Extended reasoning...

What the bug is

benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4/1k1k/disagg-gb300-1p1d-tp4.yaml line 21 sets:

  container: "lmsysorg/sglang:deepseek-v4-grace-blackwell"

This is a literal docker image name, not an alias. The srtslurm.yaml heredoc emitted by runners/launch_gb300-nv.sh only contains:

  containers:
    dynamo-trtllm: ${SQUASH_FILE}
    dynamo-sglang: ${SQUASH_FILE}
    nginx-sqsh: ${NGINX_SQUASH_FILE}

There is no entry mapping lmsysorg/sglang:deepseek-v4-grace-blackwell (or ${IMAGE}) to the locally-imported squashfs.

Why existing code does not save us

In srt-slurm/src/srtctl/core/config.py the alias resolver only rewrites model.container when the value is a key in the containers: map; otherwise the literal string passes through unchanged. runtime.py then treats only strings starting with / or ./ as a squashfs path. Anything else becomes a literal --container-image argument to srun, which under pyxis/enroot will attempt a registry pull on the compute node rather than mount the pre-imported sqsh.

Why this is new on GB300

The pre-PR launch_gb300-nv.sh only handled dsr1 recipes from upstream srt-slurm, all of which use container: dynamo-sglang / dynamo-trtllm (aliases), so the missing literal-IMAGE mapping never bit. This PR ships the first GB300 recipe that uses a literal image name, and it ships exactly the launcher change the PR description claims it mirrors from GB200, except that the containers-map line was not mirrored.

Direct precedent in the GB200 sibling

runners/launch_gb200-nv.sh (the launcher whose gates the PR description explicitly says it mirrors) already has at line 206:

  "${IMAGE}": ${SQUASH_FILE}

That line was added because GB200's locally-shipped vllm DSv4 recipes use literal container names like vllm/vllm-openai:deepseekv4-cu130. The H100 and H200 launchers (launch_h100-dgxc-slurm.sh, launch_h200-dgxc-slurm.sh) include the same line for the same reason. The new GB300 sglang DSv4 recipe is structurally identical (literal image string in container:), and needs the same launcher entry.

Step-by-step proof

  1. CI runs dsv4-fp4-gb300-dynamo-sglang. launch_gb300-nv.sh enters the new dsv4-fp4 branch, sets IMAGE=lmsysorg/sglang:deepseek-v4-grace-blackwell, and imports it into /home/sa-shared/squash/lmsysorg_sglang_deepseek-v4-grace-blackwell.sqsh.
  2. The heredoc writes srtslurm.yaml with three container aliases (none of them the literal image string).
  3. srtctl apply -f recipes/sglang/deepseek-v4/1k1k/disagg-gb300-1p1d-tp4.yaml reads model.container = "lmsysorg/sglang:deepseek-v4-grace-blackwell".
  4. config.py: if container in containers evaluates to false (only dynamo-trtllm, dynamo-sglang, nginx-sqsh). The literal string passes through unchanged.
  5. runtime.py: the string does not start with / or ./, so it is treated as a docker image name and passed to srun --container-image=lmsysorg/sglang:deepseek-v4-grace-blackwell.
  6. Pyxis/enroot tries to pull from Docker Hub on the compute node. The pre-imported squashfs in /home/sa-shared/squash/ is never used. Compute nodes typically lack registry pull access (and even if they did, the import was wasted), so the whole sweep fails at runtime.

Fix

Either add the literal-IMAGE line to the heredoc in launch_gb300-nv.sh next to the existing aliases:

  containers:
    dynamo-trtllm: ${SQUASH_FILE}
    dynamo-sglang: ${SQUASH_FILE}
    "${IMAGE}": ${SQUASH_FILE}
    nginx-sqsh: ${NGINX_SQUASH_FILE}

(mirroring launch_gb200-nv.sh:206 exactly), or change the new recipe to container: "dynamo-sglang" to reuse the existing alias.

Impact / severity

normal: this blocks the only feature this PR adds. Any dsv4-fp4-gb300-dynamo-sglang run will fail at srun time, never reaching the SGLang server. Should block merge until either the launcher gets the IMAGE mapping or the recipe switches to the alias.
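The resolution path walked through in the proof can be condensed into a few lines. The following is a minimal Python sketch of the behaviour as this comment describes it, not srtctl's actual code: `resolve_container` is a hypothetical name, and the nginx squashfs path is a made-up placeholder.

```python
def resolve_container(container: str, containers: dict[str, str]) -> tuple[str, str]:
    """Return (kind, value): "squashfs" for a local path, "image" for a registry name."""
    # Alias lookup: only exact keys of the containers map are rewritten;
    # any other string passes through unchanged.
    resolved = containers.get(container, container)
    # Only values starting with "/" or "./" count as squashfs paths.
    if resolved.startswith(("/", "./")):
        return ("squashfs", resolved)
    # Everything else would be handed to srun as --container-image.
    return ("image", resolved)


IMAGE = "lmsysorg/sglang:deepseek-v4-grace-blackwell"
SQUASH_FILE = "/home/sa-shared/squash/lmsysorg_sglang_deepseek-v4-grace-blackwell.sqsh"
NGINX_SQUASH_FILE = "/home/sa-shared/squash/nginx.sqsh"  # placeholder path

# containers map as emitted by launch_gb300-nv.sh in this PR
before_fix = {
    "dynamo-trtllm": SQUASH_FILE,
    "dynamo-sglang": SQUASH_FILE,
    "nginx-sqsh": NGINX_SQUASH_FILE,
}
# same map with the literal-IMAGE entry added, as in launch_gb200-nv.sh
after_fix = {**before_fix, IMAGE: SQUASH_FILE}

unresolved = resolve_container(IMAGE, before_fix)
resolved = resolve_container(IMAGE, after_fix)
```

With the map from this PR the recipe's container stays an `"image"` (a registry pull on the compute node); with the extra entry it becomes a `"squashfs"` mount. Switching the recipe to `container: "dynamo-sglang"` reaches the same result through the existing alias.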

else
-    echo "Unsupported model: $MODEL_PREFIX-$PRECISION. Supported models are: dsr1-fp4, dsr1-fp8"
+    echo "Unsupported model: $MODEL_PREFIX-$PRECISION. Supported models are: dsr1-fp4, dsr1-fp8, dsv4-fp4"
    exit 1
fi

@@ -47,6 +54,15 @@
cd "$SRT_REPO_DIR"
git checkout sa-submission-q2-2026

# Overlay our hand-rolled DSv4 sglang recipes on top of the upstream tree.
# NVIDIA/srt-slurm has no upstream sglang DSv4 disagg recipe for GB300
# beyond PR #75's 1P1D-TP4 entry, so we ship the recipe locally and copy
# it in here. Mirrors the equivalent block in launch_gb200-nv.sh.
if [[ $FRAMEWORK == "dynamo-sglang" && $MODEL_PREFIX == "dsv4" ]]; then
    mkdir -p recipes/sglang/deepseek-v4
    cp -rT "$GITHUB_WORKSPACE/benchmarks/multi_node/srt-slurm-recipes/sglang/deepseek-v4" recipes/sglang/deepseek-v4
fi

echo "Installing srtctl..."
export UV_INSTALL_DIR="$GITHUB_WORKSPACE/.local/bin"
curl -LsSf https://astral.sh/uv/install.sh | sh