[NVIDIA] feat: adds more configurations for GB200 SGLang DSR1 #335
Changes from 17 commits
a1a0325
c03076b
028f224
25a19b1
124ddf4
6199031
344ac6c
355773a
0dd1e5a
7da0be5
b38b633
8136816
c1f1be4
7a8e890
ce40018
35c7eb3
b26d699
5b0509a
c1024db
3d4c3ae
35d7555
45cc883
a6cc157
2731ccb
b3ccea8
00dcff7
e845bdd
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,38 @@ | ||
|
|
||
| #!/bin/bash | ||
|
|
||
| set -x | ||
|
|
||
| source "$(dirname "$0")/benchmark_lib.sh" | ||
|
|
||
| check_env_vars CONC_LIST ISL OSL IMAGE SPEC_DECODING MODEL_PATH \ | ||
| PREFILL_NUM_WORKERS PREFILL_TP PREFILL_EP PREFILL_DP_ATTN \ | ||
| DECODE_NUM_WORKERS DECODE_TP DECODE_EP DECODE_DP_ATTN \ | ||
| PREFILL_NODES DECODE_NODES N_ADDITIONAL_FRONTENDS SGL_SLURM_JOBS_PATH # SGL_SLURM_JOBS_PATH FIXME | ||
|
|
||
| # Always clone and setup Dynamo | ||
| echo "Cloning Dynamo repository..." | ||
| git clone https://github.com/ai-dynamo/dynamo.git | ||
| cd dynamo && git checkout ishan/fp48k1k && cd .. # All configs are now tracked in this branch | ||
|
|
||
| cd "$SGL_SLURM_JOBS_PATH" | ||
|
|
||
| # Set up SGL launch script-specific environment variables | ||
| export TIME_LIMIT="04:00:00" | ||
| export MODEL_PATH=$MODEL_PATH | ||
| export CONFIG_DIR=$CONFIG_DIR | ||
| export CONTAINER_IMAGE=$IMAGE | ||
| export GPU_TYPE="gb200-fp4" | ||
|
|
||
| # Launch jobs based on ISL/OSL | ||
| # Replace ' ' in CONC_LIST with 'x' such that the concurrency list is represented | ||
| # by a list of numbers delimited by 'x'. This is because of how the underlying launch script | ||
| # expects the concurrencies. | ||
| bash ./submit_disagg.sh $PREFILL_NODES \ | ||
| $PREFILL_NUM_WORKERS \ | ||
| $DECODE_NODES \ | ||
| $DECODE_NUM_WORKERS \ | ||
| $N_ADDITIONAL_FRONTENDS \ | ||
| $ISL $OSL "${CONC_LIST// /x}" inf \ | ||
| $GPU_TYPE \ | ||
| $SCRIPT_MODE |
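The `"${CONC_LIST// /x}"` argument above relies on Bash pattern substitution. The following is a minimal sketch of that expansion in isolation; the sample value of `CONC_LIST` is an assumption for illustration and does not come from the PR.

```bash
#!/bin/bash
# Hypothetical sample value; the real CONC_LIST is supplied by the caller's environment.
CONC_LIST="4 8 16 32"

# ${var// /x} replaces every space in $var with the letter 'x' (Bash pattern substitution).
conc_arg="${CONC_LIST// /x}"

echo "$conc_arg"   # prints 4x8x16x32
```

Note that every space is replaced individually, so a list with doubled spaces would yield doubled `x` separators.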
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -14,22 +14,16 @@ export SLURM_JOB_NAME="benchmark-dynamo.job" | |
|
|
||
| ### FRAMEWORK_DIFF_IF_STATEMENT #1 - difference in setting up envvars | ||
| if [[ $FRAMEWORK == "dynamo-sglang" ]]; then | ||
| # Set IMAGE based on ISL/OSL | ||
| if [ "$ISL" = "1024" ] && [ "$OSL" = "1024" ]; then | ||
| export IMAGE="/mnt/lustre01/artifacts/containers/lmsysorg+sglang+v0.5.5.post2.sqsh" | ||
| else | ||
| export IMAGE="/mnt/lustre01/artifacts/containers/dynamo-sglang.sqsh" | ||
| fi | ||
| export MODEL_PATH="/mnt/lustre01/models/deepseek-r1-0528" | ||
| export CONFIG_DIR="/mnt/lustre01/artifacts/sglang-configs/1k1k" | ||
| export IMAGE="/mnt/lustre01/artifacts/containers/lmsysorg+sglang+v0.5.5.post2.sqsh" | ||
|
|
||
| # FIXME: Another workaround for all the different branching | ||
| # THIS NEEDS TO BE STANDARDIZED ASAP | ||
| if [ "$ISL" = "1024" ] && [ "$OSL" = "1024" ]; then | ||
| export SGL_SLURM_JOBS_PATH="dynamo/examples/backends/sglang/slurm_jobs" | ||
| if [[ $PRECISION == "fp4" ]]; then | ||
| export MODEL_PATH="/mnt/lustre01/models/deepseek-r1-0528-fp4-v2" | ||
|
Collaborator
for posterity, it would be preferable if this was retrieved from the master config
Collaborator
Author
Thanks for the suggestion! I have addressed these in InferenceMAX/InferenceMAX@35d7555 and InferenceMAX/InferenceMAX@45cc883, which apply to the TRTLLM side of the code as well.
||
| else | ||
| export SGL_SLURM_JOBS_PATH="dynamo/components/backends/sglang/slurm_jobs" | ||
| export MODEL_PATH="/mnt/lustre01/models/deepseek-r1-0528" | ||
|
Collaborator
same here
||
| fi | ||
|
|
||
| export CONFIG_DIR="/mnt/lustre01/artifacts/sglang-configs/1k1k" | ||
| export SGL_SLURM_JOBS_PATH="dynamo/examples/backends/sglang/slurm_jobs" | ||
| else | ||
| SQUASH_FILE="/mnt/lustre01/users/sa-shared/images/$(echo "$IMAGE" | sed 's/[\/:@#]/_/g').sqsh" | ||
| srun --partition=$SLURM_PARTITION --exclusive --time=180 bash -c "enroot import -o $SQUASH_FILE docker://$IMAGE" | ||
|
|
@@ -148,4 +142,4 @@ PY | |
| done | ||
| fi | ||
|
|
||
| echo "All result files processed" | ||
| echo "All result files processed" | ||
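The non-SGLang branch in the hunk above derives `SQUASH_FILE` by sanitizing the image reference with `sed`. A minimal sketch of just that step follows; the image reference is a hypothetical value chosen for illustration, and the directory prefix used in the real script is omitted.

```bash
#!/bin/bash
# Hypothetical image reference, used only for illustration.
IMAGE="lmsysorg/sglang:v0.5.5.post2"

# Replace every '/', ':', '@' or '#' with '_' so the result is a single,
# filesystem-safe path component, then append the .sqsh extension.
squash_name="$(echo "$IMAGE" | sed 's/[\/:@#]/_/g').sqsh"

echo "$squash_name"   # prints lmsysorg_sglang_v0.5.5.post2.sqsh
```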
thanks for the PR, is it possible to have the IMAGE inherit from the nvidia-master.yaml instead of hard setting in the launcher script?
kinda like what trtllm dynamo already does?

Hi @functionstackx , thanks for the comment! I have updated the code in InferenceMAX/InferenceMAX@c1024db so that Dynamo+SGLang will also pull the container from nvidia-master.yaml.
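One way to picture the change discussed in this thread: the launcher stops assigning `IMAGE` itself and only verifies that the caller has provided it. This is a sketch under stated assumptions, not the code from c1024db. `require_var` is a hypothetical helper, and the image path is a placeholder standing in for whatever value is read from `nvidia-master.yaml`.

```bash
#!/bin/bash
# Illustration only: in the real workflow IMAGE would be exported by whatever
# reads nvidia-master.yaml; here it is set by hand to a placeholder path.
export IMAGE="/path/to/container.sqsh"

# Hypothetical helper: return non-zero if the named variable is unset or empty.
require_var() {
  local name="$1"
  if [ -z "${!name}" ]; then
    echo "error: $name must be set by the master config" >&2
    return 1
  fi
}

# The launcher no longer hard-codes IMAGE; it only passes the value through.
require_var IMAGE && export CONTAINER_IMAGE="$IMAGE"
echo "$CONTAINER_IMAGE"
```

The same check could cover `MODEL_PATH`, which the inline review comments also ask to be taken from the master config.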
closes https://github.com/InferenceMAX/InferenceMAX/issues/334