Conversation
WalkthroughThis PR introduces SGLang Router mode as an alternative to Dynamo frontends for disaggregated inference, adds multi-router architecture with load balancing, refactors backend classes from Changes
Sequence DiagramsequenceDiagram
participant User
participant SLURM as SLURM Job Script
participant nginx as Nginx<br/>(Load Balancer)
participant Router as SGLang Router(s)
participant Prefill as Prefill Servers
participant Decode as Decode Servers
participant HealthCheck as Health Check Service
User->>SLURM: Submit job
SLURM->>nginx: Start nginx on node 0
SLURM->>Router: Launch primary router on nginx node
SLURM->>Router: Launch additional routers on worker nodes
Router->>Router: Register server endpoints
HealthCheck->>nginx: Poll /workers endpoint
nginx->>Router: Query worker stats
Router->>Router: Aggregate prefill/decode counts
Router-->>HealthCheck: Return stats
HealthCheck->>HealthCheck: Validate expected counts
HealthCheck-->>SLURM: Confirm ready
SLURM->>User: Frontend available at http://localhost:8000
User->>nginx: Send inference request
nginx->>Router: Route request
Router->>Prefill: Token prefill phase
Prefill-->>Router: Prefill results
Router->>Decode: Decoding phase
Decode-->>Router: Decoded tokens
Router-->>nginx: Return results
nginx-->>User: Response
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes
Possibly related PRs
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: defaults Review profile: CHILL Plan: Pro 📒 Files selected for processing (4)
🧰 Additional context used🧬 Code graph analysis (3)src/srtctl/core/backend.py (3)
src/srtctl/__init__.py (1)
src/srtctl/cli/submit.py (1)
🪛 Ruff (0.14.8)src/srtctl/core/backend.py23-23: PEP 484 prohibits implicit Convert to (RUF013) 53-53: PEP 484 prohibits implicit Convert to (RUF013) 65-65: Unnecessary key check before dictionary access Replace with (RUF019) 68-68: Avoid specifying long messages outside the exception class (TRY003) 81-81: PEP 484 prohibits implicit Convert to (RUF013) 115-115: Unused method argument: (ARG002) 115-115: PEP 484 prohibits implicit Convert to (RUF013) 115-115: PEP 484 prohibits implicit Convert to (RUF013) 203-203: Avoid specifying long messages outside the exception class (TRY003) src/srtctl/cli/submit.py56-56: PEP 484 prohibits implicit Convert to (RUF013) 🔇 Additional comments (10)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
scripts/worker_setup/worker.py (2)
113-122: Consider closing file handles on failure.If
run_commandreturnsNone(line 119), the file handles opened on lines 114-115 remain open. While minor (only occurs on failure path), consider using a context manager or explicit cleanup.+ frontend_stdout = None + frontend_stderr = None + try: # Open log files for frontend frontend_stdout = open("/logs/frontend.out", "w") frontend_stderr = open("/logs/frontend.err", "w") frontend_cmd = "python3 -m dynamo.frontend --http-port=8000" frontend_process = run_command(frontend_cmd, background=True, stdout=frontend_stdout, stderr=frontend_stderr) if not frontend_process: + raise RuntimeError("Failed to start frontend") - raise RuntimeError("Failed to start frontend") logging.info(f"Frontend started in background (PID: {frontend_process.pid})") logging.info("Frontend logs: /logs/frontend.out and /logs/frontend.err") + except Exception: + if frontend_stdout: + frontend_stdout.close() + if frontend_stderr: + frontend_stderr.close() + raise
124-126: Simplify redundant condition.The check
not gpu_type.startswith("gb300")is redundant—a string cannot start with both"gb200"and"gb300". This simplifies to:- if (gpu_type.startswith("gb200") and not gpu_type.startswith("gb300")) or gpu_type.startswith("h100"): + if gpu_type.startswith("gb200") or gpu_type.startswith("h100"):If the intent is to exclude
"gb300"types entirely, the condition should be restructured (e.g., checkgpu_type.startswith(("gb200", "h100")) and not gpu_type.startswith("gb300")).scripts/worker_setup/command.py (1)
177-179: Consider using keyword arguments for clarity.With 9 parameters, positional arguments are error-prone and hard to read. Keyword arguments improve maintainability.
return build_sglang_command_from_yaml( - worker_type, sglang_config_path, host_ip, port, total_nodes, rank, profiler, dump_config_path, use_sglang_router + worker_type=worker_type, + sglang_config_path=sglang_config_path, + host_ip=host_ip, + port=port, + total_nodes=total_nodes, + rank=rank, + profiler=profiler, + dump_config_path=dump_config_path, + use_sglang_router=use_sglang_router, )
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
scripts/worker_setup/command.py(7 hunks)scripts/worker_setup/worker.py(6 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
scripts/worker_setup/worker.py (1)
scripts/worker_setup/command.py (1)
install_dynamo_wheels(120-144)
🔇 Additional comments (3)
scripts/worker_setup/worker.py (2)
143-185: LGTM!The
use_sglang_routerparameter is correctly integrated. Conditional skipping of etcd wait and dynamo installation aligns with the sglang router mode behavior. Same comment about the redundant condition on line 170 applies here.
188-253: LGTM - same patterns assetup_prefill_worker.The implementation is consistent with the prefill worker. The same optional refactors for file handle cleanup (lines 227-228) and the redundant condition (line 238) apply here as well.
scripts/worker_setup/command.py (1)
62-65: LGTM!The
use_launch_serverlogic correctly unifies profiling mode and sglang router mode under the samesglang.launch_servermodule path, withdynamo.sglangreserved for the traditional/non-profiling mode.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
scripts/benchmarks/sa-bench/bench.sh (1)
29-29: Consider using${chosen_concurrencies[*]}for cleaner array expansion.When expanding an array in a string context,
${array[*]}is preferred over${array[@]}to avoid shellcheck warnings (SC2145).Apply this diff:
-echo "Config ${chosen_isl}; ${chosen_osl}; ${chosen_concurrencies[@]}; ${chosen_req_rate}; sglang_router=${use_sglang_router}" +echo "Config ${chosen_isl}; ${chosen_osl}; ${chosen_concurrencies[*]}; ${chosen_req_rate}; sglang_router=${use_sglang_router}"
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
scripts/benchmarks/sa-bench/bench.sh(1 hunks)scripts/templates/job_script_template_disagg.j2(1 hunks)scripts/utils/benchmark_utils.sh(1 hunks)scripts/utils/check_server_health.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
scripts/benchmarks/sa-bench/bench.sh (1)
scripts/utils/benchmark_utils.sh (1)
wait_for_model(5-57)
🪛 Shellcheck (0.11.0)
scripts/utils/benchmark_utils.sh
[warning] 25-25: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 26-26: Declare and assign separately to avoid masking return values.
(SC2155)
scripts/benchmarks/sa-bench/bench.sh
[error] 29-29: Argument mixes string and array. Use * or separate argument.
(SC2145)
🔇 Additional comments (11)
scripts/templates/job_script_template_disagg.j2 (2)
381-381: LGTM!The template variable correctly renders to a bash-compatible boolean string.
385-385: LGTM!The USE_SGLANG_ROUTER flag is correctly passed to bench.sh as the 9th positional argument. The argument ordering relies on BENCHMARK_ARGS expanding to exactly 4 space-separated values, which should work correctly based on line 380.
scripts/benchmarks/sa-bench/bench.sh (2)
27-27: LGTM!The parameter correctly captures the 9th positional argument with a sensible default value.
35-35: LGTM!The flag is correctly passed as the 8th argument to
wait_for_model, matching the function signature in benchmark_utils.sh.scripts/utils/benchmark_utils.sh (3)
14-17: LGTM!The new parameter is added with a sensible default, maintaining backward compatibility. The workers_addr variable is appropriately defined for the sglang router mode.
19-23: LGTM!The conditional echo provides clear feedback about which health check mode is active, aiding in debugging and monitoring.
29-37: LGTM!The conditional logic correctly routes to the appropriate endpoint and passes the corresponding flag to the health check script based on the mode.
scripts/utils/check_server_health.py (4)
5-27: LGTM!The updated docstring clearly documents both health check modes with usage examples, and the argparse import supports the enhanced CLI interface.
50-80: LGTM!The dynamo health check logic is correctly extracted into its own function, maintaining the original behavior for /health endpoint validation. The component-based counting logic properly handles both disaggregated and aggregated modes.
82-96: LGTM!The dispatcher correctly validates inputs and routes to the appropriate health check function based on the mode flag. The default parameter maintains backward compatibility.
98-113: LGTM!The argparse-based CLI provides a cleaner interface with proper flag support for selecting the health check mode. The argument passing is correct and maintains compatibility with both modes.
| def check_sglang_router_health(expected_n_prefill: int, expected_n_decode: int, response: str) -> str: | ||
| """Check health using sglang router /workers endpoint.""" | ||
| try: | ||
| decoded_response = json.loads(response) | ||
| except json.JSONDecodeError: | ||
| return f"Got invalid response from server that leads to JSON Decode error: {response}" | ||
|
|
||
| expected_n_prefill = int(expected_n_prefill) | ||
| expected_n_decode = int(expected_n_decode) | ||
| if "stats" not in decoded_response: | ||
| return f"Key 'stats' not found in response: {response}" | ||
|
|
||
| stats = decoded_response["stats"] | ||
| actual_prefill = stats.get("prefill_count", 0) | ||
| actual_decode = stats.get("decode_count", 0) | ||
|
|
||
| if actual_prefill >= expected_n_prefill and actual_decode >= expected_n_decode: | ||
| return f"Model is ready. Have {actual_prefill} prefills and {actual_decode} decodes." | ||
| else: | ||
| return f"Model is not ready, waiting for {expected_n_prefill - actual_prefill} prefills and {expected_n_decode - actual_decode} decodes. Have {actual_prefill} prefills and {actual_decode} decodes." | ||
|
|
There was a problem hiding this comment.
Fix potentially confusing negative wait counts.
When one worker type exceeds the expected count while another doesn't, the subtraction on line 47 can produce negative values. For example, if actual_prefill=3, expected_n_prefill=2 but actual_decode=1, expected_n_decode=2, the message would incorrectly report "waiting for -1 prefills and 1 decodes."
Apply this diff to only report positive wait counts:
def check_sglang_router_health(expected_n_prefill: int, expected_n_decode: int, response: str) -> str:
"""Check health using sglang router /workers endpoint."""
try:
decoded_response = json.loads(response)
except json.JSONDecodeError:
return f"Got invalid response from server that leads to JSON Decode error: {response}"
if "stats" not in decoded_response:
return f"Key 'stats' not found in response: {response}"
stats = decoded_response["stats"]
actual_prefill = stats.get("prefill_count", 0)
actual_decode = stats.get("decode_count", 0)
if actual_prefill >= expected_n_prefill and actual_decode >= expected_n_decode:
return f"Model is ready. Have {actual_prefill} prefills and {actual_decode} decodes."
else:
- return f"Model is not ready, waiting for {expected_n_prefill - actual_prefill} prefills and {expected_n_decode - actual_decode} decodes. Have {actual_prefill} prefills and {actual_decode} decodes."
+ prefill_needed = max(0, expected_n_prefill - actual_prefill)
+ decode_needed = max(0, expected_n_decode - actual_decode)
+ return f"Model is not ready, waiting for {prefill_needed} prefills and {decode_needed} decodes. Have {actual_prefill} prefills and {actual_decode} decodes."🤖 Prompt for AI Agents
In scripts/utils/check_server_health.py around lines 30 to 48, the health
message can show negative "waiting for" counts when actual > expected; compute
positive wait counts by clamping each difference to zero (e.g., wait_prefill =
max(0, expected_n_prefill - actual_prefill) and wait_decode = max(0,
expected_n_decode - actual_decode)) and use those wait_prefill and wait_decode
values in the "Model is not ready" return string so it never reports negative
waits.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (5)
src/srtctl/core/schema.py (1)
242-244: Consider adding validation fornum_additional_frontends.The field accepts any integer, including negative values, which would result in invalid configurations. A simple validator ensuring
num_additional_frontends >= 0would prevent misconfiguration.🔎 Suggested validation:
+from pydantic import field_validator + class BackendConfig(BaseModel): # ... existing fields ... num_additional_frontends: int = 9 + + @field_validator("num_additional_frontends") + @classmethod + def validate_num_additional_frontends(cls, v): + if v < 0: + raise ValueError("num_additional_frontends must be non-negative") + return vdocs/README.md (1)
42-48: Command table improves clarity.The table format makes the available commands easy to scan.
Minor note: Line 48 shows
validateas an alias fordry-run, but the description reads "Alias for dry-run" which could be slightly clearer as "Same as dry-run" to avoid confusion about which is the primary command.docs/installation.md (1)
116-121: Add language specifier to fenced code block.The error message code block is missing a language specifier. Consider adding a language hint for consistent formatting.
🔎 Suggested fix:
-``` +```text sbatch: error: Invalid --segment specification</details> </blockquote></details> <details> <summary>scripts/worker_setup/worker.py (1)</summary><blockquote> `113-122`: **File handles for frontend logs are never closed.** The file handles `frontend_stdout` and `frontend_stderr` are opened but never explicitly closed. While these are for a background process and the handles will be closed when the parent process exits, it's better practice to manage them explicitly or use a context manager. Note: The same pattern exists at lines 273-274 in `setup_aggregated_worker`. Consider addressing both locations consistently. </blockquote></details> <details> <summary>scripts/templates/job_script_template_disagg.j2 (1)</summary><blockquote> `398-417`: **Consider safer alternative to `eval()` for parsing router list.** The inline Python script uses `eval(os.environ['ROUTER_LIST'])` to parse the bash-generated list. While the input is controlled (generated by the script itself), `eval()` is generally risky. Consider using `json.loads()` with proper JSON formatting instead. <details> <summary>🔎 Suggested approach:</summary> ```diff -ROUTER_LIST=$(printf "'%s'," "${router_ips[@]}") -ROUTER_LIST="[${ROUTER_LIST%,}]" +ROUTER_LIST=$(printf '"%s",' "${router_ips[@]}") +ROUTER_LIST="[${ROUTER_LIST%,}]" export ROUTER_LIST SCRIPT_DIR LOG_DIR python3 - <<'PY' import os +import json from jinja2 import Template ... -router_hosts = eval(os.environ['ROUTER_LIST']) +router_hosts = json.loads(os.environ['ROUTER_LIST'])The same pattern exists at line 125 for
FRONTEND_LIST.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
README.md(1 hunks)docs/README.md(2 hunks)docs/installation.md(8 hunks)docs/profiling.md(5 hunks)examples/example.yaml(1 hunks)scripts/templates/job_script_template_disagg.j2(3 hunks)scripts/worker_setup.py(5 hunks)scripts/worker_setup/__init__.py(2 hunks)scripts/worker_setup/worker.py(6 hunks)src/srtctl/core/schema.py(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- examples/example.yaml
🧰 Additional context used
🧬 Code graph analysis (3)
scripts/worker_setup/__init__.py (1)
scripts/worker_setup/worker.py (1)
setup_router_worker(188-231)
scripts/worker_setup/worker.py (2)
scripts/worker_setup/command.py (1)
install_dynamo_wheels(120-144)scripts/worker_setup/utils.py (1)
run_command(73-104)
scripts/worker_setup.py (1)
scripts/worker_setup/worker.py (1)
setup_router_worker(188-231)
🪛 LanguageTool
docs/installation.md
[style] ~116-~116: Consider an alternative verb to strengthen your wording.
Context: ... #### Segment-Based Scheduling If you see this error when submitting jobs: ``` s...
(IF_YOU_HAVE_THIS_PROBLEM)
🪛 markdownlint-cli2 (0.18.1)
docs/installation.md
118-118: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Ruff (0.14.8)
scripts/worker_setup/worker.py
192-192: Possible binding to all interfaces
(S104)
scripts/worker_setup.py
195-195: Avoid specifying long messages outside the exception class
(TRY003)
197-197: Avoid specifying long messages outside the exception class
(TRY003)
273-273: Possible binding to all interfaces
(S104)
🔇 Additional comments (12)
scripts/worker_setup/__init__.py (1)
10-10: LGTM!The new
setup_router_workerfunction is correctly imported and exported, following the existing module patterns.Also applies to: 30-30
docs/profiling.md (1)
39-40: Documentation update looks good.The unified profiling parameters simplify the configuration and the documentation clearly explains that the same parameters apply to both prefill and decode workers in disaggregated mode.
scripts/worker_setup.py (3)
93-121: New CLI arguments for sglang-router are well-structured.The arguments follow existing patterns and provide sensible defaults. The help text clearly describes each parameter's purpose.
192-197: Validation logic is correct.The validation properly requires both
--prefill-ipsand--decode-ipsfor the sglang-router worker type. Note: The static analysis warning about long exception messages (TRY003) can be safely ignored here as the descriptive messages aid debugging in CLI contexts.
266-277: Router worker setup implementation looks good.The IP parsing correctly handles whitespace and empty entries. Using
args.worker_idx or 0provides a safe fallback.Regarding the static analysis hint (S104) about binding to
0.0.0.0: This is intentional and appropriate for a router service that needs to accept external connections within the cluster network.docs/README.md (1)
15-15: Documentation updates correctly reflect the new features.The profiling support addition and the updated frontend distribution description (nginx or sglang_router) align well with the PR's implementation.
Also applies to: 25-25
README.md (1)
27-44: Good addition of the Commands section.The examples clearly demonstrate common workflows including job submission, custom setup scripts, tags, dry-run validation, and dashboard launch. This improves the onboarding experience.
docs/installation.md (2)
184-196: Backend options documentation looks good.The new backend options for
enable_multiple_frontends,num_additional_frontends, anduse_sglang_routerare clearly documented with appropriate defaults and explanations.
291-327: Comprehensive srtslurm.yaml reference is helpful.This complete example provides users with a single reference for all available configuration options, improving discoverability.
scripts/worker_setup/worker.py (2)
188-231: Newsetup_router_workerfunction looks well-structured.The function correctly:
- Accepts prefill/decode IP lists for router configuration
- Builds the
sglang_router.launch_routercommand with--pd-disaggregation- Properly formats prefill servers with bootstrap port and decode servers without
- Uses
run_commandfor consistent executionThe
host="0.0.0.0"default binding is appropriate for container environments where the router needs to accept connections from other nodes.
93-104: Correct conditional handling for sglang router mode.The logic properly skips dynamo wheel installation and frontend startup when
use_sglang_router=True, as the sglang router handles load balancing differently. The flag is correctly propagated toget_gpu_command.Also applies to: 109-111, 138-138
scripts/templates/job_script_template_disagg.j2 (1)
463-469: USE_SGLANG_ROUTER correctly propagated to benchmark script.The template properly converts the boolean to a bash-friendly string and passes it as an argument to bench.sh, enabling router-aware benchmarking.
| # Clone and install | ||
| git clone https://github.com/your-org/srtctl.git | ||
| cd srtctl | ||
| pip install -e . |
There was a problem hiding this comment.
Placeholder URL should be updated to actual repository.
Line 9 contains your-org/srtctl.git which appears to be a placeholder. This should reference the actual repository URL for users to clone.
🤖 Prompt for AI Agents
In README.md around lines 8 to 11 the git clone URL uses the placeholder
your-org/srtctl.git; replace that placeholder with the actual repository URL
(e.g., git clone https://github.com/<actual-org>/<actual-repo>.git) so the clone
command points to the real project, and verify the path and casing match the
repository name before committing.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
scripts/templates/job_script_template_disagg.j2 (1)
441-442: Consider reusing thenodes_per_routercalculation.The same calculation from line 383 is repeated here. Since both calculations are within the same conditional block, you could reuse the earlier value to reduce duplication.
💡 Optional refactor
Move the calculation before the first loop (around line 383) and reference the same variable in both loops, eliminating the recalculation at line 441.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
scripts/templates/job_script_template_disagg.j2(3 hunks)scripts/templates/nginx.conf.j2(1 hunks)
🔇 Additional comments (5)
scripts/templates/nginx.conf.j2 (1)
9-9: LGTM! Configurable backend port with sensible default.The template now accepts a
backend_portvariable while defaulting to 8000, enabling flexible nginx configurations for different backend topologies.scripts/templates/job_script_template_disagg.j2 (4)
341-359: LGTM! Clean IP collection and string formatting.The leader IP collection and comma-separated string construction follows standard bash patterns and correctly prepares the arguments for router invocation.
403-436: Port conflict resolved! ✅The previous port conflict issue (nginx and router 0 both binding to 8000) has been properly addressed:
- Routers now use internal port 30080 (line 403:
ROUTER_INTERNAL_PORT)- nginx listens on external port 8000 and proxies to routers on 30080
- Both services coexist on node 0 without conflict
This architecture cleanly separates concerns with nginx as the public-facing load balancer.
460-471: LGTM! Clean single-router fallback.The single-router mode provides a straightforward deployment without nginx overhead when multi-router scaling isn't needed. The router correctly binds directly to port 8000 without conflicts.
485-489: LGTM! Correct flag propagation to benchmark.The
USE_SGLANG_ROUTERflag is properly set and passed to the benchmark script, enabling conditional behavior in downstream benchmark utilities as intended by this PR.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
docs/sglang-router.md (1)
40-48: Add language identifiers to fenced code blocks (markdown linting).Lines 40–48, 65–80, and 99–101 contain fenced code blocks without language specifiers. For proper markdown linting compliance, add a language identifier (e.g.,
textfor ASCII diagrams).🔎 Apply these diffs to add language identifiers:
-┌─────────────────────────────────────────────────────────┐
+```text
+┌─────────────────────────────────────────────────────────┐```diff-┌──────────────────────────────────────────────────────────────────────┐
+```text
+┌──────────────────────────────────────────────────────────────────────┐```diff-nodes_per_router = ceil((total_nodes - 1) / num_additional_frontends)
+```
+nodes_per_router = ceil((total_nodes - 1) / num_additional_frontends)Also applies to: 65-80, 99-101
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
docs/README.md(2 hunks)docs/SUMMARY.md(1 hunks)docs/sglang-router.md(1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/SUMMARY.md
[grammar] ~8-~8: Ensure spelling is correct
Context: ...ring](monitoring.md) - Parameter Sweeps - Analying
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
docs/sglang-router.md
[grammar] ~124-~124: Ensure spelling is correct
Context: ...rt 30000 by default. This is standard sglang behavior and doesn't need configuration...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🪛 markdownlint-cli2 (0.18.1)
docs/sglang-router.md
40-40: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
65-65: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
99-99: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (6)
docs/sglang-router.md (1)
1-209: Excellent, comprehensive documentation for SGLang Router feature.The documentation clearly explains the new SGLang Router mode as an alternative to Dynamo frontends, with well-structured sections covering configuration, architecture modes (single and multiple routers), distribution logic, port configuration, troubleshooting, and comparison with Dynamo. The ASCII diagrams, tables, and complete YAML example effectively communicate the feature's capabilities and deployment patterns. Port numbers (8000, 30080, 30001) are consistently used and documented.
docs/README.md (4)
32-39: Improved clarity in "How It Works" section.The expanded, enumerated breakdown of the workflow (validation → alias resolution → generation → SLURM submission) is clear and easy to follow. Good addition to the documentation flow.
25-26: Clear presentation of routing options.The updated architecture description now explicitly mentions both Dynamo frontends (default) and SGLang Router as routing options, with direct links to documentation. This improves discoverability and helps users understand the feature choice.
43-49: Command table improves CLI documentation.The tabular format for commands is clearer than prose and makes the
--tagsoption and aliases (dry-run,validate) immediately discoverable. Good UX improvement.
54-54: All referenced documentation files exist and are properly linked.The links to
sglang-router.md,profiling.md, andanalyzing.mdare valid—all files exist in the docs directory with correct relative link formatting (.mdextension). No broken links detected.docs/SUMMARY.md (1)
8-8: Fix typo in table of contents entry.Line 8 contains a spelling error: "Analying" should be "Analyzing".
🔎 Apply this diff to fix the typo:
-- [Analying](analyzing.md) +- [Analyzing](analyzing.md)Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (3)
docs/slurm-faq.md (2)
9-13: Add language identifier to fenced code block.The code block at line 11 should specify a language for proper syntax highlighting. Since it contains an error message (plain text), use
```textor keep it without language if it's meant to be literal output. However, the subsequent YAML block should use```yaml.🔎 Proposed fix
If you see this error when submitting jobs: - ``` + ```text sbatch: error: Invalid generic resource (gres) specification - ``` + ```
25-29: Add language identifier to second fenced code block.Similar to the first block, specify
```textfor the error message output.🔎 Proposed fix
If you see this error when submitting jobs: - ``` + ```text sbatch: error: Invalid --segment specification - ``` + ```docs/installation.md (1)
21-21: Minor: Consider rewording for clarity.The phrase "You can use AI to figure out the right set of commands for your cluster" is informal. Consider: "Consult your cluster administrator or cluster documentation if these commands don't work for your system."
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
docs/README.md(1 hunks)docs/SUMMARY.md(1 hunks)docs/analyzing.md(1 hunks)docs/installation.md(7 hunks)docs/profiling.md(5 hunks)docs/slurm-faq.md(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- docs/analyzing.md
🧰 Additional context used
🪛 LanguageTool
docs/slurm-faq.md
[style] ~9-~9: Consider an alternative verb to strengthen your wording.
Context: ... ## GPU Resource Specification If you see this error when submitting jobs: ``` s...
(IF_YOU_HAVE_THIS_PROBLEM)
[style] ~25-~25: Consider an alternative verb to strengthen your wording.
Context: ...t. ## Segment-Based Scheduling If you see this error when submitting jobs: ``` s...
(IF_YOU_HAVE_THIS_PROBLEM)
🪛 markdownlint-cli2 (0.18.1)
docs/slurm-faq.md
11-11: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
27-27: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (12)
docs/SUMMARY.md (1)
1-10: Verify the referenced sglang-router.md file exists.The TOC references
sglang-router.mdat line 5, which is not included in this review. Ensure the file exists in thedocs/directory and that the link is correct.docs/README.md (1)
15-15: LGTM!The additions clearly document the new profiling support and SGLang Router features. The "How It Works" section provides helpful context for users, and the updated command table and Next Steps guide users through the expanded workflow.
Also applies to: 17-45
docs/profiling.md (1)
10-10: LGTM!The unified profiling parameter approach (top-level
isl,osl,concurrencyapplied to both prefill and decode) is clearly documented and consistent. The requirement forbenchmark.type: manualis explicitly stated, and the note at line 43 clarifies parameter usage across disaggregated workers.Also applies to: 16-18, 43-43
docs/installation.md (9)
14-15: LGTM!Repository URL updated to reflect the correct source (
ishandhanani/srt-slurm). The instructions remain clear and functional.
32-32: LGTM!The architecture-aware setup instructions are clear. The addition of
ARCHparameter requirement, updated step descriptions, and the new auto-detect step forsrtctl_rootimprove clarity and robustness for different environments.Also applies to: 35-35, 40-40, 47-47
81-92: LGTM!The new "Cloud Sync (Optional)" section clearly documents the optional S3-compatible storage integration with realistic examples and actionable make commands.
94-129: LGTM!The "Complete srtslurm.yaml Reference" section provides comprehensive documentation of all available configuration options, including the new SLURM directive compatibility flags and cloud settings. The examples are clear and realistic.
143-145: LGTM!The
extra_mountformatting updated to use YAML sequence syntax (dash-prefixed) is consistent with YAML best practices and clearer than the previous format.
182-182: LGTM!Added
req_rate: "inf"benchmark option with clear documentation. This parameter is properly positioned in the example configuration.
185-197: Clarify the relationship betweenenable_multiple_frontendsanduse_sglang_router.The new Backend Options section introduces two load-balancing configurations:
- nginx + multiple frontends (enabled by default):
enable_multiple_frontends: truewith optionalnum_additional_frontends: 9- SGLang router:
use_sglang_router: false(alternative to nginx)However, the documentation doesn't explicitly state whether these are mutually exclusive or how to transition between them. If setting
use_sglang_router: trueshould disable the nginx frontend mode, this constraint should be documented clearly.🔎 Suggested clarification
Consider adding:
backend: # Load balancing modes (mutually exclusive): # Option 1: nginx with multiple frontends (default) enable_multiple_frontends: true # Default: true num_additional_frontends: 9 # Default: 9 # Option 2: SGLang router (alternative to nginx) # Note: Set use_sglang_router: true to disable nginx mode use_sglang_router: false # Default: falseOr verify the actual behavior and update accordingly.
212-220: LGTM!The new "Submit with Tags" section clearly documents the
--tagsfeature with practical examples and explains how tags are stored and used in analysis.
260-260: LGTM!Updated note about setup script execution mentions Dynamo installation and remains accurate.
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/srtctl/backends/sglang.py (2)
121-121: Security: Use safe YAML loader.Using
yaml.FullLoadercan execute arbitrary Python code from YAML files. Switch toyaml.safe_load()to prevent potential code injection.🔎 Proposed fix
- sglang_config = yaml.load(f, Loader=yaml.FullLoader) + sglang_config = yaml.safe_load(f)
346-346: Potential runtime error with None value.If
self.config.get("extra_mount")returnsNone, calling",".join(None)will raise aTypeError.🔎 Proposed fix
- "extra_container_mounts": ",".join(self.config.get("extra_mount") or []), + "extra_container_mounts": ",".join(self.config.get("extra_mount", [])),src/srtctl/cli/submit.py (1)
301-301: Security: Use safe YAML loader.Using
yaml.FullLoadercan execute arbitrary Python code from YAML files. Switch toyaml.safe_load()to prevent potential code injection.🔎 Proposed fix
- sweep_config = yaml.load(f, Loader=yaml.FullLoader) + sweep_config = yaml.safe_load(f)
🧹 Nitpick comments (1)
src/srtctl/backends/sglang.py (1)
204-385: Consider extracting helper methods to reduce complexity.The
generate_slurm_scriptmethod is quite long (181 lines). Consider extracting logical sections into smaller helper methods:
- Node count computation (lines 218-236)
- Template variable assembly (lines 310-347)
- Template path resolution (lines 355-371)
This would improve readability and testability, though the current implementation is functional.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/srtctl/backends/base.py(0 hunks)src/srtctl/backends/sglang.py(1 hunks)src/srtctl/cli/submit.py(2 hunks)
💤 Files with no reviewable changes (1)
- src/srtctl/backends/base.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/srtctl/cli/submit.py (1)
src/srtctl/backends/sglang.py (2)
SGLangBackend(22-385)generate_config_file(41-89)
🪛 Ruff (0.14.8)
src/srtctl/cli/submit.py
72-72: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
src/srtctl/backends/sglang.py
25-25: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
🔇 Additional comments (3)
src/srtctl/backends/sglang.py (1)
329-331: LGTM: sglang router flag integration.The
use_sglang_routerflag is correctly extracted frombackend_configwith a sensible default ofFalseand properly threaded through to the template variables.src/srtctl/cli/submit.py (2)
72-100: LGTM: Cleaner dry-run implementation.The new
run_dry_runfunction consolidates dry-run logic effectively, replacing the previousDryRunContextclass with a simpler, more direct approach. The conditional handling of SGLang-specific artifacts is well-structured.
129-131: LGTM: Conditional backend instantiation.The dry-run path correctly instantiates the backend only when needed (for SGLang) and gracefully handles the case where no backend is required.
| class SGLangBackend: | ||
| """SGLang backend for distributed serving.""" | ||
|
|
||
| def __init__(self, config: dict, setup_script: str = None): |
There was a problem hiding this comment.
Fix implicit Optional type hint.
The setup_script parameter should explicitly use str | None instead of relying on implicit Optional.
🔎 Proposed fix
- def __init__(self, config: dict, setup_script: str = None):
+ def __init__(self, config: dict, setup_script: str | None = None):📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def __init__(self, config: dict, setup_script: str = None): | |
| def __init__(self, config: dict, setup_script: str | None = None): |
🧰 Tools
🪛 Ruff (0.14.8)
25-25: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
🤖 Prompt for AI Agents
In src/srtctl/backends/sglang.py around line 25, the __init__ signature uses an
implicit Optional for setup_script; change the parameter type to an explicit
union by updating the signature to use "setup_script: str | None = None" (or use
typing.Optional[str] if supporting older Python versions), and adjust any
related type hints or imports if necessary so the annotation is explicit.
| print("\nTo see what commands would run:") | ||
| print(f" cat {self.output_dir}/commands.sh") | ||
| print("\n" + "=" * 60 + "\n") | ||
| def run_dry_run(config: dict, backend, sglang_config_path: Path = None) -> Path: |
There was a problem hiding this comment.
Fix implicit Optional type hint.
The sglang_config_path parameter should explicitly use Path | None instead of relying on implicit Optional.
🔎 Proposed fix
-def run_dry_run(config: dict, backend, sglang_config_path: Path = None) -> Path:
+def run_dry_run(config: dict, backend, sglang_config_path: Path | None = None) -> Path:🧰 Tools
🪛 Ruff (0.14.8)
72-72: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
🤖 Prompt for AI Agents
In src/srtctl/cli/submit.py around line 72, the parameter annotation for
sglang_config_path uses an implicit Optional; update the function signature so
sglang_config_path is explicitly typed as Path | None (i.e., use the union type
operator) to make the optional intent explicit and type-checker friendly; ensure
Path is imported from pathlib if not already.
There was a problem hiding this comment.
Actionable comments posted: 5
♻️ Duplicate comments (2)
scripts/utils/check_server_health.py (1)
30-48: Fix potentially confusing negative wait counts.When one worker type exceeds the expected count while another doesn't, Line 47's subtraction can produce negative values in the message. This is the same issue flagged in the previous review.
🔎 Apply the previously suggested fix
if actual_prefill >= expected_n_prefill and actual_decode >= expected_n_decode: return f"Model is ready. Have {actual_prefill} prefills and {actual_decode} decodes." else: - return f"Model is not ready, waiting for {expected_n_prefill - actual_prefill} prefills and {expected_n_decode - actual_decode} decodes. Have {actual_prefill} prefills and {actual_decode} decodes." + prefill_needed = max(0, expected_n_prefill - actual_prefill) + decode_needed = max(0, expected_n_decode - actual_decode) + return f"Model is not ready, waiting for {prefill_needed} prefills and {decode_needed} decodes. Have {actual_prefill} prefills and {actual_decode} decodes."src/srtctl/cli/submit.py (1)
56-56: Fix implicit Optional type hint.This issue was already flagged in a previous review. The
sglang_config_pathparameter should explicitly usePath | Noneinstead of relying on implicit Optional.🔎 Proposed fix
-def run_dry_run(config: dict, backend, sglang_config_path: Path = None) -> Path: +def run_dry_run(config: dict, backend, sglang_config_path: Path | None = None) -> Path:
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
scripts/utils/check_server_health.py(2 hunks)src/srtctl/backends/sglang.py(0 hunks)src/srtctl/cli/submit.py(6 hunks)src/srtctl/core/backend.py(1 hunks)
💤 Files with no reviewable changes (1)
- src/srtctl/backends/sglang.py
🧰 Additional context used
🧬 Code graph analysis (2)
src/srtctl/core/backend.py (3)
src/srtctl/core/config.py (1)
get_srtslurm_setting(104-118)src/srtctl/core/sweep.py (1)
expand_template(17-48)analysis/srtlog/models.py (1)
is_aggregated(74-76)
src/srtctl/cli/submit.py (1)
src/srtctl/core/backend.py (3)
SGLangBackend(20-212)render_command(81-113)generate_config_file(53-79)
🪛 Ruff (0.14.8)
src/srtctl/core/backend.py
23-23: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
53-53: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
65-65: Unnecessary key check before dictionary access
Replace with dict.get
(RUF019)
68-68: Avoid specifying long messages outside the exception class
(TRY003)
81-81: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
98-98: Probable use of unsafe loader FullLoader with yaml.load. Allows instantiation of arbitrary objects. Consider yaml.safe_load.
(S506)
115-115: Unused method argument: config_path
(ARG002)
115-115: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
115-115: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
203-203: Avoid specifying long messages outside the exception class
(TRY003)
src/srtctl/cli/submit.py
56-56: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
🔇 Additional comments (21)
scripts/utils/check_server_health.py (4)
9-27: LGTM! Clear documentation for dual-mode usage.The updated docstring provides helpful usage examples for both Dynamo and SGLang router modes, making it easy for users to understand how to invoke the script in each context.
50-80: LGTM! Clean refactoring of existing Dynamo health check logic.The existing logic has been successfully extracted into a well-named function without altering the health check behavior.
82-98: LGTM! Clean routing logic with proper validation.The function correctly validates numeric inputs and routes to the appropriate health check handler based on the
sglang_routerflag. The.isnumeric()check implicitly rejects negative values since it only accepts positive digit sequences.
100-115: LGTM! Well-structured CLI interface.The argparse implementation provides a clean CLI interface that correctly handles positional arguments for worker counts and an optional flag for SGLang router mode. The stdin-based input approach works well with the documented curl usage pattern.
src/srtctl/cli/submit.py (7)
29-29: LGTM: Import source change aligns with new architecture.The import of
SGLangBackendfromsrtctl.core.backendcorrectly reflects the refactored backend architecture shown in the relevant code snippets.
40-54: LGTM: Command rendering logic is clean and correct.The function properly delegates command generation to the backend and creates an executable script with appropriate content and permissions.
57-84: LGTM: Dry-run implementation is clean and informative.The inline artifact-saving approach is more readable than the previous context manager pattern, and the compact summary output provides clear feedback to users.
110-116: LGTM: Simplified dry-run flow is more maintainable.The conditional backend construction and streamlined call to
run_dry_runimprove code clarity.
178-220: LGTM: Metadata generation uses modern Python idioms effectively.The inline unpacking, walrus operator usage, and clearer variable naming (
bench_metainstead ofprofiler_metadata) improve readability without changing functionality.
303-307: LGTM: Streamlined output formatting.The compact f-string formatting improves readability of the sweep summary.
321-356: LGTM: CLI refactoring improves maintainability.The introduction of
add_common_argsreduces duplication, and the removal of thevalidatesubcommand simplifies the interface. The use ofgetattrwith fallbacks is idiomatic and clean.src/srtctl/core/backend.py (10)
31-38: LGTM: Helper methods provide clean encapsulation.These utility methods appropriately encapsulate common checks and lookups used throughout the class.
40-51: LGTM: Flag conversion logic handles all value types correctly.The method properly handles booleans, lists, and scalars, with sorted output for consistency.
63-68: LGTM: Kebab-case validation prevents CLI flag errors.The validation ensures config keys use the correct format for command-line flags, with a helpful error message showing the correct form.
70-79: LGTM: Config file generation correctly merges environment variables.The logic properly combines mode-specific configurations with their environment variables and writes to a temporary file.
85-94: LGTM: SGLang router support correctly implemented.The launcher selection logic properly handles three cases: nsys profiling, SGLang router (when profiling is disabled or
use_sglang_routeris enabled), and Dynamo fallback. This aligns with the PR objective to support the SGLang router.
101-113: LGTM: Node count calculation correctly handles both modes.The ternary logic properly calculates node count for disaggregated (mode-specific) and aggregated deployments.
120-128: LGTM: Resource extraction correctly handles both deployment modes.The conditional logic properly extracts aggregated or disaggregated resources and calculates total node count.
130-148: LGTM: SLURM and benchmark configuration with sensible defaults.The code properly retrieves SLURM settings with fallbacks to cluster config, and correctly parses benchmark parameters for sa-bench.
163-198: LGTM: Template variables comprehensively prepared, including router flag.The template_vars dictionary is thorough and correctly includes the
use_sglang_routerflag (line 184), enabling the templates to conditionally render router-specific logic.
200-212: LGTM: Template rendering with helpful error message.The template loading, rendering, and temporary file creation are correct. The error message on line 203 is descriptive and helps users troubleshoot configuration issues.
| class SGLangBackend: | ||
| """SGLang backend for distributed serving.""" | ||
|
|
||
| def __init__(self, config: dict, setup_script: str = None): | ||
| self.config = config | ||
| self.backend_config = config.get("backend", {}) | ||
| self.resources = config.get("resources", {}) | ||
| self.model = config.get("model", {}) | ||
| self.slurm = config.get("slurm", {}) | ||
| self.setup_script = setup_script | ||
|
|
There was a problem hiding this comment.
Fix implicit Optional type hint for setup_script.
The setup_script parameter should explicitly use str | None instead of relying on implicit Optional.
🔎 Proposed fix
- def __init__(self, config: dict, setup_script: str = None):
+ def __init__(self, config: dict, setup_script: str | None = None):📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| class SGLangBackend: | |
| """SGLang backend for distributed serving.""" | |
| def __init__(self, config: dict, setup_script: str = None): | |
| self.config = config | |
| self.backend_config = config.get("backend", {}) | |
| self.resources = config.get("resources", {}) | |
| self.model = config.get("model", {}) | |
| self.slurm = config.get("slurm", {}) | |
| self.setup_script = setup_script | |
| class SGLangBackend: | |
| """SGLang backend for distributed serving.""" | |
| def __init__(self, config: dict, setup_script: str | None = None): | |
| self.config = config | |
| self.backend_config = config.get("backend", {}) | |
| self.resources = config.get("resources", {}) | |
| self.model = config.get("model", {}) | |
| self.slurm = config.get("slurm", {}) | |
| self.setup_script = setup_script |
🧰 Tools
🪛 Ruff (0.14.8)
23-23: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
🤖 Prompt for AI Agents
In src/srtctl/core/backend.py around lines 20 to 30, the __init__ parameter
setup_script uses an implicit Optional by defaulting to None; update the type
annotation to explicitly indicate it can be None (use setup_script: str | None =
None or, if targeting older Python versions, setup_script: Optional[str] = None
and add from typing import Optional) so the signature clearly expresses an
optional string.
| lines.append(f" --{flag} {value} \\") | ||
| return lines | ||
|
|
||
| def generate_config_file(self, params: dict = None) -> Path | None: |
There was a problem hiding this comment.
Fix implicit Optional type hint for params.
The params parameter should explicitly use dict | None instead of relying on implicit Optional.
🔎 Proposed fix
- def generate_config_file(self, params: dict = None) -> Path | None:
+ def generate_config_file(self, params: dict | None = None) -> Path | None:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def generate_config_file(self, params: dict = None) -> Path | None: | |
| def generate_config_file(self, params: dict | None = None) -> Path | None: |
🧰 Tools
🪛 Ruff (0.14.8)
53-53: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
🤖 Prompt for AI Agents
In src/srtctl/core/backend.py around line 53, the method signature uses an
implicit Optional for params; change the type hint to explicitly declare params:
dict | None (i.e., replace params: dict = None with params: dict | None = None)
so the parameter type is unambiguous; ensure any callers and docstrings remain
compatible and adjust any import/typing usage if your project targets a Python
version that requires typing.Optional instead of PEP 604 syntax.
| logging.info(f"Generated SGLang config: {temp_path}") | ||
| return Path(temp_path) | ||
|
|
||
| def render_command(self, mode: str, config_path: Path = None) -> str: |
There was a problem hiding this comment.
Fix implicit Optional type hint for config_path.
The config_path parameter should explicitly use Path | None instead of relying on implicit Optional.
🔎 Proposed fix
- def render_command(self, mode: str, config_path: Path = None) -> str:
+ def render_command(self, mode: str, config_path: Path | None = None) -> str:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def render_command(self, mode: str, config_path: Path = None) -> str: | |
| def render_command(self, mode: str, config_path: Path | None = None) -> str: |
🧰 Tools
🪛 Ruff (0.14.8)
81-81: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
🤖 Prompt for AI Agents
In src/srtctl/core/backend.py around line 81 the parameter type for config_path
is implicitly Optional; update the function signature to explicitly use a union
type (config_path: Path | None) instead of relying on implicit Optional, and
ensure typing imports/forward-refs are correct (add "from __future__ import
annotations" at top if needed or import typing.Optional if you prefer
Optional[Path]) so static type checkers accept the explicit optional type; run
tests/type-checking after the change.
| if config_path: | ||
| with open(config_path) as f: | ||
| sglang_config = yaml.load(f, Loader=yaml.FullLoader) | ||
| lines.extend(self._config_to_flags(sglang_config.get(mode, {}))) |
There was a problem hiding this comment.
Security: Replace yaml.FullLoader with yaml.safe_load.
Using yaml.FullLoader allows instantiation of arbitrary Python objects, which is a security risk if config files can come from untrusted sources.
🔎 Proposed fix
if config_path:
with open(config_path) as f:
- sglang_config = yaml.load(f, Loader=yaml.FullLoader)
+ sglang_config = yaml.safe_load(f)
lines.extend(self._config_to_flags(sglang_config.get(mode, {})))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if config_path: | |
| with open(config_path) as f: | |
| sglang_config = yaml.load(f, Loader=yaml.FullLoader) | |
| lines.extend(self._config_to_flags(sglang_config.get(mode, {}))) | |
| if config_path: | |
| with open(config_path) as f: | |
| sglang_config = yaml.safe_load(f) | |
| lines.extend(self._config_to_flags(sglang_config.get(mode, {}))) |
🧰 Tools
🪛 Ruff (0.14.8)
98-98: Probable use of unsafe loader FullLoader with yaml.load. Allows instantiation of arbitrary objects. Consider yaml.safe_load.
(S506)
🤖 Prompt for AI Agents
In src/srtctl/core/backend.py around lines 96 to 99, the code uses
yaml.FullLoader to parse the config which can instantiate arbitrary Python
objects; replace yaml.load(f, Loader=yaml.FullLoader) with yaml.safe_load(f) to
avoid object construction from untrusted input, preserve existing behavior by
handling the possible None return (e.g., treat as {} or use .get safely), and
ensure any downstream call self._config_to_flags receives a dict.
| ) | ||
| return "\n".join(lines) | ||
|
|
||
| def generate_slurm_script(self, config_path: Path = None, timestamp: str = None) -> tuple[Path, str]: |
There was a problem hiding this comment.
Fix implicit Optional type hints for both parameters.
Both config_path and timestamp parameters should explicitly use union syntax instead of relying on implicit Optional.
🔎 Proposed fix
- def generate_slurm_script(self, config_path: Path = None, timestamp: str = None) -> tuple[Path, str]:
+ def generate_slurm_script(self, config_path: Path | None = None, timestamp: str | None = None) -> tuple[Path, str]:🧰 Tools
🪛 Ruff (0.14.8)
115-115: Unused method argument: config_path
(ARG002)
115-115: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
115-115: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
🤖 Prompt for AI Agents
In src/srtctl/core/backend.py around line 115, the method signature uses
implicit Optional by default (config_path: Path = None, timestamp: str = None);
change it to use explicit union syntax for both parameters: config_path: Path |
None = None and timestamp: str | None = None so static type checkers and linters
recognize the optional types; update any related type hints or docstrings nearby
if present to match the new annotations.
|
what does multiple sglang router mean? each router connect to different worker? |
Summary by CodeRabbit
Release Notes
New Features
use_sglang_router,num_additional_frontends, and backend-specific parameters.Documentation
Improvements
✏️ Tip: You can customize this high-level summary in your review settings.