[sglang-miles] Cherry-pick #24851: Add routed_experts_start_len for absolute routing slice control by ByronHsu · Pull Request #24904 · sgl-project/sglang

ByronHsu · 2026-05-10T18:52:39Z

Summary

Cherry-pick of #24851 (merge commit d82e339) onto sglang-miles.

Adds routed_experts_start_len: int = 0 across the full request lifecycle (GenerateReqInput, TokenizedGenerateReqInput, OpenAI protocol, Engine, Req, tokenizer_manager, scheduler, session_controller, encode_receiver, serving_chat/completions)
Scheduler validates start_len is non-negative and <= prompt_tokens
maybe_collect_routed_experts honors start_len, early-returns when return_routed_experts is False, and logs row-count mismatches
get_routed_experts gains start_len parameter with defensive clamping
Tests added in test_return_routed_experts.py

Conflict resolution:

Multiple conflicts due to upstream main evolution (docs_new directory, state_capturer refactoring, indexer_topk module). Resolution:
- Removed docs_new/ files (directory doesn't exist on sglang-miles)
- Removed state_capturer/base.py (sglang-miles uses layers/moe/routed_experts_capturer.py)
- Removed return_indexer_topk / maybe_collect_indexer_topk references (not part of [Session R3] Add routed_experts_start_len for absolute routing slice control #24851, leaked from upstream context)
- Added start_len parameter directly to sglang-miles get_routed_experts method instead of upstream's get_topk
- Content conflicts in io_struct, schedule_batch, scheduler, tokenizer_manager, scheduler_output_processor_mixin resolved by taking only the routed_experts_start_len additions

Test plan

Existing TestReturnRoutedExperts tests should pass (no regression)
New TestRoutedExpertsStartLen tests cover default behavior, row-count correctness, bounds checking, and cache-hit interaction

Made with Cursor

…control (#24851) Co-authored-by: Byron Hsu <byron@periodiclabs.ai> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: zyzshishui <zyzshishui@gmail.com> Co-authored-by: Yuzhen Zhou <82826991+zyzshishui@users.noreply.github.com> (cherry picked from commit d82e339)

gemini-code-assist

Code Review

This pull request introduces the routed_experts_start_len parameter, allowing users to specify an absolute start position for returned MoE routing information. The changes propagate this parameter through the API protocols, engine entrypoints, scheduling logic, and the expert capturer layer. Review feedback identifies a potential NameError due to a missing logger import in the output processor, inconsistent validation logic in the disaggregation receiver, brittle hardcoded constants in the new tests, and a type hint mismatch in a test utility function.

gemini-code-assist · 2026-05-10T18:57:27Z

+            logger.warning(
+                "routed_experts row-count mismatch for req %s: got %d, "
+                "expected %d (seqlen=%d, cached_tokens=%d, start_len=%s). "
+                "This indicates a silent bug.",
+                req.rid,
+                req.routed_experts.shape[0],
+                expected_rows,
+                req.seqlen,
+                req.cached_tokens,
+                req.routed_experts_start_len,
+            )


The logger variable is used here but it does not appear to be imported or defined within this mixin file. This will likely result in a NameError at runtime when a row-count mismatch occurs. Please ensure logger is imported from sglang.srt.utils or defined at the module level.

this is wrong. it is above

gemini-code-assist · 2026-05-10T18:57:27Z

+_QWEN3_30B_A3B_NUM_LAYERS = 48
+_QWEN3_30B_A3B_TOPK = 8


Hardcoding model-specific constants like _QWEN3_30B_A3B_NUM_LAYERS and _QWEN3_30B_A3B_TOPK makes the test brittle if the default test model changes. It would be better to derive these values from the model configuration or the response metadata to ensure the test remains valid across different MoE architectures.

gemini-code-assist · 2026-05-10T18:57:27Z

+        if hasattr(cls, "process") and cls.process:
+            kill_process_tree(cls.process.pid)
+
+    def _send(self, payload: dict) -> dict:


The type hint for _send indicates it returns a dict, but the implementation returns a requests.Response object. This mismatch can be confusing for static analysis and future maintenance.

Suggested change

def _send(self, payload: dict) -> dict:

def _send(self, payload: dict) -> requests.Response:

ByronHsu requested review from BBuf, CatherineSue, Edwardf0t1, Fridge003, HaiShaw, JustinTong0323, ShangmingCai, Ying1123, ch-wan, hnyls2002, ispobock, merrymercy, slin1237 and xiezhq-hermann as code owners May 10, 2026 18:52

gemini-code-assist Bot reviewed May 10, 2026

View reviewed changes

ByronHsu merged commit a929eb7 into sglang-miles May 10, 2026
7 checks passed

ByronHsu deleted the byron/cp-24851-to-sglang-miles branch May 10, 2026 19:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[sglang-miles] Cherry-pick #24851: Add routed_experts_start_len for absolute routing slice control#24904

[sglang-miles] Cherry-pick #24851: Add routed_experts_start_len for absolute routing slice control#24904
ByronHsu merged 1 commit into
sglang-milesfrom
byron/cp-24851-to-sglang-miles

ByronHsu commented May 10, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 10, 2026

Uh oh!

ByronHsu May 10, 2026

Uh oh!

Uh oh!

gemini-code-assist Bot May 10, 2026

Uh oh!

gemini-code-assist Bot May 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	def _send(self, payload: dict) -> dict:
	def _send(self, payload: dict) -> requests.Response:

		_QWEN3_30B_A3B_NUM_LAYERS = 48
		_QWEN3_30B_A3B_TOPK = 8

Conversation

ByronHsu commented May 10, 2026

Summary

Test plan

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 10, 2026

Choose a reason for hiding this comment

Uh oh!

ByronHsu May 10, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

gemini-code-assist Bot May 10, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 10, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant