test(router): cover round-robin unset dp-rank flow#7991
Conversation
…print PrefillRouter::query_prefill_worker returns Option<u32> for dp_rank. The C FFI wrapper was declaring u32, causing E0308 in clippy. Map None to u32::MAX (NO_DP_RANK sentinel) so the Python side sees _DP_RANK_UNSET.
Signed-off-by: PeaBrane <yanrpei@gmail.com>
WalkthroughModified MockEngine to assign DP ranks via atomic counter-based round-robin when not explicitly provided in requests. Added router test infrastructure including a helper function for verifying disaggregated prefill DP rank distribution and a parameterized E2E test. Updated Python process invocations to use Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/router/test_router_e2e_with_mockers.py (1)
1283-1375: Extract the disagg worker launch matrix into a shared helper.This repeats the same
registration_orderbranching and nestedDisaggMockerProcesssetup astest_router_decisions_disagg, so the two E2E paths can drift independently. A small helper/fixture that yields(prefill_workers, decode_workers)would keep this logic in one place.As per coding guidelines, "Do not copy-paste test infrastructure; reuse and refactor shared setup logic into fixtures or
tests/utils/."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/router/test_router_e2e_with_mockers.py` around lines 1283 - 1375, The test duplicates the disaggregated worker launch matrix (registration_order branching and nested DisaggMockerProcess usage) already present in test_router_decisions; refactor by extracting that setup into a shared helper/fixture (e.g., a fixture in tests/utils or a helper function) that yields (prefill_workers, decode_workers) and accepts params like namespace, prefill_mocker_args, decode_mocker_args, registration_order, request_plane, and enable_bootstrap; update test_router_decisions_disagg_round_robin_prefill_dp_rank and test_router_decisions_disagg to call the new helper/fixture instead of repeating the DisaggMockerProcess nesting so both tests reuse the same launch logic and avoid divergence.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/router/common.py`:
- Around line 1814-1898: The test can race because wait_for_frontend_ready()
only checks /v1/models; add a warm-up loop to exercise the frontend round-robin
path before measuring DP-rank deltas: after you verify worker_ids (using
client.instance_ids()) and before taking baseline_counts via
observer_router.dump_events(), perform a small number of dummy chat POSTs to
chat_url (at least prefill_workers.num_workers requests, using the same minimal
payload used later) and await their responses (with short sleeps) so the
frontend finishes discovering the prefill pool and round-robins to all prefill
workers.
---
Nitpick comments:
In `@tests/router/test_router_e2e_with_mockers.py`:
- Around line 1283-1375: The test duplicates the disaggregated worker launch
matrix (registration_order branching and nested DisaggMockerProcess usage)
already present in test_router_decisions; refactor by extracting that setup into
a shared helper/fixture (e.g., a fixture in tests/utils or a helper function)
that yields (prefill_workers, decode_workers) and accepts params like namespace,
prefill_mocker_args, decode_mocker_args, registration_order, request_plane, and
enable_bootstrap; update
test_router_decisions_disagg_round_robin_prefill_dp_rank and
test_router_decisions_disagg to call the new helper/fixture instead of repeating
the DisaggMockerProcess nesting so both tests reuse the same launch logic and
avoid divergence.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ba95e2c9-e3aa-417c-a244-957d17793e8c
📒 Files selected for processing (4)
lib/llm/src/mocker.rstests/router/common.pytests/router/router_process.pytests/router/test_router_e2e_with_mockers.py
Signed-off-by: PeaBrane <yanrpei@gmail.com>
Signed-off-by: PeaBrane <yanrpei@gmail.com>
Add a disagg e2e test that verifies round-robin prefill DP-rank selection across bootstrap and non-bootstrap paths. Teach the mocker to round-robin unset dp_rank so the test exercises the same behavior the real engine uses.
Summary by CodeRabbit
Release Notes
New Features
Tests
Chores