feat(mocker): add offline disagg replay #7617

Merged
PeaBrane merged 28 commits into main from rupei/disagg-replay-prep
Mar 25, 2026

@PeaBrane (Contributor) commented Mar 25, 2026

Add an offline disaggregated replay path with separate prefill/decode worker pools, staged Python/CLI bindings, and replay docs updates. This also keeps the aggregated replay path intact while extending tests around disagg routing, metrics, and timing behavior.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added router configuration flag to track or ignore prompt-side prefill tokens in load accounting (--router-track-prefill-tokens / --no-router-track-prefill-tokens).
    • Added support for offline disaggregated replay mode with separate prefill and decode engine configuration and worker counts.
  • Documentation

    • Updated router and replay documentation with new CLI options and disaggregated mode usage guidance.

Signed-off-by: PeaBrane <yanrpei@gmail.com>
@PeaBrane PeaBrane requested review from a team as code owners March 25, 2026 01:49
@github-actions github-actions bot added the labels feat, documentation (Improvements or additions to documentation), and router (Relates to routing, KV-aware routing, etc.) Mar 25, 2026
github-actions bot commented Mar 25, 2026

coderabbitai bot commented Mar 25, 2026

Walkthrough

This pull request introduces a new router_track_prefill_tokens boolean configuration flag (default: true) to control whether prompt-side prefill tokens are included in the router's active load accounting. The feature is integrated through configuration, CLI, routing, scheduling, and replay paths. Additionally, new offline disaggregated ("disagg") replay infrastructure is added, supporting separate prefill and decode worker pools with independent engines and routers.
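To make the effect of the flag concrete, here is a minimal Python sketch of load accounting with and without prompt-side tokens. The names `WorkerLoad` and `potential_load`, the block size of 16, and the arithmetic itself are illustrative assumptions; the router's real accounting lives in the Rust files listed below and may differ.

```python
from dataclasses import dataclass


@dataclass
class WorkerLoad:
    """Hypothetical per-worker load snapshot (not a real dynamo type)."""
    active_blocks: int = 0
    active_prefill_tokens: int = 0


def potential_load(
    load: WorkerLoad,
    isl_tokens: int,
    cached_tokens: int,
    block_size: int,
    track_prefill_tokens: bool = True,
) -> tuple[int, int]:
    """Return (blocks, prefill_tokens) the worker would hold if it took the request."""
    # Assumption: KV blocks are charged either way, since the request needs cache space.
    new_blocks = -(-isl_tokens // block_size)  # ceiling division
    # Prompt-side tokens still to be computed are charged only when tracking is on.
    new_prefill = max(isl_tokens - cached_tokens, 0) if track_prefill_tokens else 0
    return load.active_blocks + new_blocks, load.active_prefill_tokens + new_prefill


load = WorkerLoad(active_blocks=10, active_prefill_tokens=256)
tracked = potential_load(load, isl_tokens=100, cached_tokens=32, block_size=16)
untracked = potential_load(load, 100, 32, 16, track_prefill_tokens=False)
print(tracked, untracked)
```

With tracking disabled, the candidate's prefill-token figure stays at the worker's current value, which is the behavior a decode-only pool would want under these assumptions.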

Changes

| Cohort / File(s) | Summary |
|---|---|
| Configuration & CLI: components/src/dynamo/common/configuration/groups/kv_router_args.py, components/src/dynamo/router/__main__.py | Added router_track_prefill_tokens to KV router configuration exports and CLI flag --router-track-prefill-tokens with environment variable DYN_ROUTER_TRACK_PREFILL_TOKENS (default true); integrated into startup logging. |
| Router Documentation: components/src/dynamo/router/README.md, docs/components/router/README.md, docs/components/router/router-guide.md | Documented new CLI flag --no-router-track-prefill-tokens with usage guidance for decode-only routing paths; updated disaggregated serving characteristics to include track_prefill_tokens=false. |
| Replay Documentation: docs/benchmarks/mocker-trace-replay.md, docs/mocker/mocker.md | Added new disaggregated offline replay CLI options (--prefill-engine-args, --decode-engine-args, --num-prefill-workers, --num-decode-workers) with configuration constraints and examples. |
| Routing & Scheduling Core: lib/kv-router/src/scheduling/config.rs, lib/kv-router/src/scheduling/local.rs, lib/kv-router/src/scheduling/types.rs, lib/llm/src/kv_router.rs, lib/llm/src/kv_router/scheduler.rs, lib/llm/src/kv_router/prefill_router.rs | Added router_track_prefill_tokens field to KvRouterConfig and RouterConfigOverride with helper methods; integrated into LocalScheduler and KvScheduler to control load estimation via new track_prefill_tokens parameter; decode router overrides now disable prefill token tracking. |
| Sequence Tracking: lib/kv-router/src/sequences/multi_worker.rs, lib/kv-router/src/sequences/single.rs, lib/kv-router/src/protocols.rs, lib/kv-router/src/scheduling/queue.rs, lib/kv-router/src/scheduling/policy.rs | Added track_prefill_tokens: bool field to SequenceRequest and ActiveSequenceEventData::AddRequest; introduced add_request_with_prefill_tracking(...) and potential_blocks_and_tokens_with_prefill_tracking(...) methods to conditionally account for prefill load. |
| Offline Disaggregated Replay Core: lib/mocker/src/replay/offline/disagg.rs, lib/mocker/src/replay/offline/events.rs, lib/mocker/src/replay/offline/runtime_utils.rs, lib/mocker/src/replay/offline/mod.rs, lib/mocker/src/replay/offline/README.md | Implemented new offline disaggregated replay simulator with separate prefill/decode worker pools, dual KV routers, staged event processing, and request state management; added SimulationWorkerStage enum and WorkerCompletionPayload for multi-stage event handling; added shared timing and admission utilities. |
| Replay Infrastructure & Validation: lib/mocker/src/replay/entrypoints.rs, lib/mocker/src/replay/mod.rs, lib/mocker/src/replay/validate.rs, lib/mocker/src/replay/router/offline.rs, lib/mocker/src/replay/router/online.rs, lib/mocker/src/replay/router/shared.rs, lib/mocker/src/replay/offline/state.rs, lib/mocker/src/replay/offline/multi.rs | Added OfflineDisaggReplayConfig struct and disaggregated entrypoints; introduced validate_offline_disagg_replay_args and validate_offline_disagg_concurrency_args validators; integrated track_prefill_tokens into offline replay router construction; refactored multi-worker runtime to use shared timing/event utilities; added execute_hidden_pass methods to worker state and engine core. |
| Scheduler Engine Implementations: lib/mocker/src/scheduler/mod.rs, lib/mocker/src/scheduler/vllm/core.rs, lib/mocker/src/scheduler/sglang/core.rs | Added execute_hidden_pass(...) method to EngineCore and engine implementations (vLLM, SGLang) for hidden pass execution without trace collection. |
| C & Python Bindings: lib/bindings/c/src/lib.rs, lib/bindings/python/rust/llm/entrypoint.rs | Exposed router_track_prefill_tokens in C and Python KvRouterConfig constructors; C bindings read environment variable DYN_ROUTER_TRACK_PREFILL_TOKENS and include it in decode router overrides. |
| Python Replay API: lib/bindings/python/rust/llm/replay.rs, lib/bindings/python/src/dynamo/replay/api.py, lib/bindings/python/src/dynamo/replay/main.py | Extended run_mocker_trace_replay and run_mocker_synthetic_trace_replay signatures with prefill_engine_args, decode_engine_args, num_prefill_workers, num_decode_workers; introduced load_replay_args_selection to branch on aggregated vs. disaggregated mode; added CLI parsing for new disaggregated engine and worker-count parameters. |
| Benchmark & Test Updates: lib/bench/kv_router/active_sequences_bench.rs, lib/bindings/python/tests/test_replay.py | Set track_prefill_tokens: true in benchmark sequence requests; added disaggregated replay tests covering offline trace/synthetic replay, error conditions, and CLI subprocess smoke test with disaggregated worker configuration. |
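The aggregated versus disaggregated branching described for the Python Replay API can be pictured with the sketch below. It is an assumption-laden illustration: `select_replay_mode` is a hypothetical helper, not the PR's `load_replay_args_selection`, and the specific rejection rules (including the mutual exclusion with `extra_engine_args`) are guesses at reasonable behavior rather than the merged logic.

```python
from typing import Optional


def select_replay_mode(
    extra_engine_args: Optional[str] = None,
    prefill_engine_args: Optional[str] = None,
    decode_engine_args: Optional[str] = None,
    num_prefill_workers: Optional[int] = None,
    num_decode_workers: Optional[int] = None,
) -> str:
    """Return "disagg" or "aggregated"; raise ValueError on partial or mixed input."""
    staged = (prefill_engine_args, decode_engine_args)
    if any(arg is not None for arg in staged):
        if not all(arg is not None for arg in staged):
            raise ValueError("disagg replay needs both prefill and decode engine args")
        # Assumption: staged args and the aggregated engine args are mutually exclusive.
        if extra_engine_args is not None:
            raise ValueError("staged engine args cannot be combined with extra_engine_args")
        return "disagg"
    # Aggregated path: reject stage-specific counts instead of silently ignoring them.
    if num_prefill_workers is not None or num_decode_workers is not None:
        raise ValueError("num_prefill_workers/num_decode_workers require staged engine args")
    return "aggregated"


print(select_replay_mode(extra_engine_args="{}"))
print(select_replay_mode(prefill_engine_args="{}", decode_engine_args="{}"))
```

Failing fast on stray worker counts keeps a caller from believing a disaggregated run happened when the aggregated path was taken.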

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The PR description provided is brief but missing key structure from the template. It lacks an 'Overview' section, detailed 'Details' breakdown, 'Where should the reviewer start' guidance, and 'Related Issues' reference. | Expand the description to include all template sections: Overview (high-level purpose), Details (breakdown of changes), Where should the reviewer start (key files), and Related Issues (GitHub issue number if applicable). |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 44.17% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'feat(mocker): add offline disagg replay' clearly describes the main feature being added: an offline disaggregated replay capability. It follows conventional commit format and is specific and concise. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/mocker/src/replay/router/offline.rs (1)

374-397: ⚠️ Potential issue | 🟠 Major

track_prefill_tokens is still ignored during offline admission.

SequenceRequest now carries the flag, but admit_request() still computes candidate load with slots.potential_blocks_and_tokens(...), which always includes prompt-side tokens. That means offline replay keeps charging prompt-side load during worker selection even when router_track_prefill_tokens=false (notably the decode pool in disagg mode). This path needs the same potential_blocks_and_tokens_with_prefill_tracking(..., request.track_prefill_tokens) change that lib/kv-router/src/scheduling/queue.rs got.

Suggested fix
 fn admit_request(&mut self, request: PendingRequest) -> Result<usize> {
-    let (decode_blocks, prefill_tokens) = self.slots.potential_blocks_and_tokens(
-        request.token_seq.as_deref(),
-        request.isl_tokens,
-        request.overlaps.clone(),
-    );
+    let (decode_blocks, prefill_tokens) = self
+        .slots
+        .potential_blocks_and_tokens_with_prefill_tracking(
+            request.token_seq.as_deref(),
+            request.isl_tokens,
+            request.overlaps.clone(),
+            request.track_prefill_tokens,
+        );
     let scheduling_request = request.scheduling_request(decode_blocks, prefill_tokens);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/mocker/src/replay/router/offline.rs` around lines 374 - 397, The offline
admit_request path currently calls self.slots.potential_blocks_and_tokens(...)
which always counts prompt-side (prefill) tokens; change that call to use the
prefill-aware API
self.slots.potential_blocks_and_tokens_with_prefill_tracking(...,
request.track_prefill_tokens) so worker selection respects the
SequenceRequest.track_prefill_tokens flag. Update the call site in admit_request
(where decode_blocks and prefill_tokens are computed) to pass
request.track_prefill_tokens into
potential_blocks_and_tokens_with_prefill_tracking and keep the rest of the
scheduling flow (scheduling_request, selector.select_worker, SequenceRequest
construction) unchanged.
🧹 Nitpick comments (3)
lib/mocker/src/replay/offline/runtime_utils.rs (1)

79-102: Refutable pattern match on single-variant enum may break if new event kinds are added.

The let binding on lines 88-94 uses a refutable pattern that assumes SimulationEventKind::WorkerCompletion is the only variant. If SimulationEventKind gains additional variants in the future, this will panic at runtime instead of producing a compile error.

Consider using match with an explicit exhaustive arm or adding a #[deny(irrefutable_let_patterns)] check if this assumption should remain stable.

♻️ Optional: Use explicit match for future-proofing
-    let SimulationEventKind::WorkerCompletion {
-        stage,
-        worker_idx,
-        completed_requests,
-        output_signals,
-        kv_events,
-    } = event.kind;
+    let (stage, worker_idx, completed_requests, output_signals, kv_events) = match event.kind {
+        SimulationEventKind::WorkerCompletion {
+            stage,
+            worker_idx,
+            completed_requests,
+            output_signals,
+            kv_events,
+        } => (stage, worker_idx, completed_requests, output_signals, kv_events),
+    };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/mocker/src/replay/offline/runtime_utils.rs` around lines 79 - 102, The
current refutable pattern in pop_ready_worker_completion destructures event.kind
as SimulationEventKind::WorkerCompletion and can panic if new variants are
added; change the code to explicitly match event.kind (from the SimulationEvent
returned by events.peek()/events.pop()) and handle the WorkerCompletion arm by
constructing and returning WorkerCompletionPayload, while adding a wildcard arm
that safely returns None (or logs/handles unexpected variants) to avoid runtime
panics.
lib/kv-router/src/protocols.rs (1)

1030-1056: Add a legacy-payload test, not just a round-trip.

This only proves false survives when the field is present. The compatibility contract from the new serde default is that older AddRequest payloads without track_prefill_tokens still deserialize to true, so please lock that in with a fixture that omits the field.

Example regression test
+    #[test]
+    fn test_active_sequence_add_request_defaults_track_prefill_tokens_for_legacy_payloads() {
+        let legacy = r#"{"request_id":"req-123","worker":{"worker_id":7,"dp_rank":0},"data":{"AddRequest":{"token_sequence":[11,22],"isl":128,"overlap":1,"expected_output_tokens":32}},"router_id":9,"lora_name":null}"#;
+        let deserialized: ActiveSequenceEvent = serde_json::from_str(legacy).unwrap();
+
+        match deserialized.data {
+            ActiveSequenceEventData::AddRequest {
+                track_prefill_tokens,
+                ..
+            } => assert!(track_prefill_tokens),
+            _ => panic!("expected add request event"),
+        }
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/kv-router/src/protocols.rs` around lines 1030 - 1056, The current test
only round-trips an event that includes track_prefill_tokens=false; add a
regression test that deserializes a legacy JSON payload which omits the
track_prefill_tokens field and asserts the resulting ActiveSequenceEvent
(specifically ActiveSequenceEventData::AddRequest) yields track_prefill_tokens
== true. Create a new test (or extend
test_active_sequence_add_request_serialization_preserves_track_prefill_tokens)
that constructs a JSON string representing an AddRequest without the
track_prefill_tokens key, calls serde_json::from_str::<ActiveSequenceEvent>(),
matches on ActiveSequenceEventData::AddRequest and asserts track_prefill_tokens
is true to lock in the backward-compat default behavior.
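The backward-compatibility contract is language-independent, so a Python analogue may help readers who do not work in Rust. This is only a sketch: the `AddRequest` dataclass and its `from_json` helper are hypothetical stand-ins for the serde-derived type, and the payload is trimmed to a few fields.

```python
import json
from dataclasses import dataclass


@dataclass
class AddRequest:
    """Hypothetical stand-in for ActiveSequenceEventData::AddRequest."""
    isl: int
    overlap: int
    track_prefill_tokens: bool = True

    @classmethod
    def from_json(cls, payload: str) -> "AddRequest":
        data = json.loads(payload)
        return cls(
            isl=data["isl"],
            overlap=data["overlap"],
            # A missing key falls back to True, mirroring the serde default described above.
            track_prefill_tokens=data.get("track_prefill_tokens", True),
        )


legacy = AddRequest.from_json('{"isl": 128, "overlap": 1}')
explicit = AddRequest.from_json('{"isl": 128, "overlap": 1, "track_prefill_tokens": false}')
print(legacy.track_prefill_tokens, explicit.track_prefill_tokens)
```

The two cases are distinct: a round-trip only exercises the explicit branch, while the legacy payload exercises the fallback.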
lib/bindings/python/tests/test_replay.py (1)

593-614: Consider adding a @pytest.mark.timeout decorator.

This test exercises the full synthetic disagg replay pipeline, which involves worker execution. While the speedup ratio is high (1000.0), adding a timeout guard would be consistent with the subprocess tests and prevent CI hangs if something goes wrong.

🛡️ Suggested timeout decorator
+@pytest.mark.timeout(30)
 def test_run_synthetic_trace_replay_disagg_preserves_expected_output_tokens():
     report = run_synthetic_trace_replay(

As per coding guidelines: "add @pytest.mark.timeout() for any test that may exceed 30s or uses polling/sleeps/subprocess waits"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/bindings/python/tests/test_replay.py` around lines 593 - 614, Add a
pytest timeout decorator to the
test_run_synthetic_trace_replay_disagg_preserves_expected_output_tokens function
to prevent CI hangs; annotate the function with `@pytest.mark.timeout(30)` (or
another appropriate seconds value) placed immediately above its def, and ensure
pytest is imported in the test file if not already so the decorator resolves.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/benchmarks/mocker-trace-replay.md`:
- Around line 171-178: The docs currently present staged disagg args (flags
--prefill-engine-args and --decode-engine-args and staged JSON fields) as
conveniences; update the text to state them as required for offline disagg
replay and document the validator constraints: each --prefill-engine-args JSON
must include worker_type="prefill" and each --decode-engine-args JSON must
include worker_type="decode", both staged configs must set the same block_size,
and pool sizes are controlled via --num-prefill-workers and
--num-decode-workers; mention these requirements where the staged args are
described (references: flags --prefill-engine-args, --decode-engine-args, fields
worker_type and block_size, and flags --num-prefill-workers /
--num-decode-workers) so users won’t hit validation errors.

In `@docs/mocker/mocker.md`:
- Around line 131-134: The AIC guidance and any references to dynamo.replay must
be updated to mention split engine args for disaggregated offline replay: when
using `--replay-mode offline` with disagg you should use `--prefill-engine-args`
and `--decode-engine-args` (plus `--num-prefill-workers` and
`--num-decode-workers`) instead of `--extra-engine-args`; update the AIC section
text to conditionally show the aggregated example using `--extra-engine-args`
and a separate disagg example showing `--prefill-engine-args`,
`--decode-engine-args`, and the worker flags, and change any instructions that
currently tell `dynamo.replay` users to only use `--extra-engine-args` to
include the disagg path and flags.

In `@lib/bindings/c/src/lib.rs`:
- Around line 489-492: The bookkeeping path is not receiving the disagg decode
RouterConfigOverride, so decode_router.add_request is called with None and the
assume_kv_reuse/track_prefill_tokens overrides never reach
KvRouter::add_request; fix by creating or reusing the same RouterConfigOverride
(e.g., RouterConfigOverride { overlap_score_weight: Some(0.0), assume_kv_reuse:
Some(false), track_prefill_tokens: Some(false) }) used for query-time and pass
it into the bookkeeping call instead of None (i.e., replace the None argument to
decode_router.add_request(...) with Some(router_config_override) or propagate
the existing variable) so KvRouter::add_request sees the override.

In `@lib/bindings/python/rust/llm/replay.rs`:
- Around line 642-644: The code accepts num_prefill_workers and
num_decode_workers but ignores them when the replay path stays aggregated; add a
validation in the function that constructs/dispatches the replay (the function
handling num_workers, num_prefill_workers, num_decode_workers in replay.rs) to
reject (return Err or panic with a clear message) any non-default
num_prefill_workers or num_decode_workers when prefill_engine_args and
decode_engine_args are not provided (i.e., when choosing the aggregated arm); do
the same guard where the parameters are later processed (the block around the
alternative arm handling at the section covering lines ~656-673) to ensure
callers are informed rather than silently ignored.

In `@lib/bindings/python/src/dynamo/replay/main.py`:
- Around line 21-48: In _load_engine_args, json.loads(raw_args) can produce
non-dict values (e.g., list, null, scalar) so add an explicit validation right
after raw = json.loads(raw_args) that checks isinstance(raw, dict) and raises a
ValueError (e.g., "engine-args must be a JSON object") if not; keep all
subsequent logic that reads keys from raw (worker_type handling,
planner_profile_data resolution via resolve_planner_profile_data, and final
return via MockEngineArgs.from_json) unchanged.
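The engine-args checks requested in the inline comments above (JSON object only, matching worker_type per stage, equal block_size across stages) can be sketched as follows. The helper names `load_staged_engine_args` and `check_block_sizes` are hypothetical, the error messages are illustrative, and the sketch returns a plain dict rather than calling `MockEngineArgs.from_json`.

```python
import json


def load_staged_engine_args(raw_args: str, expected_worker_type: str) -> dict:
    """Parse one staged engine-args JSON string and validate its shape."""
    raw = json.loads(raw_args)
    # json.loads also accepts lists, scalars, and null, so check the type explicitly.
    if not isinstance(raw, dict):
        raise ValueError("engine-args must be a JSON object")
    if raw.get("worker_type") != expected_worker_type:
        raise ValueError(f"engine-args must set worker_type={expected_worker_type!r}")
    return raw


def check_block_sizes(prefill: dict, decode: dict) -> None:
    """Both stages must agree on block_size."""
    if prefill.get("block_size") != decode.get("block_size"):
        raise ValueError("prefill and decode engine args must use the same block_size")


prefill = load_staged_engine_args('{"worker_type": "prefill", "block_size": 64}', "prefill")
decode = load_staged_engine_args('{"worker_type": "decode", "block_size": 64}', "decode")
check_block_sizes(prefill, decode)
print(prefill["worker_type"], decode["worker_type"])
```

Checking the type before reading any keys turns an obscure attribute or type error into a message that names the offending argument.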

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6378df3c-ba91-45aa-a8c8-1143db98f0a9

📥 Commits

Reviewing files that changed from the base of the PR and between 2fe37a5 and b4d281c.

📒 Files selected for processing (42)
  • components/src/dynamo/common/configuration/groups/kv_router_args.py
  • components/src/dynamo/router/README.md
  • components/src/dynamo/router/__main__.py
  • docs/benchmarks/mocker-trace-replay.md
  • docs/components/router/README.md
  • docs/components/router/router-guide.md
  • docs/mocker/mocker.md
  • lib/bench/kv_router/active_sequences_bench.rs
  • lib/bindings/c/src/lib.rs
  • lib/bindings/python/rust/llm/entrypoint.rs
  • lib/bindings/python/rust/llm/replay.rs
  • lib/bindings/python/src/dynamo/replay/api.py
  • lib/bindings/python/src/dynamo/replay/main.py
  • lib/bindings/python/tests/test_replay.py
  • lib/kv-router/src/protocols.rs
  • lib/kv-router/src/scheduling/config.rs
  • lib/kv-router/src/scheduling/local.rs
  • lib/kv-router/src/scheduling/policy.rs
  • lib/kv-router/src/scheduling/queue.rs
  • lib/kv-router/src/scheduling/types.rs
  • lib/kv-router/src/sequences/multi_worker.rs
  • lib/kv-router/src/sequences/single.rs
  • lib/llm/src/kv_router.rs
  • lib/llm/src/kv_router/prefill_router.rs
  • lib/llm/src/kv_router/scheduler.rs
  • lib/llm/src/kv_router/sequence.rs
  • lib/mocker/src/replay/entrypoints.rs
  • lib/mocker/src/replay/mod.rs
  • lib/mocker/src/replay/offline/README.md
  • lib/mocker/src/replay/offline/disagg.rs
  • lib/mocker/src/replay/offline/events.rs
  • lib/mocker/src/replay/offline/mod.rs
  • lib/mocker/src/replay/offline/multi.rs
  • lib/mocker/src/replay/offline/runtime_utils.rs
  • lib/mocker/src/replay/offline/state.rs
  • lib/mocker/src/replay/router/offline.rs
  • lib/mocker/src/replay/router/online.rs
  • lib/mocker/src/replay/router/shared.rs
  • lib/mocker/src/replay/validate.rs
  • lib/mocker/src/scheduler/mod.rs
  • lib/mocker/src/scheduler/sglang/core.rs
  • lib/mocker/src/scheduler/vllm/core.rs

Signed-off-by: PeaBrane <yanrpei@gmail.com>
@PeaBrane PeaBrane enabled auto-merge (squash) March 25, 2026 17:07
@PeaBrane PeaBrane disabled auto-merge March 25, 2026 17:08
Signed-off-by: PeaBrane <yanrpei@gmail.com>
@PeaBrane PeaBrane enabled auto-merge (squash) March 25, 2026 18:02
@PeaBrane PeaBrane merged commit 02b1c58 into main Mar 25, 2026
89 checks passed
@PeaBrane PeaBrane deleted the rupei/disagg-replay-prep branch March 25, 2026 18:53
Labels

documentation (Improvements or additions to documentation), feat, router (Relates to routing, KV-aware routing, etc.), size/XXL
2 participants