
[Fix][MoRI] Add MoRI-IO connector support #138

Merged
BugenZhao merged 10 commits into vllm-project:main from simondanielsson:feature/moriio-support
Apr 29, 2026

@simondanielsson
Contributor

@simondanielsson simondanielsson commented Apr 9, 2026

Purpose

Fixes #126 and vllm-project/vllm#38692.

This PR adds

  • a --kv-connector moriio type for running vllm-router with the MoRI-IO KV connector
  • compatibility with the MoRI connector in vllm 0.18.0+.
    • An earlier vllm PR (part of vllm 0.18.0) introduced a transfer_id in the kv_transfer_params of the MoRI-IO connector and toy proxy. This PR applies the same update to vllm-router, and also adds the remote_dp_size required for performing remote handshakes inside the MoRI connector. Together, these changes let vllm-router support the MoRI-IO KV connector.
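
As a rough illustration of the two fields described above, the sketch below builds a kv_transfer_params payload carrying transfer_id and remote_dp_size. It is not the router's code: the struct, the `make_transfer_id` helper, the `xfer-` prefix, and the hand-rolled JSON rendering are all hypothetical and use only the standard library (the router itself works with serde_json values).

```rust
/// Hypothetical, simplified view of the fields this PR adds to `kv_transfer_params`.
#[derive(Debug, Clone, PartialEq)]
struct KvTransferParams {
    /// Identifier for one KV transfer (assumed here to be derived per request).
    transfer_id: String,
    /// Remote DP size, which the MoRI connector needs for remote handshakes.
    remote_dp_size: u32,
    /// Flag that already appears in the router's existing params.
    do_remote_prefill: bool,
}

impl KvTransferParams {
    /// Render as a JSON object string.
    /// Illustration only: assumes `transfer_id` contains no characters that need escaping.
    fn to_json(&self) -> String {
        format!(
            "{{\"transfer_id\":\"{}\",\"remote_dp_size\":{},\"do_remote_prefill\":{}}}",
            self.transfer_id, self.remote_dp_size, self.do_remote_prefill
        )
    }
}

/// Hypothetical helper: derive a transfer id from a request id.
fn make_transfer_id(request_id: &str) -> String {
    format!("xfer-{}", request_id)
}

fn main() {
    let params = KvTransferParams {
        transfer_id: make_transfer_id("req-1"),
        remote_dp_size: 1,
        do_remote_prefill: false,
    };
    println!("{}", params.to_json());
}
```

The values used in `main` are placeholders; the real transfer_id format and the source of remote_dp_size are defined by the router and connector, not by this sketch.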

Important: This does not interfere with any other connector's functionality.

Basic usage

vllm-router \
    --vllm-pd-disaggregation \
    --kv-connector moriio \
    --vllm-discovery-address "0.0.0.0:${PROXY_PING_PORT}" \
    --port "${ROUTER_PORT}" \
    --host 0.0.0.0

Dependencies

Test Plan

This PR can be used together with this vllm PR to run vllm with the MoRI-IO KV connector and vllm-router.

Reproducer scripts can be found in this temporary branch: mpashkovskii/vllm#4

It includes

  1. Dockerfiles for building the branches from source (including installation of Broadcom NIC drivers, etc.),
  2. Scripts for testing MoRI + vllm-router using the images built in step 1, in both single-node (run_pd_demo.sh) and multi-node (run_pd_demo_2node.sh) setups.

The tests include a basic smoke test (example request), vllm bench serve, and GSM8K using lm_eval (the latter two require the streaming support added in the PR(s) mentioned above).

Test Result

See vllm-project/vllm#39565 for full performance and accuracy benchmarks. Below we show results from a minimal smoke-test reproducer.

2-node setup using DeepSeek-R1 (DSR1)

Basic smoke test (note random params):

$ IS_PREFILL=1 PREFILL_IP=10.21.9.47 DECODE_IP=10.21.9.29 ./examples/online_serving/disaggregated_serving/moriio_pd_demo/run_pd_demo_2node.sh



Containers will be shut down automatically when this script exits.
(Set KEEP_ALIVE=1 to leave them running.)

>>> Smoke test: sending a completion request through vllm-router...
2026-04-10 15:06:06  INFO vllm_router_rs::routers::http::vllm_pd_router: src/routers/http/vllm_pd_router.rs:1436: vLLM route_completion called, use_discovery=true
2026-04-10 15:06:06  INFO vllm_router_rs::routers::http::vllm_pd_router: src/routers/http/vllm_pd_router.rs:1443: Using service discovery mode, processing vLLM two-stage request
2026-04-10 15:06:06  INFO vllm_router_rs::policies::consistent_hash: src/policies/consistent_hash.rs:281: Updated consistent hash ring with 1 workers and 160 virtual nodes
2026-04-10 15:06:06  INFO vllm_router_rs::policies::consistent_hash: src/policies/consistent_hash.rs:593: CONSISTENT_HASH_DEBUG: Extracted hash key: request_hash:f9454da6e4bbbd0a
2026-04-10 15:06:06  INFO vllm_router_rs::policies::consistent_hash: src/policies/consistent_hash.rs:598: CONSISTENT_HASH_DEBUG: Hash key 'request_hash:f9454da6e4bbbd0a' mapped to worker: http://10.21.9.47:8100
2026-04-10 15:06:06  INFO vllm_router_rs::policies::consistent_hash: src/policies/consistent_hash.rs:647: Consistent hash routing: key='request_hash:f9454da6e4bbbd0a' -> worker='http://10.21.9.47:8100' (index=0)
2026-04-10 15:06:06  INFO vllm_router_rs::policies::consistent_hash: src/policies/consistent_hash.rs:281: Updated consistent hash ring with 1 workers and 160 virtual nodes
2026-04-10 15:06:06  INFO vllm_router_rs::policies::consistent_hash: src/policies/consistent_hash.rs:593: CONSISTENT_HASH_DEBUG: Extracted hash key: request_hash:f9454da6e4bbbd0a
2026-04-10 15:06:06  INFO vllm_router_rs::policies::consistent_hash: src/policies/consistent_hash.rs:598: CONSISTENT_HASH_DEBUG: Hash key 'request_hash:f9454da6e4bbbd0a' mapped to worker: http://10.21.9.29:8200
2026-04-10 15:06:06  INFO vllm_router_rs::policies::consistent_hash: src/policies/consistent_hash.rs:647: Consistent hash routing: key='request_hash:f9454da6e4bbbd0a' -> worker='http://10.21.9.29:8200' (index=0)
[aiter] pciChipId: 0x74a1, CU count: 304
[aiter] hipModuleLoad: /usr/local/lib/python3.12/dist-packages/aiter_meta/hsa//gfx942/fmha_v3_fwd/MI300/fwd_hd192x128_bf16_causal_rtna_group.co GetFunction: _ZN5aiter41fmha_fwd_hd192x128_bf16_causal_rtna_groupE
[aiter] hipModuleGetFunction: _ZN5aiter41fmha_fwd_hd192x128_bf16_causal_rtna_groupE Success
[aiter] pciChipId: 0x74a1, CU count: 304
[aiter] hipModuleLoad: /usr/local/lib/python3.12/dist-packages/aiter_meta/hsa//gfx942/fmha_v3_fwd/MI300/fwd_hd192x128_bf16_causal_rtna_group.co GetFunction: _ZN5aiter41fmha_fwd_hd192x128_bf16_causal_rtna_groupE
[aiter] pciChipId: 0x74a1, CU count: 304
[aiter] hipModuleLoad: /usr/local/lib/python3.12/dist-packages/aiter_meta/hsa//gfx942/fmha_v3_fwd/MI300/fwd_hd192x128_bf16_causal_rtna_group.co GetFunction: _ZN5aiter41fmha_fwd_hd192x128_bf16_causal_rtna_groupE
[aiter] pciChipId: 0x74a1, CU count: 304
[aiter] hipModuleLoad: /usr/local/lib/python3.12/dist-packages/aiter_meta/hsa//gfx942/fmha_v3_fwd/MI300/fwd_hd192x128_bf16_causal_rtna_group.co GetFunction: _ZN5aiter41fmha_fwd_hd192x128_bf16_causal_rtna_groupE
[aiter] hipModuleGetFunction: _ZN5aiter41fmha_fwd_hd192x128_bf16_causal_rtna_groupE Success
[aiter] hipModuleGetFunction: _ZN5aiter41fmha_fwd_hd192x128_bf16_causal_rtna_groupE Success
[aiter] hipModuleGetFunction: _ZN5aiter41fmha_fwd_hd192x128_bf16_causal_rtna_groupE Success
[aiter] pciChipId: 0x74a1, CU count: 304
[aiter] hipModuleLoad: /usr/local/lib/python3.12/dist-packages/aiter_meta/hsa//gfx942/fmha_v3_fwd/MI300/fwd_hd192x128_bf16_causal_rtna_group.co GetFunction: _ZN5aiter41fmha_fwd_hd192x128_bf16_causal_rtna_groupE
[aiter] pciChipId: 0x74a1, CU count: 304
[aiter] hipModuleLoad: /usr/local/lib/python3.12/dist-packages/aiter_meta/hsa//gfx942/fmha_v3_fwd/MI300/fwd_hd192x128_bf16_causal_rtna_group.co GetFunction: _ZN5aiter41fmha_fwd_hd192x128_bf16_causal_rtna_groupE
[aiter] hipModuleGetFunction: _ZN5aiter41fmha_fwd_hd192x128_bf16_causal_rtna_groupE Success
[aiter] pciChipId: 0x74a1, CU count: 304
[aiter] hipModuleLoad: /usr/local/lib/python3.12/dist-packages/aiter_meta/hsa//gfx942/fmha_v3_fwd/MI300/fwd_hd192x128_bf16_causal_rtna_group.co GetFunction: _ZN5aiter41fmha_fwd_hd192x128_bf16_causal_rtna_groupE
[aiter] hipModuleGetFunction: _ZN5aiter41fmha_fwd_hd192x128_bf16_causal_rtna_groupE Success
[aiter] hipModuleGetFunction: _ZN5aiter41fmha_fwd_hd192x128_bf16_causal_rtna_groupE Success
[aiter] pciChipId: 0x74a1, CU count: 304
[aiter] hipModuleLoad: /usr/local/lib/python3.12/dist-packages/aiter_meta/hsa//gfx942/fmha_v3_fwd/MI300/fwd_hd192x128_bf16_causal_rtna_group.co GetFunction: _ZN5aiter41fmha_fwd_hd192x128_bf16_causal_rtna_groupE
[aiter] hipModuleGetFunction: _ZN5aiter41fmha_fwd_hd192x128_bf16_causal_rtna_groupE Success
(APIServer pid=7) INFO:     10.21.9.47:58400 - "POST /v1/completions HTTP/1.1" 200 OK
(APIServer pid=7) INFO 04-10 15:07:08 [loggers.py:259] Engine 000: Avg prompt throughput: 0.5 tokens/s, Avg generation throughput: 0.1 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 0.0%, External prefix cache hit rate: 0.0%
(APIServer pid=7) INFO 04-10 15:07:18 [loggers.py:259] Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 0.0%, External prefix cache hit rate: 0.0%
{
    "id": "cmpl-___prefill_addr_host:10.21.9.47,handshake:6301,notify:61005___decode_addr_host:10.21.9.29,handshake:6301,notify:61005_ab377ab0c004415e8a74fbc0e7e1fe62",
    "object": "text_completion",
    "created": 1775833620,
    "model": "deepseek-ai/DeepSeek-R1-0528",
    "choices": [
        {
            "index": 0,
            "text": ".B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42.B\u8bf7\u6c42",
            "logprobs": null,
            "finish_reason": "length",
            "stop_reason": null,
            "token_ids": null,
            "prompt_logprobs": null,
            "prompt_token_ids": null
        }
    ],
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
        "prompt_tokens": 5,
        "total_tokens": 69,
        "completion_tokens": 64,
        "prompt_tokens_details": null
    },
    "kv_transfer_params": null
}

(Set USE_BENCH=1 for a perf benchmark, or USE_GSM8K=1 for accuracy eval.)

>>> Shutting down containers...
moriio-prefill
moriio-router
Done.

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results

…eation for initial MoRI-IO support

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
@simondanielsson simondanielsson force-pushed the feature/moriio-support branch from 9e1d84f to 5545b3f Compare April 9, 2026 12:51
Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
Comment thread src/routers/http/vllm_pd_router.rs Outdated
"do_remote_prefill": false,
"remote_engine_id": serde_json::Value::Null,
"remote_block_ids": serde_json::Value::Null,
"remote_host": serde_json::Value::Null,
Contributor Author

remote_host and remote_port are not read by any prefill instance's connector, so it's safe to remove them

@simondanielsson simondanielsson changed the title [Fix] Add transfer_id to kv_transfer_params construction for MoRIIO connector support [Fix] Add transfer_id to kv_transfer_params for MoRIIO connector support Apr 11, 2026
@simondanielsson simondanielsson changed the title [Fix] Add transfer_id to kv_transfer_params for MoRIIO connector support [Fix][MoRI] Add transfer_id to kv_transfer_params for MoRIIO connector support Apr 11, 2026
@simondanielsson simondanielsson marked this pull request as ready for review April 11, 2026 09:08
…oyments

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
@simondanielsson simondanielsson changed the title [Fix][MoRI] Add transfer_id to kv_transfer_params for MoRIIO connector support [Fix][MoRI] Add transfer_id & remote_dp_size to kv_transfer_params for MoRIIO connector support Apr 14, 2026
Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
@simondanielsson simondanielsson force-pushed the feature/moriio-support branch from 0c1aeee to 1e66f5a Compare April 20, 2026 07:42
Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
@simondanielsson simondanielsson changed the title [Fix][MoRI] Add transfer_id & remote_dp_size to kv_transfer_params for MoRIIO connector support [Fix][MoRI] Add MoRI-IO connector support Apr 20, 2026
@simondanielsson simondanielsson changed the title [Fix][MoRI] Add MoRI-IO connector support [Fix][MoRI] Add MoRI-IO connector support by updating kv_transfer_params Apr 20, 2026
@simondanielsson simondanielsson changed the title [Fix][MoRI] Add MoRI-IO connector support by updating kv_transfer_params [Fix][MoRI] Add MoRI-IO connector support Apr 20, 2026
@tjtanaa tjtanaa requested review from BugenZhao and wpc April 24, 2026 00:28
@simon-mo
Contributor

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Bravo.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Member

@BugenZhao BugenZhao left a comment

Rest LGTM. Thanks!

Comment thread src/routers/http/vllm_pd_router.rs Outdated
Comment thread src/routers/http/vllm_pd_router.rs Outdated
Co-authored-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
@simondanielsson
Contributor Author

@BugenZhao Does this failing CI look like an unrelated issue to you as well? 🙏

@BugenZhao
Member

> @BugenZhao Does this failing CI look like an unrelated issue to you as well? 🙏

Yes, I think it's what #154 aims to resolve.

@BugenZhao BugenZhao enabled auto-merge April 29, 2026 00:40

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 328e9ce50f

Comment thread src/config/types.rs
Comment on lines +124 to +127
/// MoRI-IO KV transfer
#[serde(rename = "moriio")]
#[value(name = "moriio")]
MoriIO,

[P2] Accept moriio in Python kv_connector validation

Adding KvConnector::MoriIO here does not make it usable from the Python API, because Router::to_config still only accepts "nixl" and "mooncake" and returns ValidationFailed for any other value (src/lib.rs, kv_connector match). In environments configuring the router through Python bindings, kv_connector="moriio" now fails at startup even though the enum advertises support.
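
The gap described above can be pictured with a small sketch. It assumes the Python-facing validation is a string match that maps a connector name to the enum; `parse_kv_connector`, the `Nixl` and `Mooncake` variant names, and the `String` error are hypothetical stand-ins for Router::to_config and ValidationFailed, whose actual code is not shown in this thread. Adding a "moriio" arm is the kind of change the comment asks for.

```rust
/// Simplified stand-in for the router's connector enum.
/// Only `MoriIO` is taken from the diff; the other variant names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum KvConnector {
    Nixl,
    Mooncake,
    MoriIO,
}

/// Hypothetical validation helper: map a user-supplied string to a connector.
/// Without the "moriio" arm, that value would fall through to the error branch.
fn parse_kv_connector(name: &str) -> Result<KvConnector, String> {
    match name {
        "nixl" => Ok(KvConnector::Nixl),
        "mooncake" => Ok(KvConnector::Mooncake),
        "moriio" => Ok(KvConnector::MoriIO),
        other => Err(format!("unsupported kv_connector: {}", other)),
    }
}

fn main() {
    for name in ["nixl", "mooncake", "moriio", "unknown"] {
        println!("{} -> {:?}", name, parse_kv_connector(name));
    }
}
```

Keeping the accepted strings identical to the serde and CLI names ("moriio") would avoid a mismatch between the CLI and the Python bindings.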

@BugenZhao BugenZhao merged commit 9f4bc16 into vllm-project:main Apr 29, 2026
6 checks passed

Development

Successfully merging this pull request may close these issues.

[Feature]: MoRIIOConnector support

3 participants