Skip to content

[Bugfix][KV-transfer] MoRIIO READ-mode completion notification ID#43066

Closed
chaeminlim-mb wants to merge 1 commit into
vllm-project:mainfrom
chaeminlim-mb:chaemin/pr-moriio-read-mode-notify
Closed

[Bugfix][KV-transfer] MoRIIO READ-mode completion notification ID#43066
chaeminlim-mb wants to merge 1 commit into
vllm-project:mainfrom
chaeminlim-mb:chaemin/pr-moriio-read-mode-notify

Conversation

@chaeminlim-mb
Copy link
Copy Markdown

Purpose

Fix the MoRI-IO READ-mode completion-notification protocol between the
consumer (decode) and producer (prefill) connector workers. Today, the
ID the consumer sends back to acknowledge a finished KV read does not
match what the producer expects to look up, and KVConnectorWorker
fails a scheduler-side assertion when the notification arrives.

Specifically, the consumer was sending its internal connector
transfer_id, while the producer's bookkeeping was keyed on the
scheduler's request_id. The mismatch surfaced as:

AssertionError  at vllm/v1/core/sched/scheduler.py:2057

…inside the prefill engine's request-finished path.

Change

vllm/distributed/kv_transfer/kv_connector/v1/moriio/:

  • Consumer (moriio_connector.py, moriio_engine.py): when reading
    is complete, send the scheduler request_id as the completion
    notification ID (matches the toy-proxy convention vLLM already uses
    elsewhere — this is the ID embedded in the router's request_id and
    surfaced through kv_transfer_params).
  • Producer (moriio_connector.py): translate the incoming
    request_id to its local transfer_id before calling
    kv_transfer.notify_kv_block, resolving the
    scheduler.py:2057 assertion.

The previously-iterated paths (a brief detour through
"send transfer_id (not req_id)" and its revert) are dropped — only
the final correct protocol remains.

Backward compatibility

The toy-proxy reference server already passes request_id through to
both sides via kv_transfer_params, so no proxy change is required.
Existing producer code that called notify_kv_block with a local
transfer_id still works (the new translation step is keyed on the
incoming consumer ID, which is now the scheduler request_id).

Not a duplicate of

Searched on 2026-05-19 for MoRIIO read mode, notify_kv_block,
scheduler.py:2057, kv_transfer notify request_id — nothing else
targets this assertion or the consumer/producer ID mismatch.

Test Plan

Lint

pre-commit run --files \
  vllm/distributed/kv_transfer/kv_connector/v1/moriio/moriio_connector.py \
  vllm/distributed/kv_transfer/kv_connector/v1/moriio/moriio_engine.py

Unit

.venv/bin/python -m pytest tests/v1/kv_connector/unit/test_moriio_connector.py -v

End-to-end (MoRI-IO 1P1D, READ mode)

# Proxy
python3 examples/disaggregated/disaggregated_serving/moriio_toy_proxy_server.py \
  --port 10001 &

# Prefill (kv_role=kv_producer)
python3 -m vllm.entrypoints.openai.api_server \
  --model meta-llama/Llama-3.1-8B-Instruct --served-model-name llama \
  --host 0.0.0.0 --port 8100 \
  --kv-transfer-config '{"kv_connector":"MoRIIOConnector","kv_role":"kv_producer","kv_connector_extra_config":{"proxy_ip":"127.0.0.1","proxy_port":36367,"proxy_ping_port":36367,"http_port":8100,"handshake_port":6301,"notify_port":61005}}' &

# Decode (kv_role=kv_consumer) — analogous, port 8200, separate proxy reg

# Drive enough requests to force more than one READ-mode KV transfer
.venv/bin/python -m vllm.entrypoints.openai.benchmark_serving \
  --backend openai-chat --base-url http://127.0.0.1:10001 \
  --endpoint /v1/chat/completions --model llama \
  --dataset-name random --random-input-len 1024 --random-output-len 128 \
  --num-prompts 200 --max-concurrency 8

Without this PR, the prefill engine asserts at
scheduler.py:2057 after the first read completes. With the PR, the
bench finishes 200/200.

Test Result

(To fill in before marking ready for review.)

  • Lint: pre-commit → Passed locally (ruff / format / mypy clean).
  • Unit: pytest tests/v1/kv_connector/unit/test_moriio_connector.py → …
  • Repro: 200-prompt READ-mode bench finishes 200/200 (vs. assertion
    failure after a single transfer pre-fix). Engine log to attach.

AI assistance disclosure

This change was drafted with AI assistance (Claude Code). The diff is
~95 lines across 2 files. I (chaemin.lim@mangoboost.io) have reviewed
and defend each changed line; the consumer/producer ID convention
chosen matches the toy-proxy contract already documented in
examples/disaggregated/disaggregated_serving/moriio_toy_proxy_server.py,
not a new ad-hoc protocol.

@github-actions
Copy link
Copy Markdown

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

Agent Guidelines

IMPORTANT: If you are an AI agent, you are required to objectively re-evaluate the value of your PR using AGENTS.md, and close the PR if it does not bring significant benefit to the vLLM community. Failure to do so may result in an immediate ban.

🚀

@mergify mergify Bot added bug Something isn't working kv-connector labels May 19, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the MoRIIO connector to handle request ID translation between producers and consumers using a transfer_id, ensuring compatibility when internal request IDs differ. It also modifies the notification listener to accept any UTF-8 payload as a completion message. Feedback identifies two critical issues: a memory leak in the consumer-side mapping during READ mode because entries are never popped, and a potential crash of the notification listener thread if it receives non-UTF-8 data, as the error handling currently falls through to raise a fatal exception.

Comment on lines 1332 to 1336
self.moriio_wrapper.send_notify(
req_id,
transfer_id,
self._recving_transfers_callback_addr[req_id][0],
self._recving_transfers_callback_addr[req_id][1],
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

In READ mode, the consumer-side mapping in transfer_id_to_request_id is never removed because _pop_done_transfers returns an empty set, which causes the translation and cleanup block in get_finished (lines 1288-1293) to be skipped. This results in a memory leak as the mapping grows indefinitely with every request. The entry should be popped here once the notification is successfully sent.

Suggested change
self.moriio_wrapper.send_notify(
req_id,
transfer_id,
self._recving_transfers_callback_addr[req_id][0],
self._recving_transfers_callback_addr[req_id][1],
)
self.moriio_wrapper.send_notify(
transfer_id,
self._recving_transfers_callback_addr[req_id][0],
self._recving_transfers_callback_addr[req_id][1],
)
self.transfer_id_to_request_id.pop(transfer_id, None)

Comment on lines +546 to +547
except UnicodeDecodeError:
logger.warning("Received non-UTF8 message: %s", msg_str)
logger.warning("Received non-UTF8 completion message of %d bytes", len(msg))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

If a message cannot be decoded as UTF-8, a warning is logged but handled remains False. This causes the code to fall through and raise a MoRIIOError, which is caught in the listener loop and re-raised as a HandshakeError, ultimately terminating the notification listener thread. The function should return early after logging the warning to prevent the listener from crashing on malformed input.

Suggested change
except UnicodeDecodeError:
logger.warning("Received non-UTF8 message: %s", msg_str)
logger.warning("Received non-UTF8 completion message of %d bytes", len(msg))
except UnicodeDecodeError:
logger.warning("Received non-UTF8 completion message of %d bytes", len(msg))
return

Consolidates the iteration on the proxy/consumer completion notification
protocol for MoRIIO in READ mode. Final landing:

- Consumer accepts request_id (toy-proxy convention) as the completion
  notification ID — the scheduler's request_id, not the connector's
  internal transfer_id.
- Producer-side translates the incoming request_id to its local
  transfer_id when calling kv_transfer.notify_kv_block, resolving the
  scheduler-side AssertionError at v1/core/sched/scheduler.py:2057.

This drops the abandoned 'send transfer_id (not req_id)' path and its
subsequent revert; only the final correct protocol remains.

Signed-off-by: Chaemin Lim <chaemin.lim@mangoboost.io>
@chaeminlim-mb chaeminlim-mb force-pushed the chaemin/pr-moriio-read-mode-notify branch from df36b5a to 3a467f0 Compare May 19, 2026 07:52
@simondanielsson
Copy link
Copy Markdown
Contributor

Think this might be a duplicate of #40344

@chaeminlim-mb
Copy link
Copy Markdown
Author

Thanks @simondanielsson — you're right. #40344 covers the same scheduler.py:2057 assertion and addresses it as part of a broader high-concurrency MoRIIO hang fix that also handles Failed() transfers, busy-spin deadlines, the WRITE-mode race, and KV-block reaping. Closing this in favor of #40344. We'll cherry-pick that branch into our downstream image and report back on our 1P1D TP=16 workload.

@simondanielsson
Copy link
Copy Markdown
Contributor

@chaeminlim-mb Sounds good - would you mind writing a comment on that PR to highlight that your team is also interested in having that PR merged? That would hopefully help getting it does more quickly 👍 Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working kv-connector

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants