[Bugfix][Mooncake] Fix per-group block_size/block_hash and group_idx in MooncakeStoreConnector KV events by ivanium · Pull Request #44103 · vllm-project/vllm

ivanium · 2026-05-31T08:58:26Z

Purpose

MooncakeStoreConnector emits BlockStored KV events so external subscribers can track which KV blocks are present in the store. In hybrid / multi-group (HMA) configurations — and whenever a group's block_size is larger than the connector's hash_block_size — three fields of the emitted event were wrong.

block_hashes used req_meta.block_hashes[chunk_idx], which indexes the request's hash-granular hash list with a chunk index. When block_size > hash_block_size, consecutive hashes are merged into one chunk hash via BlockHashListWithBlockSize, so this picked the wrong hash and the event advertised a block_hash that does not match the key the block is actually stored under. A consumer indexing the store by that hash would miss or mismatch. The fix derives the hash directly from the store key being written (BlockHash(bytes.fromhex(key.chunk_hash))), so the event hash and the storage key agree by construction. For the uniform single-group case (block_size == hash_block_size) this is identical to the previous behavior, so nothing regresses there.
block_size used a single global original_block_size (cache_config.block_size), which is wrong for groups whose block size differs. It now uses the per-group db.block_size.
group_idx was not set, so multi-group consumers could not attribute an event to its KV-cache group. It is now populated from the per-group loop index. BlockStored.group_idx already exists, so no event-schema change is needed.

With original_block_size no longer read anywhere after (2), the now-dead field and its plumbing (scheduler attribute, ReqMeta field/parameter, worker attribute, and test references) are removed. This mechanical cleanup is bundled with the substantive fix rather than shipped on its own.

Test

.venv/bin/python -m pytest tests/v1/kv_connector/unit/test_mooncake_store_worker.py tests/v1/kv_connector/unit/test_mooncake_store_scheduler.py -q
# 66 passed

pre-commit run ruff-check --files <changed files>
# Passed

The worker tests include a case with enable_kv_event=True and a block_size > hash_block_size ("double hash") group, which exercises the corrected event fields.

Not a duplicate

Searched open PRs on vllm-project/vllm (mooncake kv event, MooncakeStoreConnector, mooncake BlockStored). The nearby open mooncake PRs are unrelated in scope: #43742 (release GPU pin on failed store), #43701 (DummyClient mode), #42199 (DP worker bootstrap timeout). None touch BlockStored per-group correctness.

AI assistance

This change was prepared with AI assistance (Claude Code). The author has reviewed every changed line and run the tests listed above.

ivanium · 2026-05-31T21:52:54Z

cc @zhewenl @Dao007forever

Dao007forever

LGTM!

mergify · 2026-06-02T05:16:01Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @ivanium.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

…iginal_block_size Signed-off-by: Yifan Qiao <yifanqiao@inferact.ai>

Signed-off-by: Yifan Qiao <yifanqiao@inferact.ai>

mergify Bot added v1 bug Something isn't working kv-connector labels May 31, 2026

ivanium marked this pull request as ready for review May 31, 2026 21:52

ivanium requested review from ApostaC, NickLucche, orozery and xuechendi as code owners May 31, 2026 21:52

Dao007forever approved these changes Jun 1, 2026

View reviewed changes

mergify Bot added the needs-rebase label Jun 2, 2026

ivanium added 2 commits June 2, 2026 20:40

fix (mk store): correct per-group block_size in kv events. removed or…

9025f30

…iginal_block_size Signed-off-by: Yifan Qiao <yifanqiao@inferact.ai>

refactor (mk store): remove useless pcp and dcp size in scheduler

85de01f

Signed-off-by: Yifan Qiao <yifanqiao@inferact.ai>

ivanium force-pushed the fix/mk-store-kv-event branch from 570ddee to 85de01f Compare June 2, 2026 21:11

mergify Bot removed the needs-rebase label Jun 2, 2026

njhill approved these changes Jun 3, 2026

View reviewed changes

njhill added the ready ONLY add when PR is ready to merge/full CI is needed label Jun 3, 2026

njhill enabled auto-merge (squash) June 3, 2026 01:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Bugfix][Mooncake] Fix per-group block_size/block_hash and group_idx in MooncakeStoreConnector KV events#44103

[Bugfix][Mooncake] Fix per-group block_size/block_hash and group_idx in MooncakeStoreConnector KV events#44103
ivanium wants to merge 2 commits into
vllm-project:mainfrom
ivanium:fix/mk-store-kv-event

ivanium commented May 31, 2026

Uh oh!

ivanium commented May 31, 2026

Uh oh!

Dao007forever left a comment

Uh oh!

mergify Bot commented Jun 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

ivanium commented May 31, 2026

Purpose

Test

Not a duplicate

AI assistance

Uh oh!

ivanium commented May 31, 2026

Uh oh!

Dao007forever left a comment

Choose a reason for hiding this comment

Uh oh!

mergify Bot commented Jun 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants