[Bugfix][Mooncake] Fix per-group block_size/block_hash and group_idx in MooncakeStoreConnector KV events#44103
Open
ivanium wants to merge 2 commits into
This pull request has merge conflicts that must be resolved before it can be |
…iginal_block_size Signed-off-by: Yifan Qiao <yifanqiao@inferact.ai>
Signed-off-by: Yifan Qiao <yifanqiao@inferact.ai>
570ddee to
85de01f
Compare
njhill
approved these changes
Jun 3, 2026
Purpose
`MooncakeStoreConnector` emits `BlockStored` KV events so external subscribers can track which KV blocks are present in the store. In hybrid / multi-group (HMA) configurations, and whenever a group's `block_size` is larger than the connector's `hash_block_size`, three fields of the emitted event were wrong.

1. `block_hashes` used `req_meta.block_hashes[chunk_idx]`, which indexes the request's hash-granular hash list with a chunk index. When `block_size > hash_block_size`, consecutive hashes are merged into one chunk hash via `BlockHashListWithBlockSize`, so this picked the wrong hash and the event advertised a `block_hash` that does not match the key the block is actually stored under. A consumer indexing the store by that hash would miss or mismatch. The fix derives the hash directly from the store key being written (`BlockHash(bytes.fromhex(key.chunk_hash))`), so the event hash and the storage key agree by construction. For the uniform single-group case (`block_size == hash_block_size`) this is identical to the previous behavior, so nothing regresses there.
2. `block_size` used a single global `original_block_size` (`cache_config.block_size`), which is wrong for groups whose block size differs. It now uses the per-group `db.block_size`.
3. `group_idx` was not set, so multi-group consumers could not attribute an event to its KV-cache group. It is now populated from the per-group loop index. `BlockStored.group_idx` already exists, so no event-schema change is needed.

With `original_block_size` no longer read anywhere after (2), the now-dead field and its plumbing (scheduler attribute, `ReqMeta` field/parameter, worker attribute, and test references) are removed. This mechanical cleanup is bundled with the substantive fix rather than shipped on its own.

Test
The worker tests include a case with `enable_kv_event=True` and a `block_size > hash_block_size` ("double hash") group, which exercises the corrected event fields.

Not a duplicate
Searched open PRs on vllm-project/vllm (`mooncake kv event`, `MooncakeStoreConnector`, `mooncake BlockStored`). The nearby open mooncake PRs are unrelated in scope: #43742 (release GPU pin on failed store), #43701 (DummyClient mode), #42199 (DP worker bootstrap timeout). None touch `BlockStored` per-group correctness.

AI assistance
This change was prepared with AI assistance (Claude Code). The author has reviewed every changed line and run the tests listed above.
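
Illustration of the `block_hashes` mismatch described under Purpose. The following is a standalone sketch, not the connector's code: `merge_hashes` is a hypothetical stand-in for how `BlockHashListWithBlockSize` combines consecutive hashes (the real merge function may differ), and the sizes 16 and 32 are assumed values chosen only to get a 2:1 ratio. It shows why indexing the hash-granular list with a chunk index disagrees with the store key, while deriving the event hash from the key cannot.

```python
import hashlib


def merge_hashes(hashes, ratio):
    """Hypothetical merge: fold each run of `ratio` consecutive hashes into one.

    With ratio == 1 the list is returned unchanged, mirroring the uniform
    single-group case (block_size == hash_block_size).
    """
    if ratio == 1:
        return list(hashes)
    merged = []
    for start in range(0, len(hashes) - ratio + 1, ratio):
        run = b"".join(hashes[start:start + ratio])
        merged.append(hashlib.sha256(run).digest())
    return merged


# Assumed sizes for illustration only.
hash_block_size = 16
block_size = 32
ratio = block_size // hash_block_size  # 2 hash blocks per stored chunk

# Hash-granular list, one entry per hash_block_size tokens.
request_hashes = [hashlib.sha256(f"block-{i}".encode()).digest() for i in range(4)]

# Chunk hashes and the hex keys the chunks would be stored under
# (stand-in for key.chunk_hash).
chunk_hashes = merge_hashes(request_hashes, ratio)
store_keys = [h.hex() for h in chunk_hashes]

# Old behavior: index the hash-granular list with the chunk index.
old_event_hashes = [request_hashes[chunk_idx] for chunk_idx in range(len(chunk_hashes))]

# Fixed behavior: derive the event hash from the store key itself.
new_event_hashes = [bytes.fromhex(key) for key in store_keys]

# Chunk indices whose advertised hash does not match the store key.
mismatches = [
    i for i, (old, key) in enumerate(zip(old_event_hashes, store_keys))
    if old.hex() != key
]
```

In this sketch every chunk is mismatched under the old indexing, and none are under the key-derived form. Setting `ratio` to 1 makes both forms coincide, which matches the claim that the uniform case does not regress.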