Commit ae2ab58

[KVCache] Fix the reference counter in sequence fork (#16666)
This PR fixes a sequence reference counter bug in the KV cache: when forking a child sequence from an existing parent sequence, the reference counter of the parent sequence was not increased. This leads to an error when the child sequence is removed: the removal checks the parent's reference counter and unexpectedly finds that it is 0, because it was never incremented. Meanwhile, this PR updates the PagedKVCache tests with some recent changes, including target-aware tile size selection.
1 parent ad1da4e commit ae2ab58

2 files changed, +177 -114 lines changed

src/runtime/relax_vm/paged_kv_cache.cc

Lines changed: 1 addition & 0 deletions
@@ -475,6 +475,7 @@ class PagedAttentionKVCacheObj : public AttentionKVCacheObj {
         << "Attention merge-score function not available. ForkSequence is thereby not supported.";

     int32_t parent_block_idx = parent_it->second.last_block_idx;
+    ++global_block_pool_[parent_block_idx].external_ref_cnt;
     // Create a child block with the parent block pointer.
     int32_t child_block_idx = GetFreeBlock();
     global_block_pool_[child_block_idx].start_pos = parent_it->second.seq_length;
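
To illustrate why the increment matters, here is a minimal, self-contained sketch of fork and remove bookkeeping. It is not the TVM implementation: `ToyBlockPool`, `Block`, `LastBlock`, `GetBlock`, and the removal logic are hypothetical simplifications, and only the field name `external_ref_cnt` and the increment in `ForkSequence` come from the diff above. The sketch assumes that a block's counter tracks how many child blocks point at it, and that removing a sequence releases blocks up the parent chain until it reaches one that is still referenced.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-in for a KV cache block.
struct Block {
  int32_t parent_idx = -1;       // -1 means the block has no parent
  int32_t external_ref_cnt = 0;  // number of child blocks pointing at this block
  bool in_use = false;           // cleared when the block is released
};

// Toy pool (an assumption for illustration, not the PagedKVCache class).
class ToyBlockPool {
 public:
  // Create a root sequence backed by one fresh block.
  void AddSequence(int64_t seq_id) { last_block_[seq_id] = GetFreeBlock(); }

  // Fork a child sequence whose new block points at the parent's last block.
  void ForkSequence(int64_t parent_id, int64_t child_id) {
    int32_t parent_idx = last_block_.at(parent_id);
    ++blocks_[parent_idx].external_ref_cnt;  // the increment this commit adds
    int32_t child_idx = GetFreeBlock();
    blocks_[child_idx].parent_idx = parent_idx;
    last_block_[child_id] = child_idx;
  }

  // Remove a sequence. Returns false if its last block is still referenced.
  bool RemoveSequence(int64_t seq_id) {
    int32_t idx = last_block_.at(seq_id);
    if (blocks_[idx].external_ref_cnt != 0) return false;
    // Release blocks up the chain until one is still referenced by a child.
    while (idx != -1 && blocks_[idx].external_ref_cnt == 0) {
      blocks_[idx].in_use = false;
      idx = blocks_[idx].parent_idx;
    }
    if (idx != -1) {
      // Without the increment in ForkSequence, the loop above would have
      // released the parent's block while the parent sequence is still alive.
      assert(blocks_[idx].external_ref_cnt > 0);
      --blocks_[idx].external_ref_cnt;
    }
    last_block_.erase(seq_id);
    return true;
  }

  int32_t LastBlock(int64_t seq_id) const { return last_block_.at(seq_id); }
  const Block& GetBlock(int32_t idx) const { return blocks_[idx]; }

 private:
  // Simplification: always allocate a new block instead of reusing freed ones.
  int32_t GetFreeBlock() {
    Block b;
    b.in_use = true;
    blocks_.push_back(b);
    return static_cast<int32_t>(blocks_.size()) - 1;
  }

  std::vector<Block> blocks_;
  std::unordered_map<int64_t, int32_t> last_block_;
};
```

In this sketch, forking sequence 1 from sequence 0 raises the counter of the parent's block to 1, so removing the parent first is rejected. Removing the child releases only the child's block and drops the parent's counter back to 0, after which the parent can be removed normally.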
