Commit 5fbafbb

fix MLATokenToKVPoolHost get_size_per_token bug (#5161)
Co-authored-by: AniZpZ <[email protected]>
1 parent a949988 commit 5fbafbb

File tree

1 file changed (+6, -1 lines)

python/sglang/srt/mem_cache/memory_pool.py

Lines changed: 6 additions & 1 deletion
```diff
@@ -879,7 +879,12 @@ def get_size_per_token(self):
         self.qk_rope_head_dim = self.device_pool.qk_rope_head_dim
         self.layer_num = self.device_pool.layer_num

-        return (self.kv_lora_rank + self.qk_rope_head_dim) * 1 * self.dtype.itemsize
+        return (
+            (self.kv_lora_rank + self.qk_rope_head_dim)
+            * 1
+            * self.dtype.itemsize
+            * self.layer_num
+        )

     def init_kv_buffer(self):
         return torch.empty(
```
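The bug being fixed: `get_size_per_token` returned the per-token byte size for a single layer only, omitting the `* self.layer_num` factor, so the host-side MLA KV pool was sized far too small. A minimal sketch of the corrected calculation (the standalone function signature and the example dimensions are illustrative, not taken from the commit; `numpy` stands in here only to obtain the dtype's `itemsize`, where the real code uses a torch dtype):

```python
import numpy as np

def get_size_per_token(kv_lora_rank: int, qk_rope_head_dim: int,
                       layer_num: int, itemsize: int) -> int:
    """Bytes of host KV cache needed per token for an MLA model.

    MLA stores one compressed KV vector of (kv_lora_rank + qk_rope_head_dim)
    elements per token *per layer*; the pre-fix code dropped the layer factor.
    """
    return (kv_lora_rank + qk_rope_head_dim) * 1 * itemsize * layer_num

# Illustrative DeepSeek-style MLA dimensions (hypothetical values):
size = get_size_per_token(kv_lora_rank=512, qk_rope_head_dim=64,
                          layer_num=60,
                          itemsize=np.dtype(np.float16).itemsize)
print(size)  # (512 + 64) * 2 * 60 = 69120 bytes per token
```

Without the `layer_num` factor, the same call would report only 1152 bytes per token, underestimating host memory needs by a factor equal to the layer count.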

0 commit comments