Commit 2a54634

[KV cache manager] Simplify block allocation in KVCacheManager::addSequence
Signed-off-by: eopXD <[email protected]>
1 parent 442b6e9 commit 2a54634

1 file changed: +1 -5 lines changed

cpp/tensorrt_llm/batch_manager/kvCacheManager.cpp

Lines changed: 1 addition & 5 deletions
@@ -1891,11 +1891,7 @@ void KVCacheManager::addSequence(

     for (auto const [windowSize, metadata] : mBlockManager.getWindowSizesMetadata())
     {
-        auto const maxTokenNum = metadata.maxTokenNum;
-        auto const temporaryAttentionWindow = metadata.temporaryAttentionWindow;
-
-        // Consider the temporaryAttentionWindow when allocating blocks.
-        auto const effectiveInputLength = std::min(inputLength, maxTokenNum + temporaryAttentionWindow);
+        auto const effectiveInputLength = std::min(inputLength, windowSize);
         auto const numContextBlocks = tc::ceilDiv(effectiveInputLength, getTokensPerBlock());
         if (!sequence.isCyclic() && mEnableBlockReuse)
         {
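
The arithmetic on the two lines that follow the change can be sketched in isolation. This is a minimal sketch, not the code in kvCacheManager.cpp: `ceilDiv` below is a local stand-in for `tc::ceilDiv`, `numContextBlocksFor` is a hypothetical helper name, and the `std::int32_t` parameter types are an assumption about the types used in `addSequence`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Local stand-in for tc::ceilDiv (assumption): integer division rounded up.
inline std::int32_t ceilDiv(std::int32_t numerator, std::int32_t denominator)
{
    assert(denominator > 0);
    return (numerator + denominator - 1) / denominator;
}

// Hypothetical helper mirroring the post-change computation:
// cap the input length at the attention window size, then count
// how many blocks of tokensPerBlock tokens are needed to hold it.
inline std::int32_t numContextBlocksFor(
    std::int32_t inputLength, std::int32_t windowSize, std::int32_t tokensPerBlock)
{
    auto const effectiveInputLength = std::min(inputLength, windowSize);
    return ceilDiv(effectiveInputLength, tokensPerBlock);
}
```

With these definitions, an input of 100 tokens, a window size of 64, and 32 tokens per block gives an effective length of 64 and 2 context blocks; with a window size of 4096 the effective length stays at 100 and 4 blocks are needed.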
