feat: Simplify tokenization prefix store to single-model architecture #182

sagiahrac · 2025-11-24T09:20:51Z

Summary

This PR refactors the tokenization prefix store architecture to support a single model per store instance, eliminating the need for model name management within the store implementations.

Changes Made

Simplified LRUTokenStore & TrieTokenStore - Removed support for multiple base models.
Updated Indexer interface - Removed modelName parameters from the AddTokenization and FindLongestContainedTokens methods. Model identity is now managed at the scheduler/indexer level, not the storage level.
Updated tests to match the new signatures.

Configure the base model name in the indexer config.
Remove model name parameters from tokenization method calls.
The change aligns with the single-model-per-scheduler architecture, in which each scheduler instance handles one specific base model.

Solves #190
Part of #167
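
To make the interface change concrete, here is a minimal sketch of a single-model store behind an interface whose methods take no model name. The method names `AddTokenization` and `FindLongestContainedTokens` come from this PR; the parameter and return types, `mapStore`, and `modelIndexer` are assumptions made for illustration and are not the repository's actual definitions.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Indexer is a simplified, hypothetical form of the prefix-store interface
// after this change: neither method takes a modelName argument.
type Indexer interface {
	AddTokenization(prompt string, tokens []uint32) error
	FindLongestContainedTokens(prompt string) []uint32
}

// mapStore is a toy single-model store keyed by the full prompt text.
type mapStore struct {
	entries map[string][]uint32
}

func newMapStore() *mapStore {
	return &mapStore{entries: make(map[string][]uint32)}
}

func (s *mapStore) AddTokenization(prompt string, tokens []uint32) error {
	if prompt == "" {
		return errors.New("empty prompt")
	}
	s.entries[prompt] = tokens
	return nil
}

// FindLongestContainedTokens returns the tokens stored for the longest
// previously added prompt that is a prefix of the given prompt, or nil.
func (s *mapStore) FindLongestContainedTokens(prompt string) []uint32 {
	best := ""
	for stored := range s.entries {
		if len(stored) > len(best) && strings.HasPrefix(prompt, stored) {
			best = stored
		}
	}
	return s.entries[best]
}

// modelIndexer (hypothetical) binds one store to the base model it serves,
// so model identity lives above the storage layer.
type modelIndexer struct {
	baseModel string
	store     Indexer
}

func main() {
	idx := modelIndexer{baseModel: "example-base-model", store: newMapStore()}
	if err := idx.store.AddTokenization("hello world", []uint32{1, 2}); err != nil {
		panic(err)
	}
	fmt.Println(idx.baseModel, idx.store.FindLongestContainedTokens("hello world, again"))
}
```

In this sketch a caller that serves several base models would hold one `modelIndexer` per model rather than passing a model name into every store call.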

Copilot

Pull request overview

This PR refactors the tokenization prefix store architecture to eliminate multi-model support within individual store instances, simplifying the implementation to a single-model-per-store design. This aligns with a broader architectural shift where model identity is managed at the scheduler/indexer level rather than within the storage layer.

Removed multi-model management from LRUTokenStore and TrieTokenStore implementations
Updated Indexer interface to remove modelName parameters from AddTokenization and FindLongestContainedTokens methods
Updated all tests and consumers to remove model name parameters from tokenization calls

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| pkg/tokenization/prefixstore/trie_store.go | Refactored from `ContainedTokenStore` (multi-model) to `TrieTokenStore` (single-model); removed model management logic |
| pkg/tokenization/prefixstore/lru_store.go | Simplified to single-model architecture by removing per-model cache map and using single LRU cache instance |
| pkg/tokenization/prefixstore/lru_store_test.go | Updated all test cases to remove `modelName` parameter from function calls |
| pkg/tokenization/prefixstore/indexer.go | Updated interface signatures to remove `modelName` parameters |
| pkg/tokenization/pool_test.go | Updated mock implementations and test expectations to match new interface signatures |
| pkg/tokenization/pool.go | Updated indexer method calls to remove `modelName` arguments |
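
As a rough illustration of the single-trie layout summarized for trie_store.go, the sketch below keeps one character trie per store instance and no per-model map. Only the type name `TrieTokenStore` and the two method names are taken from this PR; the node layout, the `ends` parameter, and the constructor are assumptions, and the real implementation may differ substantially.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// node is one character of a stored prompt.
type node struct {
	children map[rune]*node
	token    uint32 // valid only when hasToken is true
	hasToken bool   // a token ends exactly at this character
}

func newNode() *node { return &node{children: make(map[rune]*node)} }

// TrieTokenStore sketches a single-model store: one trie, no model map.
type TrieTokenStore struct {
	mu   sync.RWMutex
	root *node
}

func NewTrieTokenStore() *TrieTokenStore { return &TrieTokenStore{root: newNode()} }

// AddTokenization records tokens along the prompt's path. ends[i] is the
// exclusive rune index at which tokens[i] ends (an assumption of this sketch).
func (s *TrieTokenStore) AddTokenization(prompt string, tokens []uint32, ends []int) error {
	if len(tokens) != len(ends) {
		return errors.New("tokens and ends must have the same length")
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	cur, next := s.root, 0
	for i, r := range []rune(prompt) {
		child, ok := cur.children[r]
		if !ok {
			child = newNode()
			cur.children[r] = child
		}
		cur = child
		if next < len(tokens) && ends[next] == i+1 {
			cur.token, cur.hasToken = tokens[next], true
			next++
		}
	}
	return nil
}

// FindLongestContainedTokens walks the prompt through the trie and collects
// every token that ends on the matched path, stopping at the first miss.
func (s *TrieTokenStore) FindLongestContainedTokens(prompt string) []uint32 {
	s.mu.RLock()
	defer s.mu.RUnlock()
	var out []uint32
	cur := s.root
	for _, r := range prompt {
		child, ok := cur.children[r]
		if !ok {
			break
		}
		cur = child
		if cur.hasToken {
			out = append(out, cur.token)
		}
	}
	return out
}

func main() {
	store := NewTrieTokenStore()
	if err := store.AddTokenization("hello world", []uint32{10, 20}, []int{5, 11}); err != nil {
		panic(err)
	}
	fmt.Println(store.FindLongestContainedTokens("hello there"))
}
```

Because the store holds a single root, supporting another base model in this design means creating another store instance rather than adding a key to a map.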

Signed-off-by: Sage Ahrac <[email protected]>

Co-authored-by: Copilot <[email protected]> Signed-off-by: Sage <[email protected]>

vMaroon · 2025-11-28T15:46:16Z

pkg/tokenization/prefixstore/lru_store.go

 	blockSize int

-	store map[string]*lru.Cache[uint64, Block]
+	cache *lru.Cache[uint64, Block]


I think index might be a better name here.

sagiahrac · 2025-12-01

Considering that the prefix store caches tokenizations, the term `index` feels too generic. It’s also already used in both kvcache/kvblock/index.go and kvcache/indexer.go. I think we should choose a name that avoids this overloaded term and better reflects the actual object the cache represents.
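
For context on the field under discussion, the following sketch shows a store that owns a single LRU cache instead of a map of caches keyed by model name. The field names `blockSize` and `cache` and the `Block` value type mirror the diff above; everything else is an assumption: `lruCache` is a small standard-library stand-in for the `lru.Cache[uint64, Block]` type in the diff, and the FNV-based chained hashing and the `ends` parameter are illustrative choices, not the repository's.

```go
package main

import (
	"container/list"
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// Block holds the tokens fully contained in one fixed-size chunk of a prompt.
type Block struct{ Tokens []uint32 }

type entry struct {
	key uint64
	val Block
}

// lruCache is a small stdlib-only stand-in for a generic LRU cache.
type lruCache struct {
	capacity int
	order    *list.List // front is most recently used
	items    map[uint64]*list.Element
}

func newLRUCache(capacity int) *lruCache {
	return &lruCache{capacity: capacity, order: list.New(), items: make(map[uint64]*list.Element)}
}

func (c *lruCache) Add(k uint64, v Block) {
	if el, ok := c.items[k]; ok {
		el.Value = entry{k, v}
		c.order.MoveToFront(el)
		return
	}
	c.items[k] = c.order.PushFront(entry{k, v})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(entry).key)
	}
}

func (c *lruCache) Get(k uint64) (Block, bool) {
	el, ok := c.items[k]
	if !ok {
		return Block{}, false
	}
	c.order.MoveToFront(el)
	return el.Value.(entry).val, true
}

// LRUTokenStore keeps a single cache; there is no per-model map.
// The sketch assumes blockSize > 0.
type LRUTokenStore struct {
	blockSize int
	cache     *lruCache
}

// chainHash folds the previous block's hash into this chunk's hash, so a
// key identifies the whole prefix up to and including the chunk.
func chainHash(prev uint64, chunk string) uint64 {
	var buf [8]byte
	binary.LittleEndian.PutUint64(buf[:], prev)
	h := fnv.New64a()
	h.Write(buf[:])
	h.Write([]byte(chunk))
	return h.Sum64()
}

// AddTokenization stores tokens per full block. ends[i] is the exclusive
// byte offset at which tokens[i] ends (an assumption of this sketch).
func (s *LRUTokenStore) AddTokenization(prompt string, tokens []uint32, ends []int) {
	var prev uint64
	next := 0
	for end := s.blockSize; end <= len(prompt); end += s.blockSize {
		prev = chainHash(prev, prompt[end-s.blockSize:end])
		var blk Block
		for next < len(tokens) && next < len(ends) && ends[next] <= end {
			blk.Tokens = append(blk.Tokens, tokens[next])
			next++
		}
		s.cache.Add(prev, blk)
	}
}

// FindLongestContainedTokens returns tokens of the longest cached block chain.
func (s *LRUTokenStore) FindLongestContainedTokens(prompt string) (out []uint32) {
	var prev uint64
	for end := s.blockSize; end <= len(prompt); end += s.blockSize {
		prev = chainHash(prev, prompt[end-s.blockSize:end])
		blk, ok := s.cache.Get(prev)
		if !ok {
			break
		}
		out = append(out, blk.Tokens...)
	}
	return out
}

func main() {
	s := &LRUTokenStore{blockSize: 4, cache: newLRUCache(8)}
	s.AddTokenization("abcdefgh", []uint32{1, 2, 3}, []int{3, 6, 8})
	fmt.Println(s.FindLongestContainedTokens("abcdXXXX"))
}
```

Whatever the field ends up being called, the structural point is the same: lookups hash the prompt block by block against one cache, and the model name no longer appears in the key path.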

vMaroon · 2025-12-01T10:29:22Z

/lgtm
/approve

sagiahrac force-pushed the add-base-model-to-config branch from 84935e6 to 3bf99f3 November 24, 2025 10:51

sagiahrac changed the title ~~feat: Base-Model-Aware Indexer for Multi-LoRA KV-Cache Support~~ feat: Simplify tokenization prefix store to single-model architecture Nov 24, 2025

sagiahrac marked this pull request as ready for review November 24, 2025 14:54

sagiahrac requested review from dannyharnik, elevran and vMaroon as code owners November 24, 2025 14:54

Copilot AI review requested due to automatic review settings November 24, 2025 14:54

sagiahrac requested a review from kfirtoledo as a code owner November 24, 2025 14:54

Copilot started reviewing on behalf of sagiahrac November 24, 2025 14:55

Copilot finished reviewing on behalf of sagiahrac November 24, 2025 14:58

Copilot AI reviewed Nov 24, 2025

sagiahrac and others added 6 commits November 24, 2025 17:14

lru store should not depend on model name

1bdcf75

Signed-off-by: Sage Ahrac <[email protected]>

Merge branch 'llm-d:main' into add-base-model-to-config

8cb2afb

Signed-off-by: Sage Ahrac <[email protected]>

remove model name dependency from pool

0985bd3

Signed-off-by: Sage Ahrac <[email protected]>

update pool tests

c2eda58

Signed-off-by: Sage Ahrac <[email protected]>

remove contained token tries, keep one trie

f3595f3

Signed-off-by: Sage Ahrac <[email protected]>

lint

db36361

Signed-off-by: Sage Ahrac <[email protected]>

sagiahrac force-pushed the add-base-model-to-config branch from 3cca416 to db36361 November 24, 2025 15:15

sagiahrac and others added 2 commits November 24, 2025 17:15

Update pkg/tokenization/prefixstore/indexer.go

057a037

Co-authored-by: Copilot <[email protected]> Signed-off-by: Sage <[email protected]>

Update pkg/tokenization/prefixstore/lru_store.go

86d7669

Co-authored-by: Copilot <[email protected]> Signed-off-by: Sage <[email protected]>

vMaroon reviewed Nov 28, 2025

github-actions bot added the lgtm label Dec 1, 2025

github-actions bot approved these changes Dec 1, 2025

github-actions bot merged commit 6dbe3f6 into llm-d:main Dec 1, 2025
2 checks passed
