Summary
The KV-Cache manager currently creates and maintains multiple prefix stores (prompt-to-tokens caches) even though it serves only a single base model. The extra stores are redundant and add cache management overhead.
Action Required
Consolidate prefix caching into a single global store per base model instance, and remove the redundant stores.
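
One possible shape for the consolidation is a small registry that hands out exactly one store per base model, so every caller shares the same cache. This is a minimal sketch, not the manager's actual code: `PrefixStore`, `PrefixStoreRegistry`, `store_for`, and `store_count` are hypothetical names, and keying the store by a base model name string is an assumption.

```python
import threading
from typing import Dict, List, Optional


class PrefixStore:
    """Hypothetical prompt-to-tokens cache for one base model."""

    def __init__(self) -> None:
        self._tokens: Dict[str, List[int]] = {}

    def put(self, prompt: str, tokens: List[int]) -> None:
        # Copy the list so later caller-side mutation does not alter the cache.
        self._tokens[prompt] = list(tokens)

    def get(self, prompt: str) -> Optional[List[int]]:
        # Returns None on a cache miss.
        return self._tokens.get(prompt)


class PrefixStoreRegistry:
    """Hands out exactly one PrefixStore per base model name (assumed key)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._stores: Dict[str, PrefixStore] = {}

    def store_for(self, base_model: str) -> PrefixStore:
        # Create the store on first request; return the same object afterwards.
        with self._lock:
            store = self._stores.get(base_model)
            if store is None:
                store = PrefixStore()
                self._stores[base_model] = store
            return store

    def store_count(self) -> int:
        with self._lock:
            return len(self._stores)
```

With this shape, two components that request the store for the same base model receive the same object, so an entry written by one is visible to the other and the store count stays at one per base model.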