@sagiahrac
Contributor

Context

This PR updates the LLMD block hashing method to maintain compatibility with a recent change in the upstream vLLM project.

The vLLM block hashing method was updated to use the LoRA adapter name (string) instead of the previous integer ID (vllm-project/vllm#27211). LLMD must compute the same block hashes as vLLM for its KV-cache lookups to match, so its hashing has to follow the same scheme.

Key Changes

  1. LoRA Hashing Support: The LLMD block hashing logic is updated to correctly incorporate the LoRA adapter name in the request hash calculation. Previously, LLMD did not support LoRA request hashing.
  2. Test Coverage: New tests have been added to verify the behavior of the request hash generation both with and without LoRA adapter information.
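
As a rough illustration of the idea (not the actual vLLM or LLMD implementation), the sketch below chains per-block hashes and mixes the LoRA adapter name into each block's hash input. The function names `hash_block` and `hash_request`, the JSON serialization, and the use of SHA-256 are all assumptions made for this example; the real hashing scheme and byte layout are defined by vLLM.

```python
import hashlib
import json
from typing import List, Optional, Sequence


def hash_block(
    parent_hash: Optional[str],
    token_ids: Sequence[int],
    lora_name: Optional[str] = None,
) -> str:
    """Hash one block from its parent hash, its tokens, and extra keys.

    Hypothetical helper: the LoRA adapter name (a string, not an integer ID)
    is added as an extra key only when an adapter is in use.
    """
    extra_keys = [lora_name] if lora_name is not None else []
    payload = json.dumps(
        [parent_hash, list(token_ids), extra_keys], separators=(",", ":")
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def hash_request(
    token_ids: Sequence[int],
    block_size: int,
    lora_name: Optional[str] = None,
) -> List[str]:
    """Return the chained hashes of every full block in a request."""
    hashes: List[str] = []
    parent: Optional[str] = None
    # Only full blocks are hashed; a trailing partial block is skipped.
    for start in range(0, len(token_ids) - block_size + 1, block_size):
        parent = hash_block(parent, token_ids[start:start + block_size], lora_name)
        hashes.append(parent)
    return hashes


tokens = list(range(10))
base_hashes = hash_request(tokens, block_size=4)
lora_hashes = hash_request(tokens, block_size=4, lora_name="adapter-a")
```

In this sketch, the same tokens produce different hashes with and without an adapter, and two requests using the same adapter name produce identical hashes, which is the behavior the new tests are described as checking.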

Future Work & Notes

Signed-off-by: Sage Ahrac <[email protected]>
@sagiahrac changed the title from "KV-block hashing consistency between vLLM and llmd" to "feat: KV-block hashing consistency between vLLM and llmd" on Nov 18, 2025