[RFC] LoRA Support #167

@sagiahrac

This RFC proposes the changes needed in the llm-d KV-Cache manager to support multiple LoRA adapters. The proposed solution has two parts: the KV-Cache manager always uses the base model information for tokenization, and it incorporates the LoRA adapter name into block hashing for compatibility with the vLLM serving framework.

https://docs.google.com/document/d/1kAqPBqZctkXISoGfo6Z2E-vRKbBV4ifDvuP-2y1yTb8/edit?usp=sharing
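
To make the hashing idea concrete, here is a minimal Python sketch of chained block hashing in which the LoRA adapter name is mixed into every block hash. It is an illustration only, not the KV-Cache manager's or vLLM's actual scheme: the names `BLOCK_SIZE`, `hash_block`, and `block_hashes`, the choice of SHA-256, and the byte layout are all assumptions made for this example. The token IDs are assumed to come from the base model's tokenizer regardless of which adapter the request targets.

```python
import hashlib
import struct
from typing import List, Optional, Sequence

# Hypothetical block size; real deployments configure this elsewhere.
BLOCK_SIZE = 4


def hash_block(parent: bytes, tokens: Sequence[int], lora_name: Optional[str]) -> bytes:
    """Hash one block from its parent hash, its token IDs, and the adapter name."""
    h = hashlib.sha256()
    h.update(parent)
    # Token IDs are assumed to be produced by the base model's tokenizer.
    h.update(struct.pack(f"<{len(tokens)}I", *tokens))
    if lora_name is not None:
        # Mixing in the adapter name keeps blocks from different adapters apart.
        h.update(b"\x00lora:" + lora_name.encode("utf-8"))
    return h.digest()


def block_hashes(
    tokens: Sequence[int],
    lora_name: Optional[str] = None,
    block_size: int = BLOCK_SIZE,
) -> List[str]:
    """Return hex hashes for every full block; a trailing partial block is skipped."""
    hashes: List[str] = []
    parent = b""
    for start in range(0, len(tokens) - block_size + 1, block_size):
        parent = hash_block(parent, tokens[start:start + block_size], lora_name)
        hashes.append(parent.hex())
    return hashes


if __name__ == "__main__":
    prompt_tokens = list(range(10))  # stand-in for base-model token IDs
    print(block_hashes(prompt_tokens))
    print(block_hashes(prompt_tokens, lora_name="adapter-a"))
```

In this sketch the same prompt yields different block hashes with and without an adapter, and with different adapters, while two requests that share both a prefix and an adapter still share the hashes of the common blocks.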
