[CORE] Adding support for insertion of soft-tuned prompts#4645
njhill merged 91 commits into vllm-project:main
Yard1
left a comment
Some high level comments - if this is not time sensitive, I think it would be good if we could come up with some more generic APIs to promote code reuse between LoRA and prompt adapter
|
Hey @Yard1,
After your review, happy to add more extensive tests + docs. |
|
@SwapnilDreams100 thanks! let me take a look |
|
Hi @Yard1 just a friendly reminder to review this PR when you get a chance, thanks! |
|
Hi @SwapnilDreams100, I have an initial implementation of adapter support for the OpenAI entry points based on https://github.com/SwapnilDreams100/vllm/tree/main. Would you be open to me contributing to your PR? |
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
Yard1
left a comment
Approving pending CI passing
|
@SwapnilDreams100 first of all congrats! I'd like to add tests for the openai server, would it be best to wait for this to merge so I can open my own PR against main? |
|
Sounds good @g-eoj, big thank you for your help! |
|
Hey @Yard1 are we good to merge? |
|
Thanks for this epic effort @SwapnilDreams100!! And big thanks to @Yard1 for the many detailed reviews. I'll merge it before any new conflicts pop up! |
|
Big thank you to @Yard1 for your guidance on this, this was a great learning experience! |
…ct#4645) Co-authored-by: Swapnil Parekh <swapnilp@ibm.com> Co-authored-by: Joe G <joseph.granados@h2o.ai> Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
…ct#4645) Co-authored-by: Swapnil Parekh <swapnilp@ibm.com> Co-authored-by: Joe G <joseph.granados@h2o.ai> Co-authored-by: Antoni Baum <antoni.baum@protonmail.com> Signed-off-by: LeiWang1999 <leiwang1999@outlook.com>
This PR adds support for inserting soft-tuned prompts (trained using PEFT) into the input embeddings.
This functionality is required by the IBM team.
Summary of Changes:
- prompt_adapter folder, similar to the lora folder, to create an LRU cache management system for multiple prompt adapters
- prompt_adapter_config engine parameter, for easy extension to more sophisticated prompt-tuning techniques in the future
- prompt_adapter_request parameter added to the generate functionality of the LLM_Engine
- bloom, llama, mistral, easily extensible for others
- bloom
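To make the two core ideas in the summary concrete, here is a minimal sketch of an LRU cache for prompt adapters and of inserting a soft prompt into input embeddings. It is an illustration only and not the code in this PR: `PromptAdapterLRUCache`, `insert_soft_prompt`, and `fake_loader` are hypothetical names, embeddings are plain Python lists rather than tensors, and the soft prompt is assumed to be prepended to the token embeddings, as PEFT-style prompt tuning does.

```python
from collections import OrderedDict
from typing import Callable, List

Embedding = List[float]


class PromptAdapterLRUCache:
    """Hypothetical LRU cache of soft prompts keyed by adapter id.

    Not vLLM's actual class; it only illustrates the eviction policy.
    """

    def __init__(self, capacity: int,
                 loader: Callable[[int], List[Embedding]]) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.loader = loader  # assumed to load one adapter's virtual tokens
        self._cache: "OrderedDict[int, List[Embedding]]" = OrderedDict()

    def get(self, adapter_id: int) -> List[Embedding]:
        if adapter_id in self._cache:
            # Cache hit: mark this adapter as most recently used.
            self._cache.move_to_end(adapter_id)
            return self._cache[adapter_id]
        # Cache miss: load the adapter, then evict the least recently
        # used entry if the cache is over capacity.
        vectors = self.loader(adapter_id)
        self._cache[adapter_id] = vectors
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)
        return vectors

    def cached_ids(self) -> List[int]:
        # Ordered from least to most recently used.
        return list(self._cache)


def insert_soft_prompt(token_embeddings: List[Embedding],
                       soft_prompt: List[Embedding]) -> List[Embedding]:
    """Prepend the soft prompt's virtual-token embeddings (an assumption
    about placement) and return a new list; inputs are not modified."""
    return [list(v) for v in soft_prompt] + [list(v) for v in token_embeddings]


def fake_loader(adapter_id: int) -> List[Embedding]:
    # Stand-in for reading trained weights: two virtual tokens of size 3.
    return [[float(adapter_id)] * 3 for _ in range(2)]


cache = PromptAdapterLRUCache(capacity=2, loader=fake_loader)
for adapter_id in (1, 2, 1, 3):  # loading adapter 3 evicts adapter 2
    cache.get(adapter_id)

token_embeddings = [[0.1, 0.2, 0.3]]
combined = insert_soft_prompt(token_embeddings, cache.get(3))
print(cache.cached_ids())  # [1, 3]
print(len(combined))       # 3
```

With a capacity of 2, requesting adapters 1, 2, 1, 3 keeps adapters 1 and 3 cached, because adapter 2 is the least recently used when adapter 3 is loaded. A real implementation would operate on tensors and per-request adapter metadata rather than nested lists.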