[Frontend] Refactor prompt processing #4028

njhill merged 139 commits into vllm-project:main from DarkLight1337:openai-typing
Conversation
…ions API and legacy Completions API
vllm.entrypoints.openai
Seems that #4032 fixed the LoRA bugs, however …
Update: I found that it was due to a bug in my refactored parsing, my bad. I have fixed it just now.
I'm updating …
I've moved the logging out to a separate class.
I have finished addressing your comments. |
Co-authored-by: Roger Wang <ywang@roblox.com>
This PR refactors various parts of the OpenAI-compatible server:
- The `_validate_prompt_and_tokenize` method has been decomposed so that `prompt` and `prompt_ids` are processed separately.
- The logging of `prompt` and `prompt_ids` has been moved from `vllm.AsyncLLMEngine` to `vllm.entrypoints.logger.RequestLogger`, so that redundant data is no longer passed into the core engine. This also enables logging for the tokenization endpoints.
- Each `request_id` is now prefixed based on the endpoint type (a sketch of the overall shape follows this list):
  - Completions API: `cmpl-*` (as before)
  - Chat Completions API: `chat-*`
  - Embeddings API: `embd-*`
  - Tokenization endpoints: `tokn-*`
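A minimal sketch of the shape this implies, assuming hypothetical names (`ParsedPrompt`, `parse_prompt`, `make_request_id`); only `RequestLogger` and the `cmpl-`/`chat-`/`embd-`/`tokn-` prefixes come from the description above, and this is not the exact vLLM implementation:

```python
import logging
from dataclasses import dataclass
from typing import List, Optional, Union
from uuid import uuid4

logger = logging.getLogger("vllm.entrypoints.logger")


@dataclass
class ParsedPrompt:
    # A text prompt is tokenized; a pre-tokenized prompt has no text form.
    prompt: Optional[str]
    prompt_token_ids: List[int]


def parse_prompt(prompt: Union[str, List[int]], tokenizer) -> ParsedPrompt:
    """Handle `prompt` and `prompt_ids` on separate paths instead of
    one monolithic validate-and-tokenize method."""
    if isinstance(prompt, str):
        return ParsedPrompt(prompt, tokenizer.encode(prompt))
    return ParsedPrompt(None, prompt)


class RequestLogger:
    """Logs request inputs at the API layer, so prompt text no longer
    has to be passed into the core engine just for logging."""

    def log_inputs(self, request_id: str, parsed: ParsedPrompt) -> None:
        logger.info("Received request %s: prompt=%r, prompt_token_ids=%s",
                    request_id, parsed.prompt, parsed.prompt_token_ids)


def make_request_id(endpoint: str) -> str:
    # Endpoint-specific prefixes from the list above.
    prefix = {
        "completions": "cmpl",
        "chat": "chat",
        "embeddings": "embd",
        "tokenization": "tokn",
    }[endpoint]
    return f"{prefix}-{uuid4().hex}"
```

A handler would then do something like `request_id = make_request_id("chat")` followed by `request_logger.log_inputs(request_id, parse_prompt(raw_prompt, tokenizer))`, keeping all prompt logging outside `vllm.AsyncLLMEngine`.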