Add patch for vLLM local deployment #36
Conversation
Pull request overview
This PR adds a patch file for vLLM v0.15.1 to enable local deployment with bug fixes for tool parsing, reasoning parsing, and Anthropic messages API support. The patch serves as a temporary workaround while these fixes are being merged into the upstream vLLM repository.
Changes:
- Adds a 4960-line patch file containing fixes for Anthropic messages API, tool parser, and reasoning parser
- Adds a Dockerfile for easy Docker-based deployment of the patched vLLM version
- Updates README files (English and Chinese) with deployment instructions for using the patch
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| step3.5_vllm_v0.15.1.patch | Comprehensive patch file containing Anthropic API support for thinking blocks, tool parser rewrite from XML-based to regex-based parsing, reasoning parser fixes for multi-turn conversations, and corresponding test files |
| step3.5_vllm_v0.15.1.Dockerfile | Dockerfile that applies the patch to vllm/vllm-openai:v0.15.1-x86_64 base image |
| README.md | Adds documentation for patch-based deployment option with Docker and pip installation methods |
| README.zh-CN.md | Chinese translation of the patch deployment documentation |
```diff
@@ -0,0 +1,4 @@
+FROM vllm/vllm-openai:v0.15.1-x86_64
+COPY step3.5_vllm_v0.15.1.patch /usr/bin/patches/step3.5_vllm_v0.15.1.patch
+RUN apt-get update && apt-get install -y git
```
The Dockerfile installs git and applies the patch but doesn't clean up the apt cache, which unnecessarily increases the image size. Consider adding `&& rm -rf /var/lib/apt/lists/*` after the `apt-get install` command to reduce the image size.
```diff
-RUN apt-get update && apt-get install -y git
+RUN apt-get update && apt-get install -y git && rm -rf /var/lib/apt/lists/*
```
```dockerfile
FROM vllm/vllm-openai:v0.15.1-x86_64
COPY step3.5_vllm_v0.15.1.patch /usr/bin/patches/step3.5_vllm_v0.15.1.patch
RUN apt-get update && apt-get install -y git
RUN git -C /usr/local/lib/python3.12/dist-packages apply --exclude='tests/*' --exclude='examples/*' /usr/bin/patches/step3.5_vllm_v0.15.1.patch
```
The patch is applied with the `--exclude='tests/*'` and `--exclude='examples/*'` flags. However, the patch contains test files (e.g., `tests/reasoning/test_step3p5_reasoning_parser.py` and `tests/tool_parsers/test_step3p5_tool_parser.py`, starting at patch lines 437 and 867), so those tests will be excluded from the patch application and won't be available in the deployed system. While this may be intentional for production deployments, it can make debugging and validation more difficult. Consider documenting this behavior or providing a development variant that includes the tests.
```diff
-RUN git -C /usr/local/lib/python3.12/dist-packages apply --exclude='tests/*' --exclude='examples/*' /usr/bin/patches/step3.5_vllm_v0.15.1.patch
+# By default, tests and examples are excluded when applying the patch to keep the image minimal.
+# For development/debugging, build with: `docker build --build-arg INCLUDE_TESTS=1 -f step3.5_vllm_v0.15.1.Dockerfile .`
+ARG INCLUDE_TESTS=0
+RUN if [ "$INCLUDE_TESTS" = "1" ]; then \
+        git -C /usr/local/lib/python3.12/dist-packages apply /usr/bin/patches/step3.5_vllm_v0.15.1.patch; \
+    else \
+        git -C /usr/local/lib/python3.12/dist-packages apply --exclude='tests/*' --exclude='examples/*' /usr/bin/patches/step3.5_vllm_v0.15.1.patch; \
+    fi
```
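The effect of the `--exclude` flags can be sketched with a minimal, self-contained demo (the tiny patch, file names, and paths here are illustrative, not the real vLLM patch):

```shell
# Create a scratch repo and a two-file patch: one source file, one test file.
dir=$(mktemp -d) && cd "$dir" && git init -q .
cat > demo.patch <<'EOF'
diff --git a/src/app.py b/src/app.py
new file mode 100644
--- /dev/null
+++ b/src/app.py
@@ -0,0 +1 @@
+print("app")
diff --git a/tests/test_app.py b/tests/test_app.py
new file mode 100644
--- /dev/null
+++ b/tests/test_app.py
@@ -0,0 +1 @@
+print("test")
EOF
# Apply the patch while skipping anything under tests/.
git apply --exclude='tests/*' demo.patch
[ -f src/app.py ] && echo "src applied"
[ -f tests/test_app.py ] || echo "tests skipped"
```

Only `src/app.py` is created; everything matching `tests/*` in the patch is silently dropped, which is exactly what happens to the test files in the image build above.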
```diff
+
+        # Update prev_tool_call_arr for finish_reason
+        self._update_prev_tool_call_state(tool_calls)
\ No newline at end of file
```
The patch file ends without a newline (indicated by the `\ No newline at end of file` marker at line 4960). Git typically handles this correctly, but it's generally better practice for all text files to end with a newline character. This is a very minor style issue and won't affect functionality.
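As a quick check, a missing trailing newline can be detected and fixed from the shell (shown here on a throwaway file rather than the actual patch):

```shell
# A file whose last byte is not a newline.
printf 'last line' > /tmp/no_newline.txt
# "$(tail -c 1 file)" is non-empty when the final byte is not a newline,
# because command substitution strips a trailing newline but nothing else.
if [ -n "$(tail -c 1 /tmp/no_newline.txt)" ]; then
    echo "missing trailing newline"
    printf '\n' >> /tmp/no_newline.txt   # append one
fi
[ -z "$(tail -c 1 /tmp/no_newline.txt)" ] && echo "file now ends with newline"
```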
**Note**: Bug fixes for the tool parser and reasoning parser, as well as support for the `v1/messages` endpoint, are being merged into vLLM. In the meantime, you can deploy using the `vllm/vllm-openai:v0.15.1-x86_64` image together with `step3.5_vllm_v0.15.1.patch`, in either of the following two ways:

```bash
# Via Docker
# See "step3.5_vllm_v0.15.1.Dockerfile"
```
The Chinese README has the same documentation issue: it only references the Dockerfile without providing the actual `docker build` command. Consider adding the complete command for clarity.
```diff
 # See "step3.5_vllm_v0.15.1.Dockerfile"
+docker build -t vllm/vllm-openai:v0.15.1-x86_64 -f step3.5_vllm_v0.15.1.Dockerfile .
```
```bash
# Or via pip (version v0.15.1)
pip install -U vllm==0.15.1
cd /path/to/lib/python3.12/site-packages  # replace with the parent directory of the vLLM installation
```
The Chinese README has the same issue with hardcoded Python version (3.12) in the path. Consider making the instructions more flexible to accommodate different Python versions in user environments.
```diff
-cd /path/to/lib/python3.12/site-packages  # replace with the parent directory of the vLLM installation
+cd /path/to/lib/pythonX.Y/site-packages  # replace pythonX.Y with the Python version you are using, e.g. python3.10; the path is the parent directory of the vLLM installation
```
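A version-agnostic alternative is to ask the interpreter itself for its package directory instead of hardcoding `python3.12` (a sketch, assuming `python3` is the interpreter vLLM was installed into):

```shell
# Resolve the package directory for the active interpreter, whatever its version.
# sysconfig's "purelib" path is site-packages (or dist-packages on Debian-based systems).
SITE_PACKAGES=$(python3 -c "import sysconfig; print(sysconfig.get_paths()['purelib'])")
echo "$SITE_PACKAGES"
cd "$SITE_PACKAGES"
```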
The Chinese README has the same issue: it doesn't explain where to obtain the patch file or how to ensure it's accessible before running `git apply`.
````diff
-**Note**: Bug fixes for the tool parser and reasoning parser, as well as support for the `v1/messages` endpoint, are being merged into vLLM. In the meantime, you can deploy using the `vllm/vllm-openai:v0.15.1-x86_64` image together with `step3.5_vllm_v0.15.1.patch`, in either of the following two ways:
+**Note**: Bug fixes for the tool parser and reasoning parser, as well as support for the `v1/messages` endpoint, are being merged into vLLM. In the meantime, you can deploy using the `vllm/vllm-openai:v0.15.1-x86_64` image together with `step3.5_vllm_v0.15.1.patch`. There are two ways to do this; before proceeding, obtain the patch file from this repository (e.g., the Releases page or the patch directory) and note where you saved it:
 ```bash
 # Via Docker
 # See "step3.5_vllm_v0.15.1.Dockerfile"
 # Or via pip (version v0.15.1)
 pip install -U vllm==0.15.1
 cd /path/to/lib/python3.12/site-packages  # replace with the parent directory of the vLLM installation
-git apply step3.5_vllm_v0.15.1.patch
+git apply /path/to/step3.5_vllm_v0.15.1.patch  # replace /path/to/ with the actual path to the patch file
 ```
````
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Description
Add patch for vLLM local deployment
Several fix pull requests are being merged into vLLM. In the meantime, we provide a patch based on vLLM v0.15.1 for local deployment.
The patch contains:
Support Anthropic messages API: [Feature] Supports Anthropic Thinking Block (vllm-project/vllm#33671)
Detail: Fixes bugs in the Anthropic Message Stream Converter (the Anthropic `/v1/messages` REST API endpoint) and adds support for thinking blocks.

Fix tool parser: [Bugfix] Fix step3p5 tool parser and unnecessary unstreamed tool args in serving (vllm-project/vllm#34354)
Detail: The tool parser's parsing logic was changed to wait until all tool-call-related tokens have been output before parsing the tool call, improving stability.
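The "wait for all tool-call tokens" strategy can be illustrated with a hypothetical streaming parser (the marker format, class, and method names below are illustrative; the real step3p5 parser lives inside vLLM):

```python
import json
import re

# Hypothetical tool-call delimiters; the real step3p5 token format differs.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)

class BufferingToolParser:
    """Buffer streamed chunks and parse a tool call only after its closing
    marker has arrived, rather than parsing partial JSON mid-stream."""

    def __init__(self) -> None:
        self.buffer = ""

    def feed(self, chunk: str) -> list[dict]:
        self.buffer += chunk
        complete = TOOL_CALL_RE.findall(self.buffer)
        if complete:
            # Drop consumed calls from the buffer; keep any trailing partial.
            self.buffer = TOOL_CALL_RE.sub("", self.buffer)
        return [json.loads(body) for body in complete]

parser = BufferingToolParser()
# A partial chunk yields nothing: the arguments are still streaming in.
first = parser.feed('<tool_call>{"name": "get_wea')
# Once the closing marker arrives, the full call parses cleanly.
rest = parser.feed('ther", "arguments": {"city": "Paris"}}</tool_call>')
```

Parsing only complete tool calls trades a little streaming latency for never emitting malformed or half-built arguments, which is the stability improvement the fix describes.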
Fix reasoning parser: [Bugfix] Fix step3p5 reasoning with interleaved thinking (vllm-project/vllm#34211)
Detail: Fixes an issue where, in multi-turn conversations, the prompt contains `</think>` from the previous round and the step3p5 reasoning parser failed to correctly determine the end of reasoning.
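The failure mode and its fix can be sketched in a few lines (illustrative only; the real parser operates on token IDs inside vLLM): the end-of-reasoning scan must cover only the current turn's generated text, so a `</think>` carried over in the prompt from an earlier round is never mistaken for the end of the new reasoning.

```python
END_THINK = "</think>"

def split_reasoning(generated: str) -> tuple[str, str]:
    """Split the *generated* text of the current turn into
    (reasoning, content). Scanning only the generated text, never the
    prompt, is the essence of the fix: a leftover </think> in a
    multi-turn prompt cannot terminate the current turn's reasoning."""
    idx = generated.find(END_THINK)
    if idx == -1:
        return generated, ""  # still reasoning, no final content yet
    return generated[:idx], generated[idx + len(END_THINK):]

# Multi-turn prompt that still contains the previous round's </think>:
prompt = "user: hi\nassistant: earlier thought</think>hello\nuser: and?"
# Only the new generation is inspected, so the split is correct:
reasoning, content = split_reasoning("new thought</think>final answer")
```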