Add patch for vLLM local deployment #36
Conversation
Pull request overview
This PR adds a patch file for vLLM v0.15.1 to enable local deployment with bug fixes for tool parsing, reasoning parsing, and Anthropic messages API support. The patch serves as a temporary workaround while these fixes are being merged into the upstream vLLM repository.
Changes:
- Adds a 4960-line patch file containing fixes for Anthropic messages API, tool parser, and reasoning parser
- Adds a Dockerfile for easy Docker-based deployment of the patched vLLM version
- Updates README files (English and Chinese) with deployment instructions for using the patch
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| step3.5_vllm_v0.15.1.patch | Comprehensive patch file containing Anthropic API support for thinking blocks, tool parser rewrite from XML-based to regex-based parsing, reasoning parser fixes for multi-turn conversations, and corresponding test files |
| step3.5_vllm_v0.15.1.Dockerfile | Dockerfile that applies the patch to vllm/vllm-openai:v0.15.1-x86_64 base image |
| README.md | Adds documentation for patch-based deployment option with Docker and pip installation methods |
| README.zh-CN.md | Chinese translation of the patch deployment documentation |
```diff
@@ -0,0 +1,4 @@
+FROM vllm/vllm-openai:v0.15.1-x86_64
+COPY step3.5_vllm_v0.15.1.patch /usr/bin/patches/step3.5_vllm_v0.15.1.patch
+RUN apt-get update && apt-get install -y git
```
The Dockerfile installs git and applies the patch but doesn't clean up the apt cache, which unnecessarily increases the image size. Consider adding `&& rm -rf /var/lib/apt/lists/*` after the `apt-get install` command to reduce the image size.
```diff
-RUN apt-get update && apt-get install -y git
+RUN apt-get update && apt-get install -y git && rm -rf /var/lib/apt/lists/*
```
```dockerfile
FROM vllm/vllm-openai:v0.15.1-x86_64
COPY step3.5_vllm_v0.15.1.patch /usr/bin/patches/step3.5_vllm_v0.15.1.patch
RUN apt-get update && apt-get install -y git
RUN git -C /usr/local/lib/python3.12/dist-packages apply --exclude='tests/*' --exclude='examples/*' /usr/bin/patches/step3.5_vllm_v0.15.1.patch
```
The patch is applied with the `--exclude='tests/*'` and `--exclude='examples/*'` flags. However, the patch contains test files (e.g., `tests/reasoning/test_step3p5_reasoning_parser.py` and `tests/tool_parsers/test_step3p5_tool_parser.py`, starting at patch lines 437 and 867), so those tests will be excluded from the patch application and won't be available in the deployed system. While this may be intentional for production deployments, it can make debugging and validation more difficult. Consider documenting this behavior or providing a development variant that includes the tests.
```diff
-RUN git -C /usr/local/lib/python3.12/dist-packages apply --exclude='tests/*' --exclude='examples/*' /usr/bin/patches/step3.5_vllm_v0.15.1.patch
+# By default, tests and examples are excluded when applying the patch to keep the image minimal.
+# For development/debugging, build with: `docker build --build-arg INCLUDE_TESTS=1 -f step3.5_vllm_v0.15.1.Dockerfile .`
+ARG INCLUDE_TESTS=0
+RUN if [ "$INCLUDE_TESTS" = "1" ]; then \
+        git -C /usr/local/lib/python3.12/dist-packages apply /usr/bin/patches/step3.5_vllm_v0.15.1.patch; \
+    else \
+        git -C /usr/local/lib/python3.12/dist-packages apply --exclude='tests/*' --exclude='examples/*' /usr/bin/patches/step3.5_vllm_v0.15.1.patch; \
+    fi
```
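The effect of the `--exclude` flags can be sketched with a minimal, self-contained demo (the tiny patch, file names, and paths here are illustrative, not the real vLLM patch):

```shell
# Create a scratch repo and a two-file patch: one source file, one test file.
dir=$(mktemp -d) && cd "$dir" && git init -q .
cat > demo.patch <<'EOF'
diff --git a/src/app.py b/src/app.py
new file mode 100644
--- /dev/null
+++ b/src/app.py
@@ -0,0 +1 @@
+print("app")
diff --git a/tests/test_app.py b/tests/test_app.py
new file mode 100644
--- /dev/null
+++ b/tests/test_app.py
@@ -0,0 +1 @@
+print("test")
EOF
# Apply the patch while skipping anything under tests/.
git apply --exclude='tests/*' demo.patch
[ -f src/app.py ] && echo "src applied"
[ -f tests/test_app.py ] || echo "tests skipped"
```

Only `src/app.py` is created; everything matching `tests/*` in the patch is silently dropped, which is exactly what happens to the test files in the image build above.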
```diff
+
+        # Update prev_tool_call_arr for finish_reason
+        self._update_prev_tool_call_state(tool_calls)
\ No newline at end of file
```
The patch file ends without a newline (indicated by the `\ No newline at end of file` marker at line 4960). Git typically handles this correctly, but it's generally better practice for all text files to end with a newline character. This is a very minor style issue and won't affect functionality.
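As a quick check, a missing trailing newline can be detected and fixed from the shell (shown here on a throwaway file rather than the actual patch):

```shell
# A file whose last byte is not a newline.
printf 'last line' > /tmp/no_newline.txt
# "$(tail -c 1 file)" is non-empty when the final byte is not a newline,
# because command substitution strips a trailing newline but nothing else.
if [ -n "$(tail -c 1 /tmp/no_newline.txt)" ]; then
    echo "missing trailing newline"
    printf '\n' >> /tmp/no_newline.txt   # append one
fi
[ -z "$(tail -c 1 /tmp/no_newline.txt)" ] && echo "file now ends with newline"
```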
**Note**: Bug fixes for the tool parser and reasoning parser, as well as support for the `v1/messages` endpoint, are being merged into vLLM. In the meantime, you can deploy using the `vllm/vllm-openai:v0.15.1-x86_64` image together with `step3.5_vllm_v0.15.1.patch`, in either of the following two ways:

```bash
# Via Docker
# See "step3.5_vllm_v0.15.1.Dockerfile"
```
The Chinese README has the same documentation issue: it only references the Dockerfile without providing the actual `docker build` command. Consider adding the complete command for clarity.
```diff
 # See "step3.5_vllm_v0.15.1.Dockerfile"
+docker build -t vllm/vllm-openai:v0.15.1-x86_64 -f step3.5_vllm_v0.15.1.Dockerfile .
```
```bash
# Or via pip (version v0.15.1)
pip install -U vllm==0.15.1
cd /path/to/lib/python3.12/site-packages  # replace with the parent directory of the vLLM installation
```
The Chinese README has the same issue with hardcoded Python version (3.12) in the path. Consider making the instructions more flexible to accommodate different Python versions in user environments.
```diff
-cd /path/to/lib/python3.12/site-packages  # replace with the parent directory of the vLLM installation
+cd /path/to/lib/pythonX.Y/site-packages  # replace pythonX.Y with the Python version you are using, e.g. python3.10; the path is the parent directory of the vLLM installation
```
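A version-agnostic alternative is to ask the interpreter itself for its package directory instead of hardcoding `python3.12` (a sketch, assuming `python3` is the interpreter vLLM was installed into):

```shell
# Resolve the package directory for the active interpreter, whatever its version.
# sysconfig's "purelib" path is site-packages (or dist-packages on Debian-based systems).
SITE_PACKAGES=$(python3 -c "import sysconfig; print(sysconfig.get_paths()['purelib'])")
echo "$SITE_PACKAGES"
cd "$SITE_PACKAGES"
```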
The Chinese README has the same issue: it doesn't explain where to obtain the patch file or how to ensure it's accessible before running `git apply`.
````diff
-**Note**: Bug fixes for the tool parser and reasoning parser, as well as support for the `v1/messages` endpoint, are being merged into vLLM. In the meantime, you can deploy using the `vllm/vllm-openai:v0.15.1-x86_64` image together with `step3.5_vllm_v0.15.1.patch`, in either of the following two ways:
+**Note**: Bug fixes for the tool parser and reasoning parser, as well as support for the `v1/messages` endpoint, are being merged into vLLM. In the meantime, you can deploy using the `vllm/vllm-openai:v0.15.1-x86_64` image together with `step3.5_vllm_v0.15.1.patch`. There are two ways to do this; before proceeding, obtain the patch file from this repository (e.g., the Releases page or the patch directory) and note where you saved it:
 ```bash
 # Via Docker
 # See "step3.5_vllm_v0.15.1.Dockerfile"
 # Or via pip (version v0.15.1)
 pip install -U vllm==0.15.1
 cd /path/to/lib/python3.12/site-packages  # replace with the parent directory of the vLLM installation
-git apply step3.5_vllm_v0.15.1.patch
+git apply /path/to/step3.5_vllm_v0.15.1.patch  # replace /path/to/ with the actual path to the patch file
 ```
````
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Description
Add patch for vLLM local deployment
Several fix pull requests are being merged into vLLM. In the meantime, we provide a patch based on vLLM v0.15.1 for local deployment.
The patch contains:
Support Anthropic messages API: [Feature] Supports Anthropic Thinking Block (vllm-project/vllm#33671)
Detail: Fixes bugs in the Anthropic Message Stream Converter (the Anthropic `/v1/messages` REST API endpoint) and adds support for thinking blocks.

Fix tool parser: [Bugfix] Fix step3p5 tool parser and unnecessary unstreamed tool args in serving (vllm-project/vllm#34354)
Detail: The tool parser's parsing logic was changed to wait until all tool-call-related tokens have been output before parsing the tool call, improving stability.
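The "wait for all tool-call tokens" strategy can be illustrated with a hypothetical streaming parser (the marker format, class, and method names below are illustrative; the real step3p5 parser lives inside vLLM):

```python
import json
import re

# Hypothetical tool-call delimiters; the real step3p5 token format differs.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)

class BufferingToolParser:
    """Buffer streamed chunks and parse a tool call only after its closing
    marker has arrived, rather than parsing partial JSON mid-stream."""

    def __init__(self) -> None:
        self.buffer = ""

    def feed(self, chunk: str) -> list[dict]:
        self.buffer += chunk
        complete = TOOL_CALL_RE.findall(self.buffer)
        if complete:
            # Drop consumed calls from the buffer; keep any trailing partial.
            self.buffer = TOOL_CALL_RE.sub("", self.buffer)
        return [json.loads(body) for body in complete]

parser = BufferingToolParser()
# A partial chunk yields nothing: the arguments are still streaming in.
first = parser.feed('<tool_call>{"name": "get_wea')
# Once the closing marker arrives, the full call parses cleanly.
rest = parser.feed('ther", "arguments": {"city": "Paris"}}</tool_call>')
```

Parsing only complete tool calls trades a little streaming latency for never emitting malformed or half-built arguments, which is the stability improvement the fix describes.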
Fix reasoning parser: [Bugfix] Fix step3p5 reasoning with interleaved thinking (vllm-project/vllm#34211)
Detail: Fixes an issue where, in multi-turn conversations, the prompt contains `</think>` from the previous round and the step3p5 reasoning parser failed to correctly determine the end of reasoning.
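The failure mode and its fix can be sketched in a few lines (illustrative only; the real parser operates on token IDs inside vLLM): the end-of-reasoning scan must cover only the current turn's generated text, so a `</think>` carried over in the prompt from an earlier round is never mistaken for the end of the new reasoning.

```python
END_THINK = "</think>"

def split_reasoning(generated: str) -> tuple[str, str]:
    """Split the *generated* text of the current turn into
    (reasoning, content). Scanning only the generated text, never the
    prompt, is the essence of the fix: a leftover </think> in a
    multi-turn prompt cannot terminate the current turn's reasoning."""
    idx = generated.find(END_THINK)
    if idx == -1:
        return generated, ""  # still reasoning, no final content yet
    return generated[:idx], generated[idx + len(END_THINK):]

# Multi-turn prompt that still contains the previous round's </think>:
prompt = "user: hi\nassistant: earlier thought</think>hello\nuser: and?"
# Only the new generation is inspected, so the split is correct:
reasoning, content = split_reasoning("new thought</think>final answer")
```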