
simple-engine: unify tool-enabled chat on streaming path #10

Merged
krystophny merged 4 commits into main from fix/simple-engine-tool-chat-unify-streaming
Mar 24, 2026

Conversation

Collaborator

@krystophny krystophny commented Mar 24, 2026

Summary

Make tool-enabled non-stream SimpleEngine.chat() aggregate the existing streaming chat path instead of calling the separate blocking chat path.

Why

On some local models, non-stream tool-enabled chat stalls while the streaming path completes normally. The fix is to keep one tool-capable execution path for simple-engine chat and return a normal non-stream GenerationOutput from that streamed result.

What changed

  • tool-enabled non-stream chat now aggregates stream_chat() output
  • preserve the normal non-stream contract by cleaning the final text, reconstructing tokens, and keeping the streamed completion token count
  • add regression coverage for the non-stream contract on the tool-chat path

Files to review

  • vllm_mlx/engine/simple.py
  • tests/test_simple_engine.py

Validation

  • PYTHONPATH=/Users/ert/code/vllm-mlx /Users/ert/code/.venv/bin/python -m pytest tests/test_simple_engine.py tests/test_simple_engine_cancel_serialization.py -q

@qodo-code-review

Review Summary by Qodo

Unify tool-enabled SimpleEngine chat on streaming path

🐞 Bug fix ✨ Enhancement


Walkthroughs

Description
• Unify tool-enabled non-stream chat to use streaming path
• Prevents stalling on local model tool requests
• Consolidates tool-capable execution into single path
• Updates test infrastructure to use anyio backend
Diagram
flowchart LR
  A["Tool-enabled chat request"] --> B{"Has tools?"}
  B -->|Yes| C["Use streaming path"]
  B -->|No| D["Use existing path"]
  C --> E["Aggregate final output"]
  E --> F["Return GenerationOutput"]
  D --> F


File Changes

1. vllm_mlx/engine/simple.py 🐞 Bug fix +25/-0

Route tool-enabled chat through streaming path

• Added tool-aware logic to chat() method that detects tool requests
• Routes tool-enabled non-stream chat through stream_chat() path
• Aggregates streaming outputs into final GenerationOutput object
• Prevents indefinite stalling on local models with tool requests

vllm_mlx/engine/simple.py


2. tests/test_simple_engine.py 🧪 Tests +58/-5

Add tool chat streaming path test and anyio support

• Added anyio_backend fixture to configure asyncio backend
• Replaced @pytest.mark.asyncio with @pytest.mark.anyio decorators
• Added new test test_chat_with_tools_aggregates_streaming_path()
• New test verifies tool-enabled chat uses streaming path and returns correct output

tests/test_simple_engine.py



@qodo-code-review

qodo-code-review Bot commented Mar 24, 2026

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (0) 📎 Requirement gaps (0) 📐 Spec deviations (0)



Action required

1. Pytest anyio plugin missing 🐞 Bug ⛯ Reliability
Description
Tests were switched to @pytest.mark.anyio and an anyio_backend fixture was added, but the repo
doesn’t declare pytest-anyio (or register the anyio marker), so these tests can fail or warn in
CI/dev environments that don’t already have that plugin installed.
Code

tests/test_simple_engine.py[R13-16]

+    @pytest.fixture
+    def anyio_backend(self):
+        return "asyncio"
+
Evidence
The tests now depend on pytest-anyio’s marker/fixture conventions, but the project’s dev
dependencies and pytest config only reference pytest-asyncio and do not define an anyio marker.
This makes test execution environment-dependent and can break under strict marker handling.

tests/test_simple_engine.py[13-16]
tests/test_simple_engine.py[72-100]
pyproject.toml[65-72]
pytest.ini[1-20]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`tests/test_simple_engine.py` uses `@pytest.mark.anyio` and defines `anyio_backend`, which requires the `pytest-anyio` plugin (and typically an `anyio` marker registration). The repo currently only declares `pytest-asyncio` in dev deps, and `pytest.ini` does not register the `anyio` marker.
### Issue Context
- Current dev deps: `pytest`, `pytest-asyncio`.
- Tests now use anyio marker and backend fixture.
### Fix Focus Areas
- Add `pytest-anyio` to dev optional deps and register marker:
  - pyproject.toml[65-72]
  - pytest.ini[10-14]

OR
- Revert tests to pytest-asyncio:
  - tests/test_simple_engine.py[13-16]
  - tests/test_simple_engine.py[72-246]
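If the first option is taken, the marker registration might look like the fragment below. The exact `pytest.ini` contents are an assumption, not taken from this repo; note also that in current releases the anyio pytest plugin ships with the `anyio` package itself rather than a separate `pytest-anyio` distribution, so the dev dependency to add may simply be `anyio`:

```ini
# pytest.ini — register the marker so strict marker handling accepts it
[pytest]
markers =
    anyio: run the test via the anyio pytest plugin
```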




Remediation recommended

2. Tool chat skips cleanup 🐞 Bug ✓ Correctness
Description
In SimpleEngine.chat(), the new tool-enabled aggregation path returns final_output.text from
stream_chat() without applying clean_output_text(), unlike the existing blocking chat paths;
this can leak special tokens/channel markers and can also propagate finish_reason=None into a
non-stream response.
Code

vllm_mlx/engine/simple.py[R456-479]

+        # mlx-lm non-streaming chat with tools can stall indefinitely on some
+        # local models, while the streaming path completes normally. Reuse the
+        # streaming implementation and aggregate its final state so both chat
+        # APIs share the same tool-capable execution path.
+        if tools and not self._is_mllm:
+            final_output = GenerationOutput(text="")
+            async for output in self.stream_chat(
+                messages=messages,
+                max_tokens=max_tokens,
+                temperature=temperature,
+                top_p=top_p,
+                tools=tools,
+                images=images,
+                videos=videos,
+                **kwargs,
+            ):
+                final_output = output
+            return GenerationOutput(
+                text=final_output.text,
+                tokens=final_output.tokens,
+                prompt_tokens=final_output.prompt_tokens,
+                completion_tokens=final_output.completion_tokens,
+                finish_reason=final_output.finish_reason,
+            )
Evidence
The tool-enabled branch returns GenerationOutput(text=final_output.text, ...) directly, while the
existing LLM/MLLM blocking branches explicitly call clean_output_text(output.text) before
returning. clean_output_text() is responsible for stripping special tokens/channel markers, so
skipping it makes the tool-enabled non-stream path inconsistent with the rest of chat().

vllm_mlx/engine/simple.py[456-516]
vllm_mlx/api/utils.py[69-101]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Tool-enabled non-stream `SimpleEngine.chat()` aggregates `stream_chat()` but returns the final streamed text without applying `clean_output_text()`. This is inconsistent with the other `chat()` branches and can leak special tokens/channel markers in the returned non-stream response. Additionally, streamed paths can yield `finish_reason=None` on completion, which then propagates into a non-stream response.
### Issue Context
- Blocking `chat()` branches already do `text = clean_output_text(output.text)`.
- The new aggregation branch returns `final_output.text` directly.
### Fix Focus Areas
- Apply `clean_output_text()` before returning from the tool aggregation branch and consider normalizing `finish_reason` when `final_output.finished` but `finish_reason` is `None`:
  - vllm_mlx/engine/simple.py[456-479]
- (Optional) Adjust/extend the new test if needed to tolerate cleaned special tokens while still preserving `<tool_call>` markup:
  - tests/test_simple_engine.py[122-170]




Comment on lines +13 to +16
@pytest.fixture
def anyio_backend(self):
    return "asyncio"


Action required

1. Pytest anyio plugin missing 🐞 Bug ⛯ Reliability

Tests were switched to @pytest.mark.anyio and an anyio_backend fixture was added, but the repo
doesn’t declare pytest-anyio (or register the anyio marker), so these tests can fail or warn in
CI/dev environments that don’t already have that plugin installed.

@krystophny krystophny merged commit 98cff0f into main Mar 24, 2026
krystophny added a commit that referenced this pull request Mar 24, 2026
* fix: unify tool-enabled simple chat on streaming path

* fix: preserve simple chat contracts on streaming path

* fix: keep tool chat on the streaming execution path

* fix: preserve streamed completion token counts
krystophny added a commit that referenced this pull request Mar 26, 2026
krystophny added a commit that referenced this pull request Mar 26, 2026
krystophny added a commit that referenced this pull request Apr 9, 2026
krystophny added a commit that referenced this pull request Apr 14, 2026