simple-engine: unify tool-enabled chat on streaming path #10
Review Summary by Qodo
Unify tool-enabled SimpleEngine chat on streaming path
Walkthroughs
Description
• Unify tool-enabled non-stream chat to use streaming path
• Prevents stalling on local model tool requests
• Consolidates tool-capable execution into single path
• Updates test infrastructure to use anyio backend
Diagram
flowchart LR
A["Tool-enabled chat request"] --> B{"Has tools?"}
B -->|Yes| C["Use streaming path"]
B -->|No| D["Use existing path"]
C --> E["Aggregate final output"]
E --> F["Return GenerationOutput"]
D --> F
File Changes
1. vllm_mlx/engine/simple.py
Code Review by Qodo
1. Pytest anyio plugin missing
@pytest.fixture
def anyio_backend(self):
    return "asyncio"
1. Pytest anyio plugin missing (Bug, Reliability)
Tests were switched to @pytest.mark.anyio and an anyio_backend fixture was added, but the repo doesn’t declare pytest-anyio (or register the anyio marker), so these tests can fail or warn in CI/dev environments that don’t already have that plugin installed.
Agent Prompt
### Issue description
`tests/test_simple_engine.py` uses `@pytest.mark.anyio` and defines `anyio_backend`, which requires the `pytest-anyio` plugin (and typically an `anyio` marker registration). The repo currently only declares `pytest-asyncio` in dev deps, and `pytest.ini` does not register the `anyio` marker.
### Issue Context
- Current dev deps: `pytest`, `pytest-asyncio`.
- Tests now use anyio marker and backend fixture.
### Fix Focus Areas
- Add `pytest-anyio` to dev optional deps and register marker:
  - pyproject.toml[65-72]
  - pytest.ini[10-14]

OR

- Revert tests to pytest-asyncio:
  - tests/test_simple_engine.py[13-16]
  - tests/test_simple_engine.py[72-246]
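For reference, the marker-plus-fixture pattern the review describes can be sketched as below. This is a minimal illustration, not the PR's test code: it assumes a pytest plugin that provides the `anyio` marker and honors an `anyio_backend` fixture is installed, and `fake_stream` and `collect` are hypothetical helpers invented for the example.

```python
import pytest


@pytest.fixture
def anyio_backend():
    # Module-level variant of the fixture shown in the diff: it pins
    # anyio-marked tests to the asyncio backend.
    return "asyncio"


async def fake_stream():
    # Hypothetical stand-in for an engine's streaming generator.
    for piece in ("Hel", "lo"):
        yield piece


async def collect(stream):
    # Drain an async iterator of text pieces into one string.
    return "".join([piece async for piece in stream])


@pytest.mark.anyio
async def test_collect_joins_chunks():
    # The marker asks the plugin to run this coroutine test on the
    # backend named by the anyio_backend fixture.
    assert await collect(fake_stream()) == "Hello"
```

Without such a plugin, pytest does not know how to run the coroutine test and reports the marker as unknown, which is the failure mode the review flags.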
* fix: unify tool-enabled simple chat on streaming path
* fix: preserve simple chat contracts on streaming path
* fix: keep tool chat on the streaming execution path
* fix: preserve streamed completion token counts
Summary
Make tool-enabled non-stream `SimpleEngine.chat()` aggregate the existing streaming chat path instead of calling the separate blocking chat path.
Why
On some local models, non-stream tool-enabled chat stalls while the streaming path completes normally. The fix is to keep one tool-capable execution path for simple-engine chat and return a normal non-stream `GenerationOutput` from that streamed result.
What changed
`stream_chat()` output
Files to review
vllm_mlx/engine/simple.py
tests/test_simple_engine.py
Validation
PYTHONPATH=/Users/ert/code/vllm-mlx /Users/ert/code/.venv/bin/python -m pytest tests/test_simple_engine.py tests/test_simple_engine_cancel_serialization.py -q
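The idea in the Summary, routing tool-enabled non-stream chat through the streaming path and folding the chunks into one result, can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `vllm_mlx/engine/simple.py`: `SketchEngine`, `StreamChunk`, `_blocking_chat`, and the fields on the simplified `GenerationOutput` are all hypothetical, and the real stream may carry cumulative rather than incremental text.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass
class StreamChunk:
    # Hypothetical chunk shape: one text delta per yielded item.
    text_delta: str
    token_count: int = 1
    finish_reason: Optional[str] = None


@dataclass
class GenerationOutput:
    # Simplified stand-in; the real class may have different fields.
    text: str
    completion_tokens: int
    finish_reason: Optional[str]


class SketchEngine:
    """Illustrative engine, not the real SimpleEngine."""

    async def stream_chat(self, messages, tools=None) -> AsyncIterator[StreamChunk]:
        # Fake stream that yields three deltas and marks the last one.
        pieces = ["call", "_tool", "()"]
        for index, piece in enumerate(pieces):
            await asyncio.sleep(0)
            is_last = index == len(pieces) - 1
            yield StreamChunk(piece, finish_reason="stop" if is_last else None)

    async def _blocking_chat(self, messages) -> GenerationOutput:
        # Placeholder for the separate non-stream path used without tools.
        return GenerationOutput("plain reply", 2, "stop")

    async def chat(self, messages, tools=None) -> GenerationOutput:
        if not tools:
            return await self._blocking_chat(messages)
        # Tool-enabled requests reuse the streaming path and aggregate it.
        parts, tokens, finish = [], 0, None
        async for chunk in self.stream_chat(messages, tools=tools):
            parts.append(chunk.text_delta)
            tokens += chunk.token_count  # keep the streamed token count
            finish = chunk.finish_reason or finish
        return GenerationOutput("".join(parts), tokens, finish)
```

Calling `asyncio.run(SketchEngine().chat(messages, tools=[...]))` with a non-empty `tools` list drains the stream and returns a single output, while a call without tools takes the untouched blocking path. Callers see the same return type either way, which is the contract the commits aim to preserve.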