Conversation
Code Review
This pull request adds a regression test to ensure that the /tokenize endpoint correctly expands image placeholders for VLM models. The test is well-structured and targets the specific regression. My feedback includes a suggestion to improve the test's robustness to make it less brittle against future changes in the model or tokenizer.
```python
# stop_sign.jpg (1300x876) produces 1451 tokens after expansion.
# Without expansion the count would be ~26 (text + one placeholder).
assert response.json()["count"] == 1451
```
Hardcoding the exact token count `1451` makes this test brittle: minor future changes to the model's tokenizer or image processor could cause it to fail even when the overall functionality is correct. To make the test more robust, consider asserting a range or a minimum value that clearly distinguishes the buggy behavior (~26 tokens) from the correct behavior. For example, `assert response.json()["count"] > 1000` would still effectively catch the regression while being more resilient to small changes.
```diff
-assert response.json()["count"] == 1451
+assert response.json()["count"] > 1000, "Token count is too low, image placeholders were likely not expanded."
```
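One way to harden the assertion in both directions is to bound the count from below and above. The helper and thresholds below are an illustrative sketch, not code from this PR:

```python
def check_expanded_token_count(count: int,
                               min_expected: int = 1000,
                               max_expected: int = 5000) -> None:
    """Assert the token count reflects image-placeholder expansion.

    Without expansion the count would be ~26 (text plus a single
    placeholder token); with expansion it should be well above 1000
    but still within a sane upper bound.
    """
    assert count > min_expected, (
        f"Token count {count} is too low; image placeholders were "
        "likely not expanded."
    )
    assert count < max_expected, f"Token count {count} is unexpectedly high."

# Passes for the correct behavior (1451 tokens); a count of ~26
# from the buggy behavior would raise an AssertionError.
check_expanded_token_count(1451)
```

The upper bound is optional, but it also catches the opposite failure mode, e.g. placeholders being expanded twice.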
Signed-off-by: hallerite <git@hallerite.com>
👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default; only a limited subset of tests runs automatically. You can ask your reviewers to trigger select CI tests on top of that. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀
Purpose
Add a regression test verifying that `/tokenize` correctly expands image placeholder tokens for VLM models.

Before PR #34560, `/tokenize` did not run the multimodal processor, so it returned ~26 tokens for a message with an image instead of the expected 1451. Confirmed broken on both 0.15.1 and 0.16.0.

Test Plan
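In outline, the regression test exercises the endpoint roughly as follows. The base URL, model name, and payload shape here are assumptions for illustration; the actual test uses vLLM's own test utilities:

```python
import requests


def tokenize_count(base_url: str, model: str, image_url: str) -> int:
    """POST a chat message containing an image to /tokenize and
    return the token count reported by the server."""
    payload = {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    }
    resp = requests.post(f"{base_url}/tokenize", json=payload, timeout=60)
    resp.raise_for_status()
    # The response reports the (expanded) token count in "count".
    return resp.json()["count"]
```

The test then asserts that the returned count is well above the ~26 tokens the buggy code path produced.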
```
pytest tests/entrypoints/openai/test_tokenization_vlm.py -v
```

Test Result
On HEAD — PASS:

```
test_tokenize_chat_expands_image_placeholders PASSED
======================= 1 passed in 30.40s ========================
```
On vllm 0.15.1 / 0.16.0 — FAIL (expected):

```
/tokenize returned 26 tokens
FAIL — token count 26 is too low, image placeholders were NOT expanded
```
Essential Elements of an Effective PR Description Checklist
- `supported_models.md` and `examples` for a new model.