[V1] Raise error if chunked prefill is explicitly disabled#21645
[V1] Raise error if chunked prefill is explicitly disabled#2164522quinn wants to merge 2 commits intovllm-project:mainfrom
Conversation
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
There was a problem hiding this comment.
Code Review
This pull request introduces a validation check to explicitly raise an error if a user attempts to disable chunked prefill for the V1 engine, which is an unsupported configuration. Previously, this setting was silently ignored and overridden. The change is straightforward, correctly implemented, and improves the user experience by providing clear feedback on unsupported configurations. The test cases provided in the pull request description adequately demonstrate the new behavior.
| devices, otherwise default to `16`. | ||
| - `enable_chunked_prefill`: Set to `False` instead of `None` for | ||
| test reproducibility. | ||
| - `enable_chunked_prefill`: Set to `None` to support V1 tests. |
There was a problem hiding this comment.
@DarkLight1337 do you know the context for this? Feels we can set False individually for tests that need them?
There was a problem hiding this comment.
I forgot, it has been so long ago
| # for non-pooling tasks. | ||
| # For pooling tasks the default is False | ||
| if model_config.runner_type != "pooling": | ||
| if self.enable_chunked_prefill is False: |
There was a problem hiding this comment.
@WoosukKwon does it make sense to disable this completely?
|
This error message is very helpful for the user experience. Any update? |
|
closing for #28833 |
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.Purpose
Currently when a user specify
--no-enable-chunked-prefillorenable_chunked_prefill=False, it will silently overwrite the setting toTrue. This PR raises an explicit error telling user it's not supported.Users have asked for this behaviour e.g. in #18547
Test Plan
vllm serve:
LLM class:
Test Result
See above
(Optional) Documentation Update