fix: allow HuggingFace standard chat template params via **kwargs #27622
Isotr0py merged 4 commits into vllm-project:main
Conversation
Code Review
This pull request provides a solid fix for the compatibility issue with tokenizers that use **kwargs for standard chat template parameters. The approach of dynamically inspecting the base PreTrainedTokenizer.apply_chat_template method is clean and maintainable. I've found one potential issue with how the parameters are extracted, which could lead to unexpected behavior. My review includes a suggestion to address this.
Force-pushed 5f46dd0 to e00dee4
Some tokenizer implementations (e.g., Kimi K2) use **kwargs to receive standard parameters like add_generation_prompt instead of declaring them explicitly. This fix extracts the standard parameter list from PreTrainedTokenizer.apply_chat_template base class signature to allow these parameters while maintaining security. The implementation also correctly excludes VAR_KEYWORD and VAR_POSITIONAL parameter types to prevent 'kwargs' or 'args' from being treated as valid parameter names. Signed-off-by: wangln19 <wanglinian@dev.wanglinian.msh-dev.svc.cluster.local>
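The extraction described in the commit message can be sketched as follows. This is a minimal standalone illustration of the `inspect`-based technique, using a stand-in function in place of `PreTrainedTokenizer.apply_chat_template`; the function and parameter names here are illustrative, not vLLM's actual helper:

```python
import inspect

def apply_chat_template(self, conversation, tools=None,
                        add_generation_prompt=False, *args, **kwargs):
    """Stand-in for the HF base-class method signature."""

def get_base_chat_template_params():
    # Collect explicitly declared parameter names, skipping `self` and the
    # catch-all *args/**kwargs entries (VAR_POSITIONAL/VAR_KEYWORD), so that
    # "args" and "kwargs" are never treated as valid parameter names.
    sig = inspect.signature(apply_chat_template)
    return {
        name
        for name, param in sig.parameters.items()
        if name != "self"
        and param.kind not in (inspect.Parameter.VAR_POSITIONAL,
                               inspect.Parameter.VAR_KEYWORD)
    }

print(sorted(get_base_chat_template_params()))
# -> ['add_generation_prompt', 'conversation', 'tools']
```

Note that the real implementation inspects the base class rather than the subclass precisely because subclasses like Kimi K2's tokenizer declare nothing beyond `**kwargs`.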
Force-pushed 3ec44f6 to 4636728
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Signed-off-by: wangln19 <96399074+wangln19@users.noreply.github.com>
DarkLight1337 left a comment
Thanks, LGTM.
cc @Isotr0py
```python
# Allow standard HF parameters even if tokenizer uses **kwargs to receive them
hf_base_params = _get_hf_base_chat_template_params()

accept_vars = (fn_kw | template_vars | hf_base_params) - unexpected_vars
```
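The set union in the last line of the diff can be illustrated with sample values. The variable names come from the diff above; the contents assigned to them here are made up purely for demonstration:

```python
fn_kw = {"conversation", "tokenize"}                  # kwargs the tokenizer declares (sample)
template_vars = {"enable_thinking"}                   # variables used in the Jinja template (sample)
hf_base_params = {"add_generation_prompt", "tools"}   # from the HF base signature (sample)
unexpected_vars = {"messages"}                        # names supplied internally (sample)

accept_vars = (fn_kw | template_vars | hf_base_params) - unexpected_vars

# Only whitelisted request kwargs survive the filter.
request_kwargs = {"add_generation_prompt": True, "evil_param": 1}
filtered = {k: v for k, v in request_kwargs.items() if k in accept_vars}
print(filtered)  # {'add_generation_prompt': True}
```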
Can you also update the test in tests/entrypoints/test_chat_utils.py?
tests/entrypoints/test_chat_utils.py, lines 1783 to 1806 (at 5b3c35a)
Updated the test as suggested.
I considered adding Kimi K2 to the test model registry, but decided against it for the following reasons:

- Infrastructure overhead: Adding a new model to tests/models/registry.py requires extensive configuration (tokenizer path, trust_remote_code settings, HF overrides, etc.), which would be significant effort for testing a single parameter filtering behavior.
- Generic fix: This change is not Kimi K2-specific. It enables support for any tokenizer that uses **kwargs to receive HuggingFace standard parameters. The mock tokenizer approach tests the core logic more directly.
- Sufficient coverage: The updated test now includes:
  - Existing integration tests (Qwen2-VL, Qwen3) verify backward compatibility
  - New mock tokenizer test validates the **kwargs scenario (like Kimi K2)
  - Manual integration testing with the actual Kimi K2 model confirms end-to-end functionality (as documented in the PR description)

The mock approach isolates the parameter filtering logic we're actually fixing, without coupling the test suite to a specific model that may have availability/licensing constraints.
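A mock-tokenizer test in this spirit might look like the following. This is a hypothetical, self-contained sketch; the actual test lives in tests/entrypoints/test_chat_utils.py and uses vLLM's own resolver, while `resolve_kwargs` and `HF_BASE_PARAMS` below are stand-ins:

```python
import inspect

class KwargsOnlyTokenizer:
    """Receives standard params via **kwargs, like Kimi K2's tokenizer."""
    def apply_chat_template(self, conversation, **kwargs):
        # Echo back what was received so the test can inspect it.
        return kwargs

# Assumed behaviour of the fixed resolver: names from the HF base class
# stay acceptable even when the subclass only exposes **kwargs.
HF_BASE_PARAMS = {"add_generation_prompt", "tools"}  # illustrative subset

def resolve_kwargs(tokenizer, requested):
    declared = {
        n for n, p in
        inspect.signature(tokenizer.apply_chat_template).parameters.items()
        if p.kind not in (inspect.Parameter.VAR_POSITIONAL,
                          inspect.Parameter.VAR_KEYWORD)
    }
    accept = declared | HF_BASE_PARAMS
    return {k: v for k, v in requested.items() if k in accept}

tok = KwargsOnlyTokenizer()
resolved = resolve_kwargs(tok, {"add_generation_prompt": True, "bad_param": 1})
print(resolved)  # {'add_generation_prompt': True}
```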
…lm-project#27622) Signed-off-by: wangln19 <wanglinian@dev.wanglinian.msh-dev.svc.cluster.local> Signed-off-by: wangln19 <96399074+wangln19@users.noreply.github.com> Co-authored-by: wangln19 <wanglinian@dev.wanglinian.msh-dev.svc.cluster.local> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Purpose
Fix compatibility issue with tokenizers that use **kwargs to receive standard chat template parameters.

Problem:
- Some tokenizers (e.g., Kimi K2) receive standard parameters like add_generation_prompt, tools, etc. via **kwargs instead of declaring them in their apply_chat_template method signature
- resolve_chat_template_kwargs uses allow_var_kwargs=False, which rejects these parameters
- As a result, tool calls fail (finish_reason: stop instead of finish_reason: tool_calls)

Root Cause:
The security fix in PR #25794 prevents passing parameters not explicitly declared in the function signature to avoid injection attacks. While this is correct for unknown parameters, it inadvertently blocks legitimate HuggingFace standard parameters when tokenizers use **kwargs.

Solution:
Dynamically extract the standard parameter list from the PreTrainedTokenizer.apply_chat_template base class signature and whitelist these parameters even when the tokenizer implementation uses **kwargs to receive them.

Benefits:
Test Plan
Test Result
Before the fix:
- finish_reason: "stop" (wrong: model generates text instead of a tool call)
- add_generation_prompt was silently dropped
- Kwargs reaching the tokenizer: {'tools': [...]} (missing add_generation_prompt)

After the fix:
- finish_reason: "tool_calls" ✅
- add_generation_prompt: true correctly passed to tokenizer
- Kwargs reaching the tokenizer: {'tools': [...], 'add_generation_prompt': True}

Security verification:
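Both the before/after behaviour and the continued rejection of unknown parameters can be reproduced in miniature. All names and sample sets below are illustrative stand-ins, not vLLM's actual internals:

```python
import inspect

class MockTokenizer:
    # Standard params arrive via **kwargs instead of being declared,
    # mimicking tokenizers such as Kimi K2's.
    def apply_chat_template(self, conversation, **kwargs):
        return kwargs

def declared_params(fn):
    # Explicitly declared names, excluding *args/**kwargs catch-alls.
    return {
        name for name, p in inspect.signature(fn).parameters.items()
        if name != "self"
        and p.kind not in (inspect.Parameter.VAR_POSITIONAL,
                           inspect.Parameter.VAR_KEYWORD)
    }

template_vars = {"tools"}                             # sample template variables
hf_base_params = {"add_generation_prompt", "tools"}   # illustrative subset

request = {"tools": [{"type": "function"}],
           "add_generation_prompt": True,
           "injected_param": "evil"}

declared = declared_params(MockTokenizer.apply_chat_template)

# Before the fix: only declared names and template variables pass.
before = {k: v for k, v in request.items() if k in declared | template_vars}

# After the fix: HF base-class standard params are whitelisted too.
after = {k: v for k, v in request.items()
         if k in declared | template_vars | hf_base_params}

print("add_generation_prompt" in before)  # False: silently dropped
print("add_generation_prompt" in after)   # True
print("injected_param" in after)          # False: still rejected
```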
Code Changes
Modified file: vllm/entrypoints/chat_utils.py
- Added _get_hf_base_chat_template_params() function to dynamically extract standard parameters from the HuggingFace base class
- Updated resolve_chat_template_kwargs() to include hf_base_params in the accept list
- Moved import inspect to module level for clarity

Lines changed: ~15 lines added
Checklist