[BugFix] Changed the minimax wrapper to accept **extra_kwargs #8866
gcanlin wants to merge 1 commit into vllm-project:main
Conversation
Summary of Changes (Gemini Code Assist): This pull request addresses a compatibility issue where the Minimax wrapper was failing due to an overly restrictive function signature. By accepting and forwarding arbitrary keyword arguments, the wrapper now correctly handles updates from upstream vLLM, ensuring seamless integration with new parameters such as `chat_template_kwargs`.
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to the Contributing and Testing guides.
Code Review
This pull request updates `_wrapped_chat_completion_stream_generator` to accept and forward `**extra_kwargs`, ensuring compatibility with recent upstream vLLM changes. The review feedback points out that the PR title and summary need to be updated to follow the repository's style guide. It also recommends applying the same modification to the non-streaming `_wrapped_chat_completion_full_generator` to prevent a `TypeError` when new parameters are passed through that path.
```python
    tokenizer,
    request_metadata: engine_protocol.RequestResponseMetadata,
    reasoning_parser=None,
    **extra_kwargs: Any,
```
The pull request title and summary do not adhere to the repository style guide. The title is missing the module prefix, and the summary sections for user-facing changes and testing are incomplete.
Suggested PR Title:

> [Ops][BugFix] Changed the minimax wrapper to accept `**extra_kwargs`

Suggested PR Summary:
### What this PR does / why we need it?
This PR updates the `_wrapped_chat_completion_stream_generator` to accept and forward `**extra_kwargs`. This ensures compatibility with recent changes in upstream vLLM where `chat_template_kwargs` was added to the signature, preventing `TypeError` when these arguments are passed.
Fixes #
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
(Please provide details on how this change was verified, e.g., manual testing with affected models or CI results.)

References
- PR Title and Summary must follow the specified format and include all required sections with content. (link)
```python
    tokenizer,
    request_metadata: engine_protocol.RequestResponseMetadata,
    reasoning_parser=None,
    **extra_kwargs: Any,
```
The fix applied to `_wrapped_chat_completion_stream_generator` should also be applied to `_wrapped_chat_completion_full_generator` (starting at line 327). Currently, the non-streaming wrapper is still missing the `**extra_kwargs` parameter and does not forward it to the original generator, which will cause a `TypeError` when upstream vLLM passes new parameters like `chat_template_kwargs` through the non-streaming path.
What this PR does / why we need it?
The `_wrapped_chat_completion_stream_generator` in [patch_minimax_usage_accounting.py:293-320](https://github.com/vllm-project/vllm-ascend/tree/main/vllm_ascend/patch/platform/patch_minimax_usage_accounting.py#L293-L320) had an explicit signature that didn't include `chat_template_kwargs`. Upstream vLLM added this parameter to `chat_completion_stream_generator`.
The call chain was: upstream vLLM invokes the patched `chat_completion_stream_generator` with the new `chat_template_kwargs` keyword argument, the wrapper's explicit signature does not accept it, and the call fails with a `TypeError`.
Fix: Changed the minimax wrapper to accept `**extra_kwargs` and forward them to the original generator, making it forward-compatible with any future new parameters.
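The pattern can be sketched in isolation. This is a minimal, self-contained illustration, not the actual vLLM code: the function signatures and the simple dict chunks are stand-ins, since the real generators live in vLLM's OpenAI serving layer and yield SSE-formatted response chunks.

```python
import asyncio
from typing import Any

# Hypothetical stand-in for the upstream vLLM generator, which gained a new
# keyword argument (chat_template_kwargs) in a recent release.
async def chat_completion_stream_generator(request: str, tokenizer: str,
                                           reasoning_parser=None,
                                           chat_template_kwargs=None):
    yield {"request": request, "chat_template_kwargs": chat_template_kwargs}

# Forward-compatible wrapper: it names only the parameters it actually uses
# and forwards **extra_kwargs verbatim, so any parameter upstream adds later
# passes through without a TypeError.
async def _wrapped_chat_completion_stream_generator(request: str,
                                                    tokenizer: str,
                                                    reasoning_parser=None,
                                                    **extra_kwargs: Any):
    async for chunk in chat_completion_stream_generator(
            request, tokenizer,
            reasoning_parser=reasoning_parser,
            **extra_kwargs):
        yield chunk

async def main():
    chunks = []
    # chat_template_kwargs is not in the wrapper's explicit signature, but
    # **extra_kwargs carries it through to the original generator.
    async for c in _wrapped_chat_completion_stream_generator(
            "req", "tok", chat_template_kwargs={"enable_thinking": True}):
        chunks.append(c)
    return chunks

chunks = asyncio.run(main())
print(chunks[0]["chat_template_kwargs"])  # {'enable_thinking': True}
```

Without `**extra_kwargs`, the same call would raise `TypeError: ... got an unexpected keyword argument 'chat_template_kwargs'`, which is exactly the failure mode this PR fixes.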
Does this PR introduce any user-facing change?
How was this patch tested?