
[BugFix] Changed the minimax wrapper to accept **extra_kwargs#8866

Open
gcanlin wants to merge 1 commit into vllm-project:main from gcanlin:patch-fix

Conversation

@gcanlin
Collaborator

@gcanlin gcanlin commented May 2, 2026

What this PR does / why we need it?

The _wrapped_chat_completion_stream_generator in [patch_minimax_usage_accounting.py:293-320](https://github.com/vllm-project/vllm-ascend/tree/main/vllm_ascend/patch/platform/patch_minimax_usage_accounting.py#L293-L320) had an explicit signature that didn't include chat_template_kwargs. Upstream vllm added this parameter to chat_completion_stream_generator.

The call chain was:

  • vllm passes chat_template_kwargs=... to the GLM wrapper (which uses *args, **kwargs)
  • GLM wrapper forwards it to the minimax wrapper — which rejects it with TypeError

Fix: Changed the minimax wrapper to accept **extra_kwargs and forward them to the original, making it forward-compatible with any future new parameters.
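The pattern behind the fix can be sketched in a few lines. This is an illustrative standalone example, not the actual vllm-ascend code: original_generator stands in for vllm's chat_completion_stream_generator, and the two wrappers contrast the old explicit signature with the new **extra_kwargs one.

```python
def original_generator(request, tokenizer, chat_template_kwargs=None):
    # Stand-in for upstream vllm's chat_completion_stream_generator,
    # which recently gained the chat_template_kwargs parameter.
    return {"request": request, "chat_template_kwargs": chat_template_kwargs}


def brittle_wrapper(request, tokenizer):
    # Old style: an explicit signature that mirrors a past version of the
    # upstream function. It raises TypeError as soon as upstream adds a
    # new keyword argument that callers start passing through.
    return original_generator(request, tokenizer)


def forward_compatible_wrapper(request, tokenizer, **extra_kwargs):
    # Fixed style: accept arbitrary keyword arguments and forward them
    # unchanged, so future upstream parameters pass through transparently.
    return original_generator(request, tokenizer, **extra_kwargs)


# The brittle wrapper breaks when the new parameter is passed:
try:
    brittle_wrapper("req", "tok", chat_template_kwargs={"enable_thinking": True})
except TypeError as exc:
    print("TypeError:", exc)

# The forward-compatible wrapper forwards it without issue:
print(forward_compatible_wrapper(
    "req", "tok", chat_template_kwargs={"enable_thinking": True}))
```

The trade-off is that a **extra_kwargs signature no longer documents which parameters the wrapper expects, but for a monkey-patch that only intercepts a call and delegates to the original, forwarding everything is the safer default.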

Does this PR introduce any user-facing change?

How was this patch tested?

@gcanlin gcanlin requested a review from wangxiyuan as a code owner May 2, 2026 13:33
@gcanlin gcanlin changed the title [Bugfix] Changed the minimax wrapper to accept **extra_kwargs [BugFix] Changed the minimax wrapper to accept **extra_kwargs May 2, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a compatibility issue where the Minimax wrapper was failing due to an overly restrictive function signature. By accepting and forwarding arbitrary keyword arguments, the wrapper now correctly handles updates from upstream vllm, ensuring seamless integration with new parameters like chat_template_kwargs.

Highlights

  • Signature Update: Updated the _wrapped_chat_completion_stream_generator function signature to include **extra_kwargs.
  • Forward Compatibility: Ensured that additional arguments passed through the call chain are correctly forwarded, preventing TypeError exceptions.



@github-actions
Contributor

github-actions Bot commented May 2, 2026

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write a commit message that fulfills the PR description, to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates the _wrapped_chat_completion_stream_generator to accept and forward **extra_kwargs, ensuring compatibility with recent upstream vLLM changes. The review feedback points out that the PR title and summary need to be updated to follow the repository's style guide. Furthermore, it is recommended to apply the same modification to the non-streaming _wrapped_chat_completion_full_generator to prevent TypeError when new parameters are passed through that path.

tokenizer,
request_metadata: engine_protocol.RequestResponseMetadata,
reasoning_parser=None,
**extra_kwargs: Any,
Contributor


high

The pull request title and summary do not adhere to the repository style guide. The title is missing the module prefix, and the summary sections for user-facing changes and testing are incomplete.

Suggested PR Title:

[Ops][BugFix] Changed the minimax wrapper to accept **extra_kwargs

Suggested PR Summary:

### What this PR does / why we need it?

This PR updates the `_wrapped_chat_completion_stream_generator` to accept and forward `**extra_kwargs`. This ensures compatibility with recent changes in upstream vLLM where `chat_template_kwargs` was added to the signature, preventing `TypeError` when these arguments are passed.

Fixes #

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

(Please provide details on how this change was verified, e.g., manual testing with affected models or CI results).

References

  1. PR Title and Summary must follow the specified format and include all required sections with content. (link)

tokenizer,
request_metadata: engine_protocol.RequestResponseMetadata,
reasoning_parser=None,
**extra_kwargs: Any,
Contributor


high

The fix applied to _wrapped_chat_completion_stream_generator should also be applied to _wrapped_chat_completion_full_generator (starting at line 327). Currently, the non-streaming generator is still missing the **extra_kwargs parameter and does not forward it to the original generator, which will cause a TypeError when upstream vLLM passes new parameters like chat_template_kwargs through the non-streaming path.

Signed-off-by: gcanlin <canlinguosdu@gmail.com>
@gcanlin gcanlin added the ready (read for review) and ready-for-test (start test by label for PR) labels May 2, 2026