[CI] Upgrade vllm commit to 0512 #9054
Summary of Changes
This pull request updates the reference commit hash for vLLM within the project's documentation configuration. This ensures that the documentation remains aligned with the latest developments and features in the vLLM repository.
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request updates the main_vllm_commit hash in docs/source/conf.py to track a more recent version of the upstream vLLM repository. The review feedback identifies that the PR title and summary do not comply with the repository's style guide, specifically noting a missing action tag and a typo in the title, and provides a structured template for the required summary.
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Signed-off-by: wangli <wangli858794774@gmail.com>
What this PR does / why we need it?
- Fix for [Bugfix] Fix SP pass for multimodal models and PP+SP residual handling vllm#33322: overwrite `gpu_modelrunner.sync_and_gather_intermediate_tensors`; in the pp+sp+tp scenario, skip scattering the residual on Ascend.
- [Model Runner V2] support qwen35 / mamba hybrid model vllm#35520: adapt to the `ModelRunner v2` changes for hybrid attention at the interface level. TODO: add Mamba support to the ModelRunner on Ascend; pull requests are welcome.
- [Aiter][ROCm] gdn_linear_attn kernel fusion vllm#40711
- [Attention][Cleanup] Remove tree attention vllm#42121
- [Model] use AutoWeightsLoader for DeepSeekV2 vllm#41706
- [Core] Replace routing replay with device cache and async D2H pipeline vllm#39917: disable `async_schedule` when `enable_return_routed_experts=True`.
- [MoE Refactor] Move expert map related code into ExpertMapManager class vllm#41046
- [MoE Refactor] EPLB refactoring for FusedMoE vllm#41055
- [Model Runner V2] Apply synthetic mode to probabilistic rejection sampler vllm#41035
- Revert "[Core] Replace routing replay with device cache and async D2H pipeline" (#39917) vllm#42434
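The rule for vllm#39917 above (turn off `async_schedule` whenever `enable_return_routed_experts=True`) could be expressed roughly as follows. This is only a sketch of the idea, not the code in this PR: the `SchedulingOptions` class and the `resolve_async_schedule` helper are hypothetical names made up for illustration, and only the two field names come from the description.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SchedulingOptions:
    """Hypothetical stand-in for the two relevant options (not a real vLLM class)."""
    async_schedule: bool = True
    enable_return_routed_experts: bool = False


def resolve_async_schedule(opts: SchedulingOptions) -> SchedulingOptions:
    """Return a copy with async scheduling disabled if routed experts are returned."""
    if opts.enable_return_routed_experts and opts.async_schedule:
        # Assumption: the two features cannot be combined, so the
        # routed-experts flag takes precedence over async scheduling.
        return replace(opts, async_schedule=False)
    return opts


requested = SchedulingOptions(async_schedule=True, enable_return_routed_experts=True)
effective = resolve_async_schedule(requested)
print(effective.async_schedule)
```

Returning a new object rather than mutating the input keeps the requested and effective settings separately inspectable, which may help when logging why async scheduling was turned off.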
Does this PR introduce any user-facing change?
How was this patch tested?