[do not merge] ibm-dev build ibm-20241024 #212
…lm-project#8909) Co-authored-by: DarkLight1337 <[email protected]>
…ty token_ids (vllm-project#9034) Co-authored-by: Nick Hill <[email protected]>
Co-authored-by: sanghol <[email protected]> Co-authored-by: Roger Wang <[email protected]> Co-authored-by: Roger Wang <[email protected]>
…ls (vllm-project#9412) Co-authored-by: DarkLight1337 <[email protected]>
…project#9267) Signed-off-by: Russell Bryant <[email protected]>
Co-authored-by: Varun Sundar Rabindranath <[email protected]> Co-authored-by: Michael Goin <[email protected]>
…t#9628) Signed-off-by: mgoin <[email protected]>
Signed-off-by: Vinay Damodaran <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]> Co-authored-by: Cyrus Leung <[email protected]>
Signed-off-by: Woosuk Kwon <[email protected]>
Co-authored-by: Zhuohan Li <[email protected]>
…-project#9639) Signed-off-by: youkaichao <[email protected]> Co-authored-by: youkaichao <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
…-project#9637) Signed-off-by: youkaichao <[email protected]> Co-authored-by: youkaichao <[email protected]>
…workflow (vllm-project#9661) Signed-off-by: Harry Mellor <[email protected]>
…-project#9641) Signed-off-by: youkaichao <[email protected]> Co-authored-by: youkaichao <[email protected]>
…20241024 Signed-off-by: Jefferson Fialho <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: fialhocoelho.
@fialhocoelho: The following tests failed, say
…ahub-io#245) For models (such as chatglm2-6b and chatglm3-6b) whose `rotary_dim` is not equal to `head_size`, the current code crashes because the dimensions do not match. The fix in opendatahub-io#212 is not robust enough: the chatglm series runs, but the chatglm2-6b results are incorrect. This fix follows vLLM's PyTorch-native rotary embedding implementation. Verified on chatglm2-6b and chatglm3-6b.
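To make the `rotary_dim` versus `head_size` point concrete, here is a minimal pure-Python sketch of a partial rotary embedding: only the first `rotary_dim` entries of a head vector are rotated and the remaining entries pass through unchanged. The function name `apply_partial_rotary`, the `base` default, and the NeoX-style pairing of entry `i` with entry `i + rotary_dim // 2` are assumptions for illustration; this is not the code from the fix, and some models pair adjacent entries instead.

```python
import math


def apply_partial_rotary(vec, position, rotary_dim, base=10000.0):
    """Rotate the first `rotary_dim` entries of one head vector.

    Hypothetical helper: entries at index >= rotary_dim are returned
    unchanged, so len(vec) (the head size) may exceed rotary_dim.
    """
    if rotary_dim % 2 != 0 or rotary_dim > len(vec):
        raise ValueError("rotary_dim must be even and at most len(vec)")

    half = rotary_dim // 2
    rot, passthrough = vec[:rotary_dim], vec[rotary_dim:]
    out = [0.0] * rotary_dim

    for i in range(half):
        # Frequency for pair i: base ** (-2i / rotary_dim).
        theta = position * base ** (-2.0 * i / rotary_dim)
        c, s = math.cos(theta), math.sin(theta)
        # Assumed NeoX-style pairing: entry i with entry i + half.
        x1, x2 = rot[i], rot[i + half]
        out[i] = x1 * c - x2 * s
        out[i + half] = x2 * c + x1 * s

    # Re-attach the untouched tail so the output keeps the head size.
    return out + list(passthrough)


head = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # head_size = 8
rotated = apply_partial_rotary(head, position=3, rotary_dim=4)
```

Splitting the vector before rotating and concatenating afterwards is what keeps the output the same length as the input when `rotary_dim` is smaller than the head size.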
SUMMARY:
sync to upstream `v0.8.4` and cherry-pick of `7eb42556281d30436a3a988f2c9184ec63c59338`. The cherry-pick is @LucasWilkinson's llama4 patch.

GIT LOG:
```bash
commit b197179 (HEAD -> sync-upstream-v0.8.4, origin/sync-upstream-v0.8.4)
Author: Lucas Wilkinson <[email protected]>
Date:   Fri Apr 18 01:13:29 2025 -0400

    [BugFix] Accuracy fix for llama4 int4 - improperly casted scales (vllm-project#16801)

    Signed-off-by: Lucas Wilkinson <[email protected]>

commit 60267cc
Author: andy-neuma <[email protected]>
Date:   Mon Apr 21 14:57:52 2025 -0400

    remove duplicate entries

commit 9d18b50
Merge: db0e117 dc1b4a6
Author: andy-neuma <[email protected]>
Date:   Mon Apr 21 14:50:01 2025 -0400

    Merge remote-tracking branch 'upstream/v0.8.4' into sync-upstream-v0.8.4

commit db0e117
Author: andy-neuma <[email protected]>
Date:   Mon Apr 21 14:35:23 2025 -0400

    Revert "Revert "[V1] DP scale-out (1/N): Use zmq ROUTER/DEALER sockets for input queue (vllm-project#15906)""

    This reverts commit 296c657.

commit dc1b4a6 (tag: v0.8.4, upstream/v0.8.4)
Author: Russell Bryant <[email protected]>
Date:   Sun Apr 13 22:13:38 2025 -0400

    [Core][V0] Enable regex support with xgrammar (vllm-project#13228)

    Signed-off-by: Russell Bryant <[email protected]>
```

COMMANDS:
```bash
git fetch upstream
git checkout -b sync-upstream-v0.8.4
git revert 296c657
git merge upstream/v0.8.4
git cherry-pick 7eb4255
```

TEST PLAN:
accept sync ... https://github.com/neuralmagic/nm-cicd/actions/runs/14581880024
release ...
https://github.com/neuralmagic/nm-cicd/actions/runs/14596026989 --------- Signed-off-by: Tristan Leclercq <[email protected]> Signed-off-by: yihong0618 <[email protected]> Signed-off-by: reidliu41 <[email protected]> Signed-off-by: Harry Mellor <[email protected]> Signed-off-by: chaunceyjiang <[email protected]> Signed-off-by: Jinzhen Lin <[email protected]> Signed-off-by: Jonghyun Choe <[email protected]> Signed-off-by: Lu Fang <[email protected]> Signed-off-by: Hyesoo Yang <[email protected]> Signed-off-by: Ben Jackson <[email protected]> Signed-off-by: Roger Wang <[email protected]> Signed-off-by: Isotr0py <[email protected]> Signed-off-by: rongfu.leng <[email protected]> Signed-off-by: Varun Sundar Rabindranath <[email protected]> Signed-off-by: paolovic <[email protected]> Signed-off-by: Chengji Yao <[email protected]> Signed-off-by: Kay Yan <[email protected]> Signed-off-by: Woosuk Kwon <[email protected]> Signed-off-by: DarkLight1337 <[email protected]> Signed-off-by: shen-shanshan <[email protected]> Signed-off-by: YamPengLi <[email protected]> Signed-off-by: WangErXiao <[email protected]> Signed-off-by: Aston Zhang <[email protected]> Signed-off-by: Chris Thi <[email protected]> Signed-off-by: drisspg <[email protected]> Signed-off-by: Jon Swenson <[email protected]> Signed-off-by: Keyun Tong <[email protected]> Signed-off-by: Lu Fang <[email protected]> Signed-off-by: Xiaodong Wang <[email protected]> Signed-off-by: Yang Chen <[email protected]> Signed-off-by: Ye (Charlotte) Qi <[email protected]> Signed-off-by: Yong Hoon Shin <[email protected]> Signed-off-by: Zijing Liu <[email protected]> Signed-off-by: Lu Fang <[email protected]> Signed-off-by: Lucia Fang <[email protected]> Signed-off-by: Gregory Shtrasberg <[email protected]> Signed-off-by: NickLucche <[email protected]> Signed-off-by: Benjamin Chislett <[email protected]> Signed-off-by: Nick Hill <[email protected]> Signed-off-by: Leon Seidel <[email protected]> Signed-off-by: mgoin <[email 
protected]> Signed-off-by: youkaichao <[email protected]> Signed-off-by: Miles Williams <[email protected]> Signed-off-by: mgoin <[email protected]> Signed-off-by: Siyuan Liu <[email protected]> Signed-off-by: Kebe <[email protected]> Signed-off-by: simon-mo <[email protected]> Signed-off-by: Alex-Brooks <[email protected]> Signed-off-by: Tianyuan Wu <[email protected]> Signed-off-by: imkero <[email protected]> Signed-off-by: Lucas Wilkinson <[email protected]> Signed-off-by: Russell Bryant <[email protected]> Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: Yue <[email protected]> Signed-off-by: tjtanaa <[email protected]> Signed-off-by: kliuae <[email protected]> Signed-off-by: luka <[email protected]> Signed-off-by: lvfei.lv <[email protected]> Signed-off-by: Ajay Vohra <[email protected]> Signed-off-by: Guillaume Calmettes <[email protected]> Signed-off-by: zh Wang <[email protected]> Signed-off-by: Chendi Xue <[email protected]> Signed-off-by: Joe Runde <[email protected]> Signed-off-by: zRzRzRzRzRzRzR <[email protected]> Signed-off-by: Aaron Ang <[email protected]> Signed-off-by: Benjamin Kitor <[email protected]> Signed-off-by: Michael Goin <[email protected]> Signed-off-by: Chenyaaang <[email protected]> Signed-off-by: cyy <[email protected]> Signed-off-by: wineandchord <[email protected]> Signed-off-by: LiuXiaoxuanPKU <[email protected]> Signed-off-by: Chih-Chieh-Yang <[email protected]> Signed-off-by: look <[email protected]> Signed-off-by: jadewang21 <[email protected]> Signed-off-by: alexey-belyakov <[email protected]> Signed-off-by: jiang.li <[email protected]> Signed-off-by: DefTruth <[email protected]> Signed-off-by: chaow <[email protected]> Signed-off-by: Tomasz Zielinski <[email protected]> Signed-off-by: rzou <[email protected]> Signed-off-by: Travis Johnson <[email protected]> Signed-off-by: Christian Sears <[email protected]> Signed-off-by: Gogs <[email protected]> Signed-off-by: Yuan Tang <[email protected]> Signed-off-by: Tianer 
Zhou <[email protected]> Signed-off-by: [email protected] <[email protected]> Signed-off-by: Jie Fu <[email protected]> Signed-off-by: snowcharm <[email protected]> Signed-off-by: Ryan McConville <[email protected]> Co-authored-by: Tristan Leclercq <[email protected]> Co-authored-by: Kevin H. Luu <[email protected]> Co-authored-by: yihong <[email protected]> Co-authored-by: Reid <[email protected]> Co-authored-by: reidliu41 <[email protected]> Co-authored-by: Harry Mellor <[email protected]> Co-authored-by: Chauncey <[email protected]> Co-authored-by: Jinzhen Lin <[email protected]> Co-authored-by: Jonghyun Choe <[email protected]> Co-authored-by: Lucia Fang <[email protected]> Co-authored-by: Hyesoo Yang <[email protected]> Co-authored-by: Ben Jackson <[email protected]> Co-authored-by: Roger Wang <[email protected]> Co-authored-by: Paul Schweigert <[email protected]> Co-authored-by: Isotr0py <[email protected]> Co-authored-by: rongfu.leng <[email protected]> Co-authored-by: Varun Sundar Rabindranath <[email protected]> Co-authored-by: Varun Sundar Rabindranath <[email protected]> Co-authored-by: paolovic <[email protected]> Co-authored-by: paolovic <[email protected]> Co-authored-by: Chengji Yao <[email protected]> Co-authored-by: Woosuk Kwon <[email protected]> Co-authored-by: Martin Hoyer <[email protected]> Co-authored-by: Kay Yan <[email protected]> Co-authored-by: Cyrus Leung <[email protected]> Co-authored-by: Shanshan Shen <[email protected]> Co-authored-by: YamPengLi <[email protected]> Co-authored-by: Cyrus Leung <[email protected]> Co-authored-by: Robin <[email protected]> Co-authored-by: Lu Fang <[email protected]> Co-authored-by: Lu Fang <[email protected]> Co-authored-by: Roger Wang <[email protected]> Co-authored-by: Gregory Shtrasberg <[email protected]> Co-authored-by: Nicolò Lucchesi <[email protected]> Co-authored-by: Benjamin Chislett <[email protected]> Co-authored-by: Nick Hill <[email protected]> Co-authored-by: leon-seidel <[email 
protected]> Co-authored-by: Driss Guessous <[email protected]> Co-authored-by: Michael Goin <[email protected]> Co-authored-by: youkaichao <[email protected]> Co-authored-by: Miles Williams <[email protected]> Co-authored-by: Satyajith Chilappagari <[email protected]> Co-authored-by: mgoin <[email protected]> Co-authored-by: Jennifer Zhao <[email protected]> Co-authored-by: zxfan-cpu <[email protected]> Co-authored-by: Yong Hoon Shin <[email protected]> Co-authored-by: Siyuan Liu <[email protected]> Co-authored-by: Kebe <[email protected]> Co-authored-by: Simon Mo <[email protected]> Co-authored-by: Alex Brooks <[email protected]> Co-authored-by: TY-AMD <[email protected]> Co-authored-by: wang.yuqi <[email protected]> Co-authored-by: Kero Liang <[email protected]> Co-authored-by: Lucas Wilkinson <[email protected]> Co-authored-by: Russell Bryant <[email protected]> Co-authored-by: Jee Jee Li <[email protected]> Co-authored-by: yueshen2016 <[email protected]> Co-authored-by: TJian <[email protected]> Co-authored-by: Hongxia Yang <[email protected]> Co-authored-by: kliuae <[email protected]> Co-authored-by: Luka Govedič <[email protected]> Co-authored-by: Accelerator1996 <[email protected]> Co-authored-by: ajayvohra2005 <[email protected]> Co-authored-by: Guillaume Calmettes <[email protected]> Co-authored-by: zh Wang <[email protected]> Co-authored-by: Chendi.Xue <[email protected]> Co-authored-by: Joe Runde <[email protected]> Co-authored-by: Yuxuan Zhang <[email protected]> Co-authored-by: Aaron Ang <[email protected]> Co-authored-by: Jintao <[email protected]> Co-authored-by: Benjamin Kitor <[email protected]> Co-authored-by: Chenyaaang <[email protected]> Co-authored-by: cyyever <[email protected]> Co-authored-by: Ye (Charlotte) Qi <[email protected]> Co-authored-by: wineandchord <[email protected]> Co-authored-by: Nicolò Lucchesi <[email protected]> Co-authored-by: Lily Liu <[email protected]> Co-authored-by: Chih-Chieh Yang <[email protected]> Co-authored-by: 
Yu Chin Fabian Lim <[email protected]> Co-authored-by: look <[email protected]> Co-authored-by: WWW <[email protected]> Co-authored-by: Alexey Belyakov <[email protected]> Co-authored-by: Li, Jiang <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: chaow-amd <[email protected]> Co-authored-by: Tomasz Zielinski <[email protected]> Co-authored-by: Richard Zou <[email protected]> Co-authored-by: Travis Johnson <[email protected]> Co-authored-by: Kai Wu <[email protected]> Co-authored-by: Christian Sears <[email protected]> Co-authored-by: Gogs <[email protected]> Co-authored-by: Yuan Tang <[email protected]> Co-authored-by: Tianer Zhou <[email protected]> Co-authored-by: Huazhong Ji <[email protected]> Co-authored-by: Jie Fu (傅杰) <[email protected]> Co-authored-by: SnowCharm <[email protected]> Co-authored-by: Ryan McConville <[email protected]> Co-authored-by: andy-neuma <[email protected]>
PR to trigger the build process for a compiled image