This repository was archived by the owner on Sep 4, 2025. It is now read-only.


@fialhocoelho

PR to trigger the build process for a compiled image

WoosukKwon and others added 30 commits October 14, 2024 15:02
Co-authored-by: sanghol <[email protected]>
Co-authored-by: Roger Wang <[email protected]>
Co-authored-by: Roger Wang <[email protected]>
Co-authored-by: Varun Sundar Rabindranath <[email protected]>
Co-authored-by: Michael Goin <[email protected]>
mgoin and others added 21 commits October 24, 2024 10:07
Co-authored-by: Cyrus Leung <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
@fialhocoelho fialhocoelho requested a review from njhill as a code owner October 24, 2024 18:19
@openshift-ci openshift-ci bot requested review from NickLucche and dtrifiro October 24, 2024 18:20
@openshift-ci

openshift-ci bot commented Oct 24, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: fialhocoelho
Once this PR has been reviewed and has the lgtm label, please assign raghul-m for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci

openshift-ci bot commented Oct 24, 2024

@fialhocoelho: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/rocm-pr-image-mirror | 080973c | link | true | `/test rocm-pr-image-mirror` |
| ci/prow/images | 080973c | link | true | `/test images` |
| ci/prow/smoke-test | 080973c | link | true | `/test smoke-test` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

riprasad pushed a commit to red-hat-data-services/vllm that referenced this pull request Oct 25, 2024
…ahub-io#245)


For models (such as chatglm2-6b and chatglm3-6b) whose `rotary_dim` is not
equal to `head_size`, the current code crashes because of the dimension
mismatch. opendatahub-io#212 contains a fix that is not robust enough: the
chatglm series runs, but the chatglm2-6b output is incorrect.
This fix follows vLLM's PyTorch-native rotary embedding implementation. It
was verified on chatglm2-6b and chatglm3-6b.
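
To illustrate the `rotary_dim` != `head_size` case, here is a minimal pure-Python sketch of a partial rotary embedding: only the first `rotary_dim` entries of a head vector are rotated and the remaining entries pass through unchanged. This is not the code from this change. The function name `apply_partial_rotary`, the `base` default, and the NeoX-style half-split layout are all assumptions made for illustration.

```python
import math


def apply_partial_rotary(x, position, rotary_dim, base=10000.0):
    """Rotate the first `rotary_dim` entries of one head vector.

    Hypothetical helper for illustration only. Assumes the NeoX-style
    layout, where entry i is paired with entry i + rotary_dim // 2.
    Entries at index >= rotary_dim are returned unchanged.
    """
    head_size = len(x)
    if rotary_dim % 2 != 0 or rotary_dim > head_size:
        raise ValueError("rotary_dim must be even and <= head_size")

    half = rotary_dim // 2
    x_rot, x_pass = x[:rotary_dim], x[rotary_dim:]

    out = [0.0] * rotary_dim
    for i in range(half):
        # Frequencies are derived from rotary_dim, not head_size.
        inv_freq = base ** (-2.0 * i / rotary_dim)
        angle = position * inv_freq
        c, s = math.cos(angle), math.sin(angle)
        x1, x2 = x_rot[i], x_rot[half + i]
        out[i] = x1 * c - x2 * s
        out[half + i] = x2 * c + x1 * s

    # Concatenate the rotated slice with the untouched tail.
    return out + list(x_pass)


if __name__ == "__main__":
    head = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # head_size = 8
    print(apply_partial_rotary(head, position=3, rotary_dim=4))
```

Slicing before rotating means the cos/sin terms never have to match the full `head_size`. Applying the rotation across the whole head is one way a dimension mismatch like the one described above can arise.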


---

<details>
<!-- inside this <details> section, markdown rendering does not work, so
we use raw html here. -->
<summary><b> PR Checklist (Click to Expand) </b></summary>

<p>Thank you for your contribution to vLLM! Before submitting the pull
request, please ensure the PR meets the following criteria. This helps
vLLM maintain the code quality and improve the efficiency of the review
process.</p>

<h3>PR Title and Classification</h3>
<p>Only specific types of PRs will be reviewed. The PR title is prefixed
appropriately to indicate the type of change. Please use one of the
following:</p>
<ul>
<li><code>[Bugfix]</code> for bug fixes.</li>
<li><code>[CI/Build]</code> for build or continuous integration
improvements.</li>
<li><code>[Doc]</code> for documentation fixes and improvements.</li>
<li><code>[Model]</code> for adding a new model or improving an existing
model. Model name should appear in the title.</li>
<li><code>[Frontend]</code> for changes to the vLLM frontend (e.g.,
OpenAI API server, <code>LLM</code> class, etc.)</li>
<li><code>[Kernel]</code> for changes affecting CUDA kernels or other
compute kernels.</li>
<li><code>[Core]</code> for changes in the core vLLM logic (e.g.,
<code>LLMEngine</code>, <code>AsyncLLMEngine</code>,
<code>Scheduler</code>, etc.)</li>
<li><code>[Hardware][Vendor]</code> for hardware-specific changes.
Vendor name should appear in the prefix (e.g.,
<code>[Hardware][AMD]</code>).</li>
<li><code>[Misc]</code> for PRs that do not fit the above categories.
Please use this sparingly.</li>
</ul>
<p><strong>Note:</strong> If the PR spans more than one category, please
include all relevant prefixes.</p>

<h3>Code Quality</h3>

<p>The PR needs to meet the following code quality standards:</p>

<ul>
<li>We adhere to <a
href="https://google.github.io/styleguide/pyguide.html">Google Python
style guide</a> and <a
href="https://google.github.io/styleguide/cppguide.html">Google C++
style guide</a>.</li>
<li>Pass all linter checks. Please use <a
href="https://github.com/vllm-project/vllm/blob/main/format.sh"><code>format.sh</code></a>
to format your code.</li>
<li>The code needs to be well-documented so that future contributors
can easily understand it.</li>
<li>Include sufficient tests to ensure the project stays correct and
robust. This includes both unit tests and integration tests.</li>
<li>Please add documentation to <code>docs/source/</code> if the PR
modifies the user-facing behaviors of vLLM. This helps vLLM users
understand and utilize the new features or changes.</li>
</ul>

<h3>Notes for Large Changes</h3>
<p>Please keep the changes as concise as possible. For major
architectural changes (>500 LOC excluding kernel/data/config/test), we
expect a GitHub issue (RFC) discussing the technical design and
justification. Otherwise, we will tag the PR with
<code>rfc-required</code> and might not review it.</p>

<h3>What to Expect for the Reviews</h3>

<p>The goal of the vLLM team is to be a <i>transparent reviewing
machine</i>. We would like to make the review process transparent and
efficient and make sure no contributor feels confused or frustrated.
However, the vLLM team is small, so we need to prioritize some PRs over
others. Here is what you can expect from the review process:</p>

<ul>
<li> After the PR is submitted, the PR will be assigned to a reviewer.
Every reviewer will pick up the PRs based on their expertise and
availability.</li>
<li> After the PR is assigned, the reviewer will provide a status update
every 2-3 days. If the PR is not reviewed within 7 days, please feel
free to ping the reviewer or the vLLM team.</li>
<li> After the review, the reviewer will put an
<code>action-required</code> label on the PR if changes are required.
The contributor should address the comments and ping the reviewer to
re-review the PR.</li>
<li> Please respond to all comments within a reasonable time frame. If a
comment isn't clear or you disagree with a suggestion, feel free to ask
for clarification or discuss the suggestion.
 </li>
</ul>

<h3>Thank You</h3>

<p> Finally, thank you for taking the time to read these guidelines and
for your interest in contributing to vLLM. Your contributions make vLLM
a great tool for everyone! </p>


</details>
heyselbi pushed a commit to red-hat-data-services/vllm that referenced this pull request Apr 24, 2025
SUMMARY:

Sync to upstream `v0.8.4` and cherry-pick
`7eb42556281d30436a3a988f2c9184ec63c59338`, which is
@LucasWilkinson's llama4 patch.

GIT LOG:
```bash
commit b197179 (HEAD -> sync-upstream-v0.8.4, origin/sync-upstream-v0.8.4)
Author: Lucas Wilkinson <[email protected]>
Date:   Fri Apr 18 01:13:29 2025 -0400

    [BugFix] Accuracy fix for llama4 int4 - improperly casted scales (vllm-project#16801)
    
    Signed-off-by: Lucas Wilkinson <[email protected]>

commit 60267cc
Author: andy-neuma <[email protected]>
Date:   Mon Apr 21 14:57:52 2025 -0400

    remove duplicate entries

commit 9d18b50
Merge: db0e117 dc1b4a6
Author: andy-neuma <[email protected]>
Date:   Mon Apr 21 14:50:01 2025 -0400

    Merge remote-tracking branch 'upstream/v0.8.4' into sync-upstream-v0.8.4

commit db0e117
Author: andy-neuma <[email protected]>
Date:   Mon Apr 21 14:35:23 2025 -0400

    Revert "Revert "[V1] DP scale-out (1/N): Use zmq ROUTER/DEALER sockets for input queue (vllm-project#15906)""
    
    This reverts commit 296c657.

commit dc1b4a6 (tag: v0.8.4, upstream/v0.8.4)
Author: Russell Bryant <[email protected]>
Date:   Sun Apr 13 22:13:38 2025 -0400

    [Core][V0] Enable regex support with xgrammar (vllm-project#13228)
    
    Signed-off-by: Russell Bryant <[email protected]>
```

COMMANDS:
```bash
git fetch upstream
git checkout -b sync-upstream-v0.8.4
git revert 296c657
git merge upstream/v0.8.4
git cherry-pick 7eb4255
```

TEST PLAN:

accept sync ...
https://github.com/neuralmagic/nm-cicd/actions/runs/14581880024

release ...
https://github.com/neuralmagic/nm-cicd/actions/runs/14596026989

---------

Signed-off-by: Tristan Leclercq <[email protected]>
Signed-off-by: yihong0618 <[email protected]>
Signed-off-by: reidliu41 <[email protected]>
Signed-off-by: Harry Mellor <[email protected]>
Signed-off-by: chaunceyjiang <[email protected]>
Signed-off-by: Jinzhen Lin <[email protected]>
Signed-off-by: Jonghyun Choe <[email protected]>
Signed-off-by: Lu Fang <[email protected]>
Signed-off-by: Hyesoo Yang <[email protected]>
Signed-off-by: Ben Jackson <[email protected]>
Signed-off-by: Roger Wang <[email protected]>
Signed-off-by: Isotr0py <[email protected]>
Signed-off-by: rongfu.leng <[email protected]>
Signed-off-by: Varun Sundar Rabindranath <[email protected]>
Signed-off-by: paolovic <[email protected]>
Signed-off-by: Chengji Yao <[email protected]>
Signed-off-by: Kay Yan <[email protected]>
Signed-off-by: Woosuk Kwon <[email protected]>
Signed-off-by: DarkLight1337 <[email protected]>
Signed-off-by: shen-shanshan <[email protected]>
Signed-off-by: YamPengLi <[email protected]>
Signed-off-by: WangErXiao <[email protected]>
Signed-off-by: Aston Zhang <[email protected]>
Signed-off-by: Chris Thi <[email protected]>
Signed-off-by: drisspg <[email protected]>
Signed-off-by: Jon Swenson <[email protected]>
Signed-off-by: Keyun Tong <[email protected]>
Signed-off-by: Lu Fang <[email protected]>
Signed-off-by: Xiaodong Wang <[email protected]>
Signed-off-by: Yang Chen <[email protected]>
Signed-off-by: Ye (Charlotte) Qi <[email protected]>
Signed-off-by: Yong Hoon Shin <[email protected]>
Signed-off-by: Zijing Liu <[email protected]>
Signed-off-by: Lu Fang <[email protected]>
Signed-off-by: Lucia Fang <[email protected]>
Signed-off-by: Gregory Shtrasberg <[email protected]>
Signed-off-by: NickLucche <[email protected]>
Signed-off-by: Benjamin Chislett <[email protected]>
Signed-off-by: Nick Hill <[email protected]>
Signed-off-by: Leon Seidel <[email protected]>
Signed-off-by: mgoin <[email protected]>
Signed-off-by: youkaichao <[email protected]>
Signed-off-by: Miles Williams <[email protected]>
Signed-off-by: mgoin <[email protected]>
Signed-off-by: Siyuan Liu <[email protected]>
Signed-off-by: Kebe <[email protected]>
Signed-off-by: simon-mo <[email protected]>
Signed-off-by: Alex-Brooks <[email protected]>
Signed-off-by: Tianyuan Wu <[email protected]>
Signed-off-by: imkero <[email protected]>
Signed-off-by: Lucas Wilkinson <[email protected]>
Signed-off-by: Russell Bryant <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Yue <[email protected]>
Signed-off-by: tjtanaa <[email protected]>
Signed-off-by: kliuae <[email protected]>
Signed-off-by: luka <[email protected]>
Signed-off-by: lvfei.lv <[email protected]>
Signed-off-by: Ajay Vohra <[email protected]>
Signed-off-by: Guillaume Calmettes <[email protected]>
Signed-off-by: zh Wang <[email protected]>
Signed-off-by: Chendi Xue <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
Signed-off-by: zRzRzRzRzRzRzR <[email protected]>
Signed-off-by: Aaron Ang <[email protected]>
Signed-off-by: Benjamin Kitor <[email protected]>
Signed-off-by: Michael Goin <[email protected]>
Signed-off-by: Chenyaaang <[email protected]>
Signed-off-by: cyy <[email protected]>
Signed-off-by: wineandchord <[email protected]>
Signed-off-by: LiuXiaoxuanPKU <[email protected]>
Signed-off-by: Chih-Chieh-Yang <[email protected]>
Signed-off-by: look <[email protected]>
Signed-off-by: jadewang21 <[email protected]>
Signed-off-by: alexey-belyakov <[email protected]>
Signed-off-by: jiang.li <[email protected]>
Signed-off-by: DefTruth <[email protected]>
Signed-off-by: chaow <[email protected]>
Signed-off-by: Tomasz Zielinski <[email protected]>
Signed-off-by: rzou <[email protected]>
Signed-off-by: Travis Johnson <[email protected]>
Signed-off-by: Christian Sears <[email protected]>
Signed-off-by: Gogs <[email protected]>
Signed-off-by: Yuan Tang <[email protected]>
Signed-off-by: Tianer Zhou <[email protected]>
Signed-off-by: [email protected] <[email protected]>
Signed-off-by: Jie Fu <[email protected]>
Signed-off-by: snowcharm <[email protected]>
Signed-off-by: Ryan McConville <[email protected]>
Co-authored-by: Tristan Leclercq <[email protected]>
Co-authored-by: Kevin H. Luu <[email protected]>
Co-authored-by: yihong <[email protected]>
Co-authored-by: Reid <[email protected]>
Co-authored-by: reidliu41 <[email protected]>
Co-authored-by: Harry Mellor <[email protected]>
Co-authored-by: Chauncey <[email protected]>
Co-authored-by: Jinzhen Lin <[email protected]>
Co-authored-by: Jonghyun Choe <[email protected]>
Co-authored-by: Lucia Fang <[email protected]>
Co-authored-by: Hyesoo Yang <[email protected]>
Co-authored-by: Ben Jackson <[email protected]>
Co-authored-by: Roger Wang <[email protected]>
Co-authored-by: Paul Schweigert <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
Co-authored-by: rongfu.leng <[email protected]>
Co-authored-by: Varun Sundar Rabindranath <[email protected]>
Co-authored-by: Varun Sundar Rabindranath <[email protected]>
Co-authored-by: paolovic <[email protected]>
Co-authored-by: paolovic <[email protected]>
Co-authored-by: Chengji Yao <[email protected]>
Co-authored-by: Woosuk Kwon <[email protected]>
Co-authored-by: Martin Hoyer <[email protected]>
Co-authored-by: Kay Yan <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Co-authored-by: Shanshan Shen <[email protected]>
Co-authored-by: YamPengLi <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Co-authored-by: Robin <[email protected]>
Co-authored-by: Lu Fang <[email protected]>
Co-authored-by: Lu Fang <[email protected]>
Co-authored-by: Roger Wang <[email protected]>
Co-authored-by: Gregory Shtrasberg <[email protected]>
Co-authored-by: Nicolò Lucchesi <[email protected]>
Co-authored-by: Benjamin Chislett <[email protected]>
Co-authored-by: Nick Hill <[email protected]>
Co-authored-by: leon-seidel <[email protected]>
Co-authored-by: Driss Guessous <[email protected]>
Co-authored-by: Michael Goin <[email protected]>
Co-authored-by: youkaichao <[email protected]>
Co-authored-by: Miles Williams <[email protected]>
Co-authored-by: Satyajith Chilappagari <[email protected]>
Co-authored-by: mgoin <[email protected]>
Co-authored-by: Jennifer Zhao <[email protected]>
Co-authored-by: zxfan-cpu <[email protected]>
Co-authored-by: Yong Hoon Shin <[email protected]>
Co-authored-by: Siyuan Liu <[email protected]>
Co-authored-by: Kebe <[email protected]>
Co-authored-by: Simon Mo <[email protected]>
Co-authored-by: Alex Brooks <[email protected]>
Co-authored-by: TY-AMD <[email protected]>
Co-authored-by: wang.yuqi <[email protected]>
Co-authored-by: Kero Liang <[email protected]>
Co-authored-by: Lucas Wilkinson <[email protected]>
Co-authored-by: Russell Bryant <[email protected]>
Co-authored-by: Jee Jee Li <[email protected]>
Co-authored-by: yueshen2016 <[email protected]>
Co-authored-by: TJian <[email protected]>
Co-authored-by: Hongxia Yang <[email protected]>
Co-authored-by: kliuae <[email protected]>
Co-authored-by: Luka Govedič <[email protected]>
Co-authored-by: Accelerator1996 <[email protected]>
Co-authored-by: ajayvohra2005 <[email protected]>
Co-authored-by: Guillaume Calmettes <[email protected]>
Co-authored-by: zh Wang <[email protected]>
Co-authored-by: Chendi.Xue <[email protected]>
Co-authored-by: Joe Runde <[email protected]>
Co-authored-by: Yuxuan Zhang <[email protected]>
Co-authored-by: Aaron Ang <[email protected]>
Co-authored-by: Jintao <[email protected]>
Co-authored-by: Benjamin Kitor <[email protected]>
Co-authored-by: Chenyaaang <[email protected]>
Co-authored-by: cyyever <[email protected]>
Co-authored-by: Ye (Charlotte) Qi <[email protected]>
Co-authored-by: wineandchord <[email protected]>
Co-authored-by: Nicolò Lucchesi <[email protected]>
Co-authored-by: Lily Liu <[email protected]>
Co-authored-by: Chih-Chieh Yang <[email protected]>
Co-authored-by: Yu Chin Fabian Lim <[email protected]>
Co-authored-by: look <[email protected]>
Co-authored-by: WWW <[email protected]>
Co-authored-by: Alexey Belyakov <[email protected]>
Co-authored-by: Li, Jiang <[email protected]>
Co-authored-by: DefTruth <[email protected]>
Co-authored-by: chaow-amd <[email protected]>
Co-authored-by: Tomasz Zielinski <[email protected]>
Co-authored-by: Richard Zou <[email protected]>
Co-authored-by: Travis Johnson <[email protected]>
Co-authored-by: Kai Wu <[email protected]>
Co-authored-by: Christian Sears <[email protected]>
Co-authored-by: Gogs <[email protected]>
Co-authored-by: Yuan Tang <[email protected]>
Co-authored-by: Tianer Zhou <[email protected]>
Co-authored-by: Huazhong Ji <[email protected]>
Co-authored-by: Jie Fu (傅杰) <[email protected]>
Co-authored-by: SnowCharm <[email protected]>
Co-authored-by: Ryan McConville <[email protected]>
Co-authored-by: andy-neuma <[email protected]>