This repository was archived by the owner on Sep 4, 2025. It is now read-only.

@fialhocoelho

Image build.

DarkLight1337 and others added 30 commits October 17, 2024 13:55
Co-authored-by: Varun Sundar Rabindranath <[email protected]>
Co-authored-by: Michael Goin <[email protected]>
…vllm-project#8704)

Remove block manager v1. This is the first step toward a prefix-caching-centric design: to get there, we need to simplify the code path so that only the v2 block manager is used (it performs much better with prefix caching).
…ect#9056)

Signed-off-by: Max de Bayser <[email protected]>
Signed-off-by: Max de Bayser <[email protected]>
Co-authored-by: Andrew Feldman <[email protected]>
Co-authored-by: afeldman-nm <[email protected]>
Co-authored-by: Woosuk Kwon <[email protected]>
Co-authored-by: laishzh <[email protected]>
Co-authored-by: Max de Bayser <[email protected]>
Co-authored-by: Max de Bayser <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Co-authored-by: Kunjan Patel <kunjanp_google_com@vllm.us-central1-a.c.kunjanp-gke-dev-2.internal>
…r cores (vllm-project#9497)

Signed-off-by: Thomas Parnell <[email protected]>
Co-authored-by: Chih-Chieh Yang <[email protected]>
Co-authored-by: Cody Yu <[email protected]>
robertgshaw2-redhat and others added 15 commits November 4, 2024 16:01
Signed-off-by: Linkun Chen <[email protected]>
Co-authored-by: Linkun Chen <[email protected]>
Co-authored-by: Linkun Chen <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
@fialhocoelho fialhocoelho self-assigned this Nov 5, 2024
@fialhocoelho fialhocoelho requested a review from njhill as a code owner November 5, 2024 15:14
@openshift-ci openshift-ci bot requested review from heyselbi and rpancham November 5, 2024 15:14
@openshift-ci

openshift-ci bot commented Nov 5, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: fialhocoelho
Once this PR has been reviewed and has the lgtm label, please assign danielezonca for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Signed-off-by: Jefferson Fialho <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>

Pin tiktoken >=0.7.0, <0.8.0

Signed-off-by: Jefferson Fialho <[email protected]>

Pin tiktoken==0.7.0

Signed-off-by: Jefferson Fialho <[email protected]>

Pin pillow==10.4.0

Signed-off-by: Jefferson Fialho <[email protected]>

Pin PyTorch in CMakeLists.txt

Signed-off-by: Jefferson Fialho <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
Signed-off-by: Jefferson Fialho <[email protected]>
@openshift-ci

openshift-ci bot commented Nov 6, 2024

@fialhocoelho: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/rocm-pr-image-mirror | 9b51d3d | link | true | `/test rocm-pr-image-mirror` |
| ci/prow/images | 9b51d3d | link | true | `/test images` |
| ci/prow/smoke-test | 9b51d3d | link | true | `/test smoke-test` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

groenenboomj pushed a commit that referenced this pull request Feb 24, 2025
…218)

* Automatically set rpd env var with profile flag

* Add readme

* Fix lint errors

---------

Co-authored-by: AdrianAbeyta <[email protected]>
dtrifiro pushed a commit to red-hat-data-services/vllm that referenced this pull request May 13, 2025
Syncs the midstream NM fork to the upstream tag [v0.8.5.post1](https://github.com/vllm-project/vllm/tree/v0.8.5.post1), plus these cherry-picks (CP):

* vllm-project@be633fb, needed for benchmarks
* [CP](neuralmagic/nm-vllm-ent@1fe447d) for the compressed tensor bump
* [CP](vllm-project#17677) for LoRA on AMD
* [CP](vllm-project#17315) for Llama 4 with pure dense layers

```
commit 31c73ba (HEAD -> upstream-v0.8.5, nm-fork/upstream-v0.8.5)
Author: Chauncey <[email protected]>
Date:   Wed Apr 30 15:11:04 2025 +0800

    [Bugfix] Fix AttributeError: 'State' object has no attribute 'engine_client' (vllm-project#17434)
    
    Signed-off-by: chaunceyjiang <[email protected]>

commit f8db0bd
Author: Lucas Wilkinson <[email protected]>
Date:   Fri May 2 14:01:38 2025 -0400

    [BugFix][Attention] Fix sliding window attention in V1 giving incorrect results (vllm-project#17574)
    
    Signed-off-by: Lucas Wilkinson <[email protected]>

commit e335c34
Author: Robert Shaw <[email protected]>
Date:   Fri May 2 04:07:03 2025 -0400

    [BugFix] Fix Memory Leak (vllm-project#17567)
    
    Signed-off-by: [email protected] <[email protected]>

commit cc463fe
Merge: 1e358ff ba41cc9
Author: Selbi Nuryyeva <[email protected]>
Date:   Tue Apr 29 12:34:57 2025 -0400

    Merge branch 'tag-upstream-v0.8.5' into upstream-v0.8.5

commit ba41cc9 (tag: v0.8.5, tag-upstream-v0.8.5)
Author: Michael Goin <[email protected]>
Date:   Mon Apr 28 16:20:24 2025 -0600

    [Model] Add tuned triton fused_moe configs for Qwen3Moe (vllm-project#17328)
    
    Signed-off-by: mgoin <[email protected]>

commit dcbac4c
Author: Simon Mo <[email protected]>
Date:   Mon Apr 28 14:12:01 2025 -0700

    [Model] Qwen3 Dense FP8 Compat Fixes (vllm-project#17318)
    
    Signed-off-by: simon-mo <[email protected]>
[...]
```

Commands
```
git fetch upstream
git checkout -b upstream-v0.8.5
git merge upstream/v0.8.5
git cherry-pick be633fb
```
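
After running the commands above, it can help to confirm that the merged tag really is part of the new branch. The following is a minimal sketch, not part of the original PR: `verify_sync` is a hypothetical helper name, and it relies only on `git merge-base --is-ancestor`. Note that this check applies to merged refs such as the tag; a cherry-picked commit is re-created under a new hash, so `be633fb` itself would not show up as an ancestor.

```shell
# Hypothetical helper (assumption, not from the PR): succeed only if every
# given ref is already contained in the currently checked-out branch.
verify_sync() {
    for ref in "$@"; do
        # --is-ancestor exits 0 when $ref is reachable from HEAD
        if git merge-base --is-ancestor "$ref" HEAD 2>/dev/null; then
            echo "ok: $ref is in HEAD"
        else
            echo "missing: $ref is not in HEAD"
            return 1
        fi
    done
}

# Example, run inside the synced checkout:
# verify_sync v0.8.5
```

The function stops at the first ref that is missing or unknown and returns a non-zero status, so it can be chained in a script before pushing the sync branch.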

TEST PLAN
accept sync:
https://github.com/neuralmagic/nm-cicd/actions/runs/14841223552
related PR in cicd: neuralmagic/nm-cicd#99
release workflow:
https://github.com/neuralmagic/nm-cicd/actions/runs/14845693864