
Bump up version to 0.1.1 #204

Merged
zhuohan123 merged 1 commit into main from bumpup-version-0-1-1
Jun 22, 2023

Conversation

@zhuohan123
Member

No description provided.

@zhuohan123 zhuohan123 requested a review from WoosukKwon June 22, 2023 07:33
Collaborator

@WoosukKwon WoosukKwon left a comment


Thanks!

@zhuohan123 zhuohan123 merged commit 83658c8 into main Jun 22, 2023
@zhuohan123 zhuohan123 deleted the bumpup-version-0-1-1 branch June 22, 2023 07:33
@zhuohan123 zhuohan123 restored the bumpup-version-0-1-1 branch June 22, 2023 07:34
@WoosukKwon WoosukKwon deleted the bumpup-version-0-1-1 branch June 22, 2023 08:01
michaelfeil pushed a commit to michaelfeil/vllm that referenced this pull request Jun 24, 2023
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
yukavio pushed a commit to yukavio/vllm that referenced this pull request Jul 3, 2024
SUMMARY:
* update NIGHTLY workflow to be whl-centric
* update benchmarking jobs to use generated whl

TEST PLAN:
Runs on remote push. I'm also triggering NIGHTLY manually.

---------

Co-authored-by: andy-neuma <andy@neuralmagic.com>
Co-authored-by: Domenic Barbuzzi <domenic@neuralmagic.com>
Co-authored-by: Domenic Barbuzzi <dbarbuzzi@gmail.com>
mht-sharma pushed a commit to mht-sharma/vllm that referenced this pull request Oct 30, 2024
dtrifiro added a commit to dtrifiro/vllm that referenced this pull request Apr 7, 2025
"variables" in `docker-bake.hcl` can have defaults, but are overridden
by env vars with the same name. We can remove these (useless) defaults
and fix the name for `GITHUB_REPO` (it's actually `GITHUB_REPOSITORY`)
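
For context, a minimal sketch of the pattern being fixed; this fragment is illustrative, not the file's actual contents:

```hcl
# A variable with a default; `docker buildx bake` overrides it with an
# environment variable of the same name, making the default dead weight
# when the CI environment always sets the variable.
variable "GITHUB_REPOSITORY" {
  default = ""
}

target "cuda" {
  labels = {
    # Interpolated from the env var (or the default, if unset).
    "org.opencontainers.image.source" = "https://github.com/${GITHUB_REPOSITORY}"
  }
}
```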


Example:
```bash 
env \
  GITHUB_REPOSITORY=neuralmagic/nm-vllm-ent \
  PYTHON_VERSION=3.12 \
  GITHUB_SHA=$(git rev-parse HEAD) \
  VLLM_VERSION=0.8.3 \
  docker buildx bake cuda --print
```
output:
```json
{
  "group": {
    "default": {
      "targets": [
        "cuda"
      ]
    }
  },
  "target": {
    "cuda": {
      "context": ".",
      "dockerfile": "Dockerfile.ubi",
      "args": {
        "BASE_UBI_IMAGE_TAG": "9.5-1739420147",
        "FLASHINFER_VERSION": "https://github.com/flashinfer-ai/flashinfer/releases/download/v0.2.1.post1/flashinfer_python-0.2.1.post1+cu124torch2.5-cp38-abi3-linux_x86_64.whl",
        "LIBSODIUM_VERSION": "1.0.20",
        "PYTHON_VERSION": "3.12",
        "VLLM_TGIS_ADAPTER_VERSION": "0.6.3"
      },
      "labels": {
        "org.opencontainers.image.source": "https://github.com/neuralmagic/nm-vllm-ent",
        "vcs-ref": "9803ee1c6d30330c9dc3fca6d42491794f135013",
        "vcs-type": "git"
      },
      "tags": [
        "quay.io/vllm/vllm:0.8.3",
        "quay.io/vllm/vllm:9803ee1c6d30330c9dc3fca6d42491794f135013",
        "quay.io/vllm/vllm:2025-04-04-17-55"
      ],
      "platforms": [
        "linux/amd64"
      ]
    }
  }
}
```
chaojun-zhang pushed a commit to chaojun-zhang/vllm that referenced this pull request Jun 17, 2025
* use 2025.1.1 instead (vllm-project#196)

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>

* Use standalone_compile by default in torch >= 2.8.0 (vllm-project#18846)

Signed-off-by: rzou <zou3519@gmail.com>

* fix xpu compile issue

---------

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Signed-off-by: rzou <zou3519@gmail.com>
Co-authored-by: Richard Zou <zou3519@users.noreply.github.com>
chaojun-zhang pushed a commit to chaojun-zhang/vllm that referenced this pull request Jun 17, 2025
jikunshang added a commit to jikunshang/vllm that referenced this pull request Jun 18, 2025
zhenwei-intel pushed a commit to zhenwei-intel/vllm that referenced this pull request Jun 23, 2025
jikunshang added a commit to jikunshang/vllm that referenced this pull request Jun 24, 2025
amy-why-3459 pushed a commit to amy-why-3459/vllm that referenced this pull request Sep 15, 2025
…nd v_cache. (vllm-project#204)

This PR changes the shape of the KV cache to avoid the view of k_cache and
v_cache. It also caches the metadata of k_cache and v_cache to avoid
duplicative slice operations, improving performance.

Signed-off-by: hw_whx <wanghexiang7@huawei.com>
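
The caching idea above can be sketched in plain Python. This is a hypothetical illustration, not vLLM's actual code; the class and parameter names are invented, and a plain list stands in for the cache tensor:

```python
class KVCacheMeta:
    """Caches per-layer slices of a flat KV cache buffer.

    Slicing happens once at construction, so the hot path does an O(1)
    list lookup instead of re-slicing the cache on every step.
    """

    def __init__(self, num_layers, blocks_per_layer, cache):
        self.cache = cache
        # Pre-compute and cache the per-layer views once.
        self.layer_views = [
            cache[i * blocks_per_layer:(i + 1) * blocks_per_layer]
            for i in range(num_layers)
        ]

    def kv_for_layer(self, layer_idx):
        # Reuse the cached view: no duplicative slice operations here.
        return self.layer_views[layer_idx]


# Toy usage: 3 layers, 4 cache blocks each.
cache = list(range(12))
meta = KVCacheMeta(num_layers=3, blocks_per_layer=4, cache=cache)
print(meta.kv_for_layer(1))  # blocks belonging to layer 1: [4, 5, 6, 7]
```

The same trade-off applies to real tensors: computing a view or slice is cheap, but doing it per token per layer still adds avoidable overhead in a decode loop.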
iwooook pushed a commit to moreh-dev/vllm that referenced this pull request Nov 29, 2025
* gpt oss integration

* [gpt-oss] flashinfer mxfp4 (vllm-project#22339)

Signed-off-by: simon-mo <xmo@berkeley.edu>
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
Co-authored-by: simon-mo <xmo@berkeley.edu>

* fix worker and update readme

* address comments

* precommit

* address comments

* pre-commit

* Update tt_metal/README.md

Co-authored-by: Salar Hosseini <159165450+skhorasganiTT@users.noreply.github.com>

* fix llama cmd

* fix mesh device tuple parsing

* revert rounding

---------

Signed-off-by: simon-mo <xmo@berkeley.edu>
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
Co-authored-by: Yongye Zhu <zyy1102000@gmail.com>
Co-authored-by: simon-mo <xmo@berkeley.edu>
Co-authored-by: Pratikkumar Prajapati <pprajapati@tenstorrent.com>
Co-authored-by: Salar Hosseini <159165450+skhorasganiTT@users.noreply.github.com>
