
[release] Add nightly wheel release index#2345

Merged
khluu merged 7 commits into main from khluu/release
Mar 31, 2026

khluu (Collaborator) commented Mar 30, 2026

Summary

Add a nightly wheel release pipeline for vllm-omni. Wheels are hosted on the existing vllm-wheels S3 bucket under the omni/ prefix, served via wheels.vllm.ai.

How it works

  1. Build — Installs uv with Python 3.12, runs python3 -m build to produce a pure Python wheel
  2. Upload — Uploads the wheel to s3://vllm-wheels/omni/<commit>/
  3. Index — Generates a PEP 503 compliant index and uploads it to:
    • s3://vllm-wheels/omni/<commit>/ (always)
    • s3://vllm-wheels/omni/nightly/ (when NIGHTLY=1 env var is set)
    • s3://vllm-wheels/omni/<version>/ (for non-dev release versions only)
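The destination rules in step 3 can be sketched as a small function. This is a hypothetical illustration, not the script added in this PR: the function name `index_destinations`, the sample commit and version strings, and the rule that a dev build is any version containing `.dev` are all assumptions.

```python
import os


def index_destinations(commit: str, version: str, nightly: bool) -> list[str]:
    """Return the S3 prefixes that should receive the generated index."""
    base = "s3://vllm-wheels/omni"
    # The per-commit index is always uploaded.
    destinations = [f"{base}/{commit}/"]
    # The nightly index is uploaded only when requested (NIGHTLY=1).
    if nightly:
        destinations.append(f"{base}/nightly/")
    # Assumption: dev builds carry a ".dev" segment in their version string.
    if ".dev" not in version:
        destinations.append(f"{base}/{version}/")
    return destinations


if __name__ == "__main__":
    nightly = os.environ.get("NIGHTLY") == "1"
    # Placeholder commit and version, for illustration only.
    for destination in index_destinations("abc1234", "0.1.0.dev42", nightly):
        print(destination)
```

Keeping the decision in a pure function separates it from the upload itself, which makes the three cases easy to check without touching S3.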

The nightly index at omni/nightly/ uses relative paths pointing back to the latest commit's wheel directory.
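One way to picture the relative-path scheme is the sketch below. It is an assumption-laden illustration rather than the generator from this PR: the helper names, the sample commit `abc1234`, and the wheel filename are hypothetical, and the HTML is reduced to the bare minimum of a PEP 503 project page.

```python
import posixpath
from html import escape
from urllib.parse import quote


def relative_wheel_href(index_dir: str, wheel_key: str) -> str:
    """Return wheel_key relative to the directory that holds the index page."""
    # Anchor both keys at "/" so the result does not depend on the working directory.
    return posixpath.relpath("/" + wheel_key, start="/" + index_dir)


def render_project_page(index_dir: str, wheel_keys: list[str]) -> str:
    """Render a minimal project page with one anchor per wheel."""
    links = []
    for key in wheel_keys:
        # Percent-encode the href so characters such as "+" survive in URLs.
        href = quote(relative_wheel_href(index_dir, key))
        filename = posixpath.basename(key)
        links.append(f'<a href="{href}">{escape(filename)}</a><br/>')
    return "<!DOCTYPE html>\n<html><body>\n" + "\n".join(links) + "\n</body></html>\n"


if __name__ == "__main__":
    page = render_project_page(
        "omni/nightly/vllm-omni",
        ["omni/abc1234/vllm_omni-0.1.0-py3-none-any.whl"],
    )
    print(page)
```

Because the links are relative, the nightly index only needs to be regenerated, not the wheel re-uploaded, when a newer commit becomes the latest.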

Installation

uv pip install vllm-omni --extra-index-url https://wheels.vllm.ai/omni/nightly --torch-backend=auto

Files added

| File | Description |
|---|---|
| .buildkite/nightly-release-pipeline.yaml | Two-step pipeline: build wheel, then generate index |
| .buildkite/scripts/upload-nightly-wheels.sh | Uploads wheel from dist/ to S3 under omni/<commit>/ |
| .buildkite/scripts/generate-and-upload-nightly-index.sh | Lists wheels from S3, generates index, uploads to nightly/commit/version paths |
| .buildkite/scripts/generate-nightly-index.py | PEP 503 index generator (simplified from vLLM's script) |

S3 layout

s3://vllm-wheels/
├── omni/
│   ├── <commit>/
│   │   ├── vllm_omni-*.whl
│   │   ├── index.html
│   │   └── vllm-omni/
│   │       ├── index.html
│   │       └── metadata.json
│   └── nightly/          # symlink-like copy of latest commit's index
│       ├── index.html
│       └── vllm-omni/
│           ├── index.html
│           └── metadata.json
├── nightly/              # existing vLLM nightly (unaffected)
└── ...
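The layout shows the wheel file named `vllm_omni-*.whl` while the project directory is `vllm-omni/`. That difference follows from PEP 503 name normalization, which the sketch below applies to derive the index keys for a prefix. The helper names and the sample wheel filename are hypothetical; only the normalization rule itself is taken from PEP 503.

```python
import re


def normalize_project_name(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def project_from_wheel(filename: str) -> str:
    """Extract the normalized project name from a wheel filename."""
    # The distribution name is the first dash-separated field of a wheel filename.
    distribution = filename.split("-", 1)[0]
    return normalize_project_name(distribution)


def index_keys(prefix: str, wheel_filename: str) -> list[str]:
    """List the index objects expected under a prefix, mirroring the tree above."""
    project = project_from_wheel(wheel_filename)
    return [
        f"{prefix}/index.html",
        f"{prefix}/{project}/index.html",
        f"{prefix}/{project}/metadata.json",
    ]


if __name__ == "__main__":
    for key in index_keys("omni/nightly", "vllm_omni-0.1.0-py3-none-any.whl"):
        print(key)
```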

🤖 Generated with Claude Code

khluu and others added 4 commits March 30, 2026 14:26
p
Signed-off-by: khluu <khluu000@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@khluu khluu requested a review from ywang96 March 30, 2026 22:24
@khluu khluu requested a review from hsliuustc0106 as a code owner March 30, 2026 22:24
khluu and others added 2 commits March 30, 2026 15:26
Remove variant/ROCm/alias handling -- just generate a flat PEP 503
index. Install uv with Python 3.12 for faster wheel builds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
khluu (Collaborator, Author) commented Mar 30, 2026

You can try it out with uv pip install vllm-omni --extra-index-url https://wheels.vllm.ai/omni/nightly --torch-backend=auto

chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9c66d3671e


Comment on lines +96 to +97
    if suffix.startswith(("rocm", "cu", "cpu")):
        variant = suffix

P2: Parse +npu/+xpu release suffixes as variants

Update parse_from_filename to recognize all release device suffixes the project can emit. The current check only treats rocm, cu, and cpu suffixes as variants, so release wheels like ...+npu or ...+xpu (produced by setup.py) are misclassified as the default variant. If multiple device wheels are uploaded under one commit/version directory, the generated default index can point at NPU/XPU artifacts instead of isolating them under variant subdirectories, which leads to incorrect wheel resolution from the generic index.
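A minimal sketch of what the suggested change could look like follows. It is not the PR's `parse_from_filename`, whose full body is not shown here: the function `variant_from_wheel`, the way the suffix is taken from the local version label after `+`, and the sample filenames are assumptions. A commit message in this PR also describes removing variant handling in favor of a flat index, so this may apply only to the reviewed revision.

```python
from typing import Optional

# "rocm", "cu" and "cpu" come from the reviewed lines; "npu" and "xpu" are the
# additions proposed in the review comment above.
VARIANT_PREFIXES = ("rocm", "cu", "cpu", "npu", "xpu")


def variant_from_wheel(filename: str) -> Optional[str]:
    """Return the device variant encoded in a wheel's local version, if any."""
    # Wheel filenames look like name-version[-build]-python-abi-platform.whl.
    fields = filename.removesuffix(".whl").split("-")
    if len(fields) < 5:
        raise ValueError(f"not a valid wheel filename: {filename!r}")
    version = fields[1]
    # Assumption: the variant suffix is the local version label after "+".
    _, plus, suffix = version.partition("+")
    if plus and suffix.startswith(VARIANT_PREFIXES):
        return suffix
    # No recognized suffix means the wheel belongs to the default variant.
    return None


if __name__ == "__main__":
    for name in (
        "vllm_omni-0.1.0-py3-none-any.whl",
        "vllm_omni-0.1.0+npu-py3-none-any.whl",
        "vllm_omni-0.1.0+cu129-py3-none-any.whl",
    ):
        print(name, "->", variant_from_wheel(name))
```

Keeping the prefixes in one tuple means a new device suffix needs a single-line change instead of another branch.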


khluu (Collaborator, Author) commented Mar 30, 2026

@khluu khluu enabled auto-merge (squash) March 31, 2026 07:57
@khluu khluu added the "ready" label (label to trigger buildkite CI) Mar 31, 2026
Gaohan123 (Collaborator) left a comment

LGTM. Let's check if it can solve the pipeline failure https://buildkite.com/vllm/omni-release/builds/21/steps/canvas

@khluu khluu merged commit f8d0bf5 into main Mar 31, 2026
6 of 7 checks passed
david6666666 (Collaborator) commented

Could you add documentation explaining how users can use this feature? Thank you very much.

vraiti pushed a commit to vraiti/vllm-omni that referenced this pull request Apr 9, 2026
Signed-off-by: khluu <khluu000@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
