[Docs] Add multi-thread weight loading documentation by SamitHuang · Pull Request #2445 · vllm-project/vllm-omni

SamitHuang · 2026-04-02T08:33:32Z

Purpose

Add documentation for the multi-thread weight loading startup optimization introduced in PR #1504. This feature loads safetensors shards in parallel using a thread pool, reducing model startup time by roughly 5-6x for large diffusion models in the H800 benchmarks listed below.
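For readers unfamiliar with the technique, here is a minimal sketch of loading shards in parallel with a thread pool. It is not the implementation from PR #1504: `load_shard` is a hypothetical stand-in that reads raw bytes where a real loader would parse a safetensors file, and `load_shards_parallel` and `demo` are illustrative names. The only detail taken from this PR is the default of 4 threads.

```python
from __future__ import annotations

import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_shard(path: Path) -> dict[str, bytes]:
    """Hypothetical stand-in for a real shard reader (e.g. a safetensors loader)."""
    # A real loader would return many named tensors; here each file is one entry.
    return {path.stem: path.read_bytes()}


def load_shards_parallel(paths: list[Path], num_threads: int = 4) -> dict[str, bytes]:
    """Read every shard on a thread pool and merge the results into one dict."""
    state: dict[str, bytes] = {}
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Executor.map yields results in input order, so the merge is deterministic.
        for shard in pool.map(load_shard, paths):
            state.update(shard)
    return state


def demo() -> dict[str, bytes]:
    """Write eight small fake shards to a temp directory and load them in parallel."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(8):
            path = Path(tmp) / f"shard_{i}"
            path.write_bytes(bytes([i]) * 16)
            paths.append(path)
        return load_shards_parallel(paths, num_threads=4)


if __name__ == "__main__":
    print(sorted(demo()))
```

Threads suit this workload because reading shards is dominated by file I/O, which releases the GIL. How much time it saves in practice depends on the storage and the model.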

Updates docs/user_guide/diffusion_features.md to include:

- A "Startup Optimization" entry in the feature overview table under Lossless Acceleration
- A dedicated "Multi-Thread Weight Loading" section with:
  - Feature description and default behavior (enabled by default, 4 threads)
  - Configuration table with CLI flags (--disable-multithread-weight-load, --num-weight-load-threads) and Python parameters
  - Online serving and offline inference usage examples
  - Benchmark results (Qwen-Image: 168s → 27s, Wan2.2 I2V 14B: 283s → 56s on H800)
- A link in the "Learn More" section
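As a quick sanity check on the "5-6x" claim, the speedups implied by the benchmark figures quoted above can be computed directly. The timings come from this description; the `benchmarks` dict and the `speedup` helper are illustrative names only.

```python
# Startup times in seconds (before, after) as quoted in the PR description, on H800.
benchmarks = {
    "Qwen-Image": (168.0, 27.0),
    "Wan2.2 I2V 14B": (283.0, 56.0),
}


def speedup(before_s: float, after_s: float) -> float:
    """Ratio of the old startup time to the new startup time."""
    return before_s / after_s


speedups = {
    name: round(speedup(before, after), 1)
    for name, (before, after) in benchmarks.items()
}

for name, factor in speedups.items():
    print(f"{name}: {factor}x")
# prints:
# Qwen-Image: 6.2x
# Wan2.2 I2V 14B: 5.1x
```

Both figures fall in the stated 5-6x range.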

Test Plan

Documentation-only change. Verified markdown rendering and internal anchor links are correct.

Test Result

N/A (docs only).

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan. Please provide the test scripts & test commands. Please state the reasons if your codes don't require additional test scripts. For test file guidelines, please check the test style doc
The test results. Please paste the results comparison before and after, or the e2e results.
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs.
(Optional) Release notes update. If your change is user-facing, please update the release notes draft.

Document the multi-thread weight loading startup optimization introduced in PR vllm-project#1504, including configuration, CLI flags, usage examples, and benchmark results.

Made-with: Cursor
Signed-off-by: samithuang <285365963@qq.com>

gcanlin

LGTM

SamitHuang requested a review from Gaohan123 April 2, 2026 08:34

gcanlin approved these changes Apr 3, 2026

SamitHuang added the ready label to trigger buildkite CI label Apr 3, 2026

SamitHuang merged commit a7bf405 into vllm-project:main Apr 9, 2026
6 checks passed

vraiti pushed a commit to vraiti/vllm-omni that referenced this pull request Apr 9, 2026

[Docs] Add multi-thread weight loading documentation (vllm-project#2445)

f9fb024

Signed-off-by: samithuang <285365963@qq.com>

Sy0307 pushed a commit to Sy0307/vllm-omni that referenced this pull request Apr 10, 2026

[Docs] Add multi-thread weight loading documentation (vllm-project#2445)

676558d

Signed-off-by: samithuang <285365963@qq.com>

daixinning pushed a commit to daixinning/vllm-omni that referenced this pull request Apr 13, 2026

[Docs] Add multi-thread weight loading documentation (vllm-project#2445)

555f8a7

Signed-off-by: samithuang <285365963@qq.com>

lengrongfu pushed a commit to lengrongfu/vllm-omni that referenced this pull request May 1, 2026

[Docs] Add multi-thread weight loading documentation (vllm-project#2445)

0022370

Signed-off-by: samithuang <285365963@qq.com>

clodaghwalsh17 pushed a commit to clodaghwalsh17/nm-vllm-omni-ent that referenced this pull request May 12, 2026

[Docs] Add multi-thread weight loading documentation (vllm-project#2445)

3c7b175

Signed-off-by: samithuang <285365963@qq.com>

SamitHuang merged 1 commit into vllm-project:main from SamitHuang:docs/multithread-weight-loading

2 participants
