[BugFix] config priority fix #2289
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d03ed7022f
stage_args:
  - stage_id: 0
Keep diffusion stage in default Hunyuan config
This change removes the stage_id: 1 diffusion stage from the default hunyuan_image_3_moe config, so launching without diffusion_only now resolves to an AR-only pipeline and breaks image generation flows that previously relied on default config resolution. That regresses both offline usage (tests/e2e/offline_inference/test_expert_parallel.py expects Omni(model="tencent/HunyuanImage-3.0") to return images) and serving paths (vllm_omni/entrypoints/openai/api_server.py rejects pipelines with no diffusion stage for /v1/images/*). The diffusion stage should remain in the default config, with diffusion_only controlling config-construction priority rather than deleting the stage.
Signed-off-by: dengyunyang <584797741@qq.com>
fix by #2076
Background

Currently, vllm-omni constructs stage configurations during startup according to the following priority rules:
- Diffusion Model
- Omni Model
- Default config directories: vllm_omni/model_executor/stage_configs/ or vllm_omni/platforms/xxx/stage_configs/

Problem
HunyuanImage-3.0 supports two execution modes: a DIT-only (diffusion) mode and a multi-stage AR + DIT mode.
However, during startup the system always loads the stage configuration from vllm_omni/model_executor/stage_configs/. As a result, it always initializes the AR stage (or AR + DIT in the future), even when only the DIT stage is required.
PR #1826 attempts to address this by placing both AR and DIT configurations in hunyuan_image_3_moe.yaml and dynamically selecting the relevant configuration based on the task type (text-to-image, image-to-text, etc.). However, this approach introduces a new issue:
When starting the Hunyuan DIT, a YAML configuration must always be specified; otherwise the default config from hunyuan_image_3_moe.yaml is used, which assumes 8 GPUs with tensor parallel size 8. CLI kwargs (e.g. --tensor-parallel-size) are ignored entirely.
This problem is discussed in #2282.
Purpose
For DIT-only models (e.g., HunyuanImage DIT), the startup behavior should be consistent with other diffusion models. Users should be able to launch the model directly with CLI arguments, for example:
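As a concrete illustration of that launch style, here is a toy parser modeled with argparse. The flag names follow this PR's description; the real vllm-omni entry point is not reproduced here.

```python
import argparse

# Toy parser mirroring the intended launch style; --diffusion-only is the
# flag this PR introduces, everything else is illustrative.
parser = argparse.ArgumentParser()
parser.add_argument("model")
parser.add_argument("--diffusion-only", action="store_true")
parser.add_argument("--tensor-parallel-size", type=int, default=1)

# With --diffusion-only, these CLI kwargs should drive the DIT config
# directly instead of being overridden by a default YAML.
args = parser.parse_args(
    ["tencent/HunyuanImage-3.0", "--diffusion-only", "--tensor-parallel-size", "2"]
)
```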
For multi-stage models (e.g., HunyuanImage AR + DIT), the system should still load stage configurations from the default directories: vllm_omni/model_executor/stage_configs/ or vllm_omni/platforms/xxx/stage_configs/.

Change
Introduce a new parameter: --diffusion-only. When it is set, the DIT-only pipeline is built directly from CLI arguments; otherwise stage configurations are still resolved from vllm_omni/model_executor/stage_configs/ or vllm_omni/platforms/xxx/stage_configs/.

cc @fake0fan @xuechendi @yinpeiqi
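The proposed selection logic can be sketched as follows (function and field names are illustrative, not the actual vllm-omni API):

```python
def build_stage_config(diffusion_only: bool, cli_kwargs: dict,
                       default_stage_config: dict) -> dict:
    """Sketch of the config-construction priority this PR proposes."""
    if diffusion_only:
        # DIT-only: behave like any other diffusion model and honor CLI
        # kwargs directly, instead of forcing the default YAML.
        return {"stage_args": [{"stage_id": 0, **cli_kwargs}]}
    # Multi-stage (AR + DIT): keep resolving from the default stage-config
    # directories, unchanged from today's behavior.
    return default_stage_config
```

For example, `build_stage_config(True, {"tensor_parallel_size": 2}, default_cfg)` would yield a single-stage config honoring the CLI tensor parallel size, while `diffusion_only=False` leaves the default multi-stage config untouched.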
Test Plan
Test Result