[Fix] Bridge flat CLI parallel args into DiffusionParallelConfig before YAML stage-config merge #2264
Conversation
…L/model config path Signed-off-by: Zhou Zhuoxin <zhouzhuoxin1508@outlook.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b5348fa63d
```python
hsdp_replicate_size = kwargs.get("hsdp_replicate_size", 1)
if sequence_parallel_size is None:
    sequence_parallel_size = ulysses_degree * ring_degree
kwargs["parallel_config"] = DiffusionParallelConfig(
```
Apply CLI parallel overrides after YAML merge
Setting `kwargs["parallel_config"]` here does not actually let flat CLI flags override stage YAML defaults, because `load_stage_configs_from_yaml` later merges `base_engine_args` with `stage_arg.engine_args` in that order (see `vllm_omni/entrypoints/utils.py` lines 292-295), so the YAML `engine_args.parallel_config` wins for keys like `tensor_parallel_size` and `enable_expert_parallel`. In the default diffusion configs (e.g. `hunyuan_image_3_moe.yaml`), this means `--tensor-parallel-size`/`--enable-expert-parallel` are still ignored despite this new bridge.
**lishunyang12** left a comment
This alone doesn't fix the CLI override problem for models that ship a YAML with `parallel_config` in `engine_args` (e.g. `hunyuan_image_3_moe.yaml`). The YAML stage `engine_args` still wins in `merge_configs(base, stage)`. Left a couple comments.
```python
stage_configs_path = kwargs.get("stage_configs_path", None)
explicit_stage_configs = kwargs.pop("stage_configs", None)
```
This does a network round-trip (`get_hf_file_to_dict`) on every call just to guard the parallel-config bridging. `_resolve_stage_configs` already resolves the model type downstream — can you reuse that or at least cache the result?
```python
hsdp_replicate_size = kwargs.get("hsdp_replicate_size", 1)
if sequence_parallel_size is None:
    sequence_parallel_size = ulysses_degree * ring_degree
kwargs["parallel_config"] = DiffusionParallelConfig(
```
As the bot noted: `merge_configs(base_engine_args, stage_arg.engine_args)` gives the YAML the last word, so this `parallel_config` is overwritten for any model that ships a stage YAML with `parallel_config` (all current HunyuanImage/Bagel configs). Without #2076 merged first, this PR doesn't actually fix the CLI path. Please either land #2076 first or reverse the merge order here.
Yes, this PR is based on #2076. I'll rebase once it's merged.
Signed-off-by: zhou zhuoxin <zhouzhuoxin1508@outlook.com>
This looks incorrect. `kwargs["parallel_config"] = DiffusionParallelConfig(...)` does not affect the resolved diffusion stage config for the current initialization.
Signed-off-by: zhou zhuoxin <zhouzhuoxin1508@outlook.com>
Force-pushed from c9f10f3 to bb64b0c.
This has been addressed. The code now writes directly to `cfg.engine_args.parallel_config` on each resolved diffusion stage, instead of mutating `kwargs` after consumption. Could you take another look? Thanks! @xiaohajiayou
Problem
When launching a diffusion model server via the CLI, the kwargs entering the engine are flat top-level keys:
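For illustration, the engine sees something like the following (key names are taken from this PR's discussion; the full set of flags is an assumption):

```python
# Illustrative only: flat top-level kwargs as produced by the CLI parser.
# Key names come from the PR discussion; the real set may differ.
kwargs = {
    "tensor_parallel_size": 4,
    "enable_expert_parallel": True,
    "ulysses_degree": 2,
    "ring_degree": 1,
}
```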
However, bundled stage YAMLs for multi-stage diffusion models (e.g. hunyuan_image_3_moe.yaml) store these under a nested
parallel_config block:
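A sketch of that nesting (hypothetical fragment inferred from the description above, not copied from the actual `hunyuan_image_3_moe.yaml`):

```
# Hypothetical YAML fragment illustrating the nesting, not the real file
engine_args:
  parallel_config:
    tensor_parallel_size: 1
    enable_expert_parallel: false
```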
`OmegaConf.merge` only overrides matching key paths, so a flat `enable_expert_parallel` can never reach `parallel_config.enable_expert_parallel`, regardless of merge order. Both keys coexist independently after the merge, and the diffusion worker reads only from `parallel_config`, so the CLI values are silently ignored (see #2076).
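The described merge behavior can be reproduced with a plain-dict stand-in for the recursive merge (a hypothetical helper, not vLLM-Omni's actual `merge_configs`): the flat CLI key and the nested YAML key end up coexisting, and the nested one is untouched.

```python
def merge_configs(base: dict, override: dict) -> dict:
    """Recursive dict merge, mimicking OmegaConf.merge semantics:
    the override wins only on matching key paths. Hypothetical stand-in."""
    out = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge_configs(out[key], value)
        else:
            out[key] = value
    return out

yaml_cfg = {"engine_args": {"parallel_config": {"tensor_parallel_size": 1}}}
cli_cfg = {"tensor_parallel_size": 4}  # flat CLI key

merged = merge_configs(yaml_cfg, cli_cfg)
# Both keys coexist independently; the nested value is untouched.
print(merged["engine_args"]["parallel_config"]["tensor_parallel_size"])  # 1
print(merged["tensor_parallel_size"])  # 4
```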
The offline example scripts avoid this because they explicitly construct `DiffusionParallelConfig` and pass it as `parallel_config` before calling `Omni()`. The CLI/server path was missing this bridging step.
Affects all models with such configs, including HunyuanImage and Bagel.
Fix
Assemble `DiffusionParallelConfig` from flat kwargs before the YAML merge in `_resolve_stage_configs`.
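A minimal sketch of that bridging step, assuming field names from the diff snippets in this thread (the real `DiffusionParallelConfig` signature, defaults, and helper name are assumptions):

```python
from dataclasses import dataclass


@dataclass
class DiffusionParallelConfig:  # stand-in for vllm_omni's class
    tensor_parallel_size: int = 1
    enable_expert_parallel: bool = False
    sequence_parallel_size: int = 1


def bridge_parallel_kwargs(kwargs: dict) -> DiffusionParallelConfig:
    """Assemble the nested config from flat CLI kwargs (hypothetical helper)."""
    ulysses_degree = kwargs.get("ulysses_degree", 1)
    ring_degree = kwargs.get("ring_degree", 1)
    sequence_parallel_size = kwargs.get("sequence_parallel_size")
    if sequence_parallel_size is None:
        sequence_parallel_size = ulysses_degree * ring_degree
    return DiffusionParallelConfig(
        tensor_parallel_size=kwargs.get("tensor_parallel_size", 1),
        enable_expert_parallel=kwargs.get("enable_expert_parallel", False),
        sequence_parallel_size=sequence_parallel_size,
    )


cfg = bridge_parallel_kwargs(
    {"tensor_parallel_size": 4, "ulysses_degree": 2, "ring_degree": 2}
)
```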
Essential Elements of an Effective PR Description Checklist
- [ ] The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
- [ ] The test plan. Please provide the test scripts & test commands. Please state the reasons if your codes don't require additional test scripts. For test file guidelines, please check the [test style doc](https://docs.vllm.ai/projects/vllm-omni/en/latest/contributing/ci/tests_style/)
- [ ] The test results. Please paste the results comparison before and after, or the e2e results.
- [ ] (Optional) The necessary documentation update, such as updating `supported_models.md` and `examples` for a new model. **Please run `mkdocs serve` to sync the documentation editions to `./docs`.**
- [ ] (Optional) Release notes update. If your change is user-facing, please update the release notes draft.