[ConfigRefactor] GLM-Image #2977
Conversation
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Are there any docs that you need to update as well?
Pull request overview
Refactors GLM-Image configuration to the new “frozen pipeline topology + deploy YAML” split introduced by the config refactor work, and updates offline example entrypoints to reference the new deploy config location.
Changes:
- Removed legacy `stage_configs/glm_image*.yaml` configs and introduced `vllm_omni/deploy/glm_image.yaml`.
- Added a frozen GLM-Image `PipelineConfig` (`model_executor/models/glm_image/pipeline.py`) and registered it in `pipeline_registry.py` (a registration sketch follows this list).
- Updated offline inference examples to use the new deploy YAML path by default.
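To make the "registered it in `pipeline_registry.py`" step concrete, here is a minimal sketch of lazy pipeline registration. The registry layout, the `get_pipeline_config` helper, and the `GLM_IMAGE_PIPELINE` attribute name are assumptions for illustration; only the module path comes from this PR.

```python
# Hypothetical sketch of lazy registration; the real vllm_omni/config/pipeline_registry.py
# may use a different structure and different names.
from importlib import import_module

# Map a pipeline key to "module:attribute" so the pipeline module is imported
# only when that pipeline is actually requested.
_PIPELINE_REGISTRY: dict[str, str] = {
    "glm_image": "vllm_omni.model_executor.models.glm_image.pipeline:GLM_IMAGE_PIPELINE",
}

def get_pipeline_config(name: str):
    """Import and return the frozen pipeline config registered under `name`."""
    module_path, attr = _PIPELINE_REGISTRY[name].split(":")
    return getattr(import_module(module_path), attr)
```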
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| vllm_omni/model_executor/stage_configs/glm_image_muilticonnector.yaml | Removes legacy MultiConnector stage config YAML. |
| vllm_omni/model_executor/stage_configs/glm_image.yaml | Removes legacy GLM-Image stage config YAML. |
| vllm_omni/model_executor/models/glm_image/pipeline.py | Adds frozen two-stage GLM-Image pipeline topology. |
| vllm_omni/deploy/glm_image.yaml | Adds deploy YAML for GLM-Image stages (resources + sampling defaults). |
| vllm_omni/config/pipeline_registry.py | Registers the new glm_image pipeline for lazy loading. |
| examples/offline_inference/glm_image/run_t2i.sh | Points default config to vllm_omni/deploy/glm_image.yaml. |
| examples/offline_inference/glm_image/run_i2i.sh | Points default config to vllm_omni/deploy/glm_image.yaml. |
| examples/offline_inference/glm_image/end2end.py | Updates default config path fallback to the deploy YAML. |
| examples/offline_inference/glm_image/README.md | Updates config-path examples (but still has one lingering legacy path). |
Review comment on the diffusion stage entry in `pipeline.py`:

```python
final_output_type="image",
model_arch="GlmImagePipeline",
custom_process_input_func="vllm_omni.model_executor.stage_input_processors.glm_image.ar2diffusion",
omni_kv_config={"need_recv_cache": False},
```
`omni_kv_config` is set on this `StagePipelineConfig`, but it is currently never propagated into stage `engine_args` by `merge_pipeline_deploy` (and `StagePipelineConfig.omni_kv_config` is otherwise unused). Either move this into the deploy YAML as `omni_kv_config` (stage engine extra) or update the merge logic to carry it into `yaml_engine_args`; otherwise this setting has no effect.
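A minimal sketch of the second option, assuming `merge_pipeline_deploy` builds a per-stage `yaml_engine_args` dict; the helper name and signature below are hypothetical.

```python
# Hypothetical sketch of propagating a stage's omni_kv_config into the merged
# engine args; the real merge_pipeline_deploy in vllm-omni may differ.
def merge_stage_engine_args(stage_pipeline_cfg, deploy_engine_args: dict) -> dict:
    # Deployment knobs from the deploy YAML keep their precedence.
    yaml_engine_args = dict(deploy_engine_args)
    # Carry the frozen-topology omni_kv_config unless the deploy YAML overrides it,
    # so settings like {"need_recv_cache": False} actually reach the stage engine.
    omni_kv_config = getattr(stage_pipeline_cfg, "omni_kv_config", None)
    if omni_kv_config is not None:
        yaml_engine_args.setdefault("omni_kv_config", omni_kv_config)
    return yaml_engine_args
```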
#2072: Make sure that in the 5-level use cases, configs can take effect on this model, especially stage-config overrides.
Test results have been shared offline. Waiting for CI to go green.
PTAL @xiaohajiayou. Can you help check whether this PR has the override precedence issue you mentioned?
Please resolve the conflicts.
@gcanlin please take care of the pipeline YAMLs for different hardware.
@gcanlin Okay, tomorrow I will test on GLM-Image.
One suggestion: please remove the glm-image folder under examples and update the GLM-Image recipe later.
@hsliuustc0106 Examples removed.
Please remember to update the recipe.
Migrate GLM-Image to the new declarative config system (`PipelineConfig` + `DeployConfig`), fixing a broken two-stage pipeline where only the diffusion stage was loaded.
Problems with the legacy setup:

- GLM-Image ships `model_index.json` at the repo root but no `config.json`. `_auto_detect_model_type()` only checked for `config.json`, so it returned `None` and the system fell back to single-stage diffusion (see the sketch after this list).
- `async_chunk` defaulted to True: the legacy deploy YAML didn't set `async_chunk`, and `merge_pipeline_deploy` would raise `ValueError` since no GLM-Image stage declares async-chunk processors.
- The legacy stage configs used the `engine_args:` / `runtime:` / `stage_type:` format, mixing in topology fields that now belong in `PipelineConfig`.
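The sketch below illustrates the detection gap and the direction of the fix described under Changes: a detector that falls back to `model_index.json` and matches its `_class_name` against registered pipelines. It is an illustration only; the function name mirrors the description above, but the signature, return values, and registry shape are assumptions.

```python
# Hypothetical sketch of model-type detection that also handles diffusers-style
# repos (model_index.json at the root, no config.json); the real
# _auto_detect_model_type() in vllm-omni differs in names and return values.
from __future__ import annotations

import json
from pathlib import Path

def auto_detect_model_type(model_path: str, registered_pipelines: dict) -> str | None:
    root = Path(model_path)
    if (root / "config.json").exists():
        # Plain HF-style checkpoint: keep whatever the existing detection did.
        return "transformers"
    index = root / "model_index.json"
    if index.exists():
        class_name = json.loads(index.read_text()).get("_class_name")
        # Match the diffusers class against pipelines declaring diffusers_class_name.
        for name, pipeline in registered_pipelines.items():
            if getattr(pipeline, "diffusers_class_name", None) == class_name:
                return name  # e.g. "glm_image"
    return None  # unknown layout: caller decides how to fall back
```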
Changes:

- `PipelineConfig.diffusers_class_name`: new field that lets pipelines declare their diffusers class name. The model type detector now checks `model_index.json` and matches against registered pipelines, eliminating the need for a separate `_DIFFUSERS_CLASS_TO_CONFIG` mapping table.
- `StagePipelineConfig.model_subdir` / `tokenizer_subdir`: moved from deploy YAML to pipeline topology. These are structural properties (the AR config lives in `vision_language_encoder/`), not deployment knobs. Injected into engine_args by `_build_engine_args`.
- `deploy/glm_image.yaml`: rewritten to the new flat format with `async_chunk: false`, containing only deployment knobs (GPU placement, memory, sampling params). All topology fields removed.
- `pipeline.py`: added `diffusers_class_name`, `model_subdir`, `tokenizer_subdir`, `requires_multimodal_data`, and `model_arch` on the diffusion stage (a topology sketch follows this list).
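For orientation, a self-contained sketch of a frozen two-stage topology using these fields follows. The `StagePipelineConfig` / `PipelineConfig` stand-ins and the AR-stage values are assumptions; the diffusion-stage field names and values are taken from this PR's diff.

```python
# Stand-in dataclasses so the sketch is self-contained; the real classes in
# vllm_omni have more fields and different defaults.
from __future__ import annotations

from dataclasses import dataclass

@dataclass(frozen=True)
class StagePipelineConfig:
    stage_type: str
    model_arch: str | None = None
    model_subdir: str | None = None
    tokenizer_subdir: str | None = None
    final_output_type: str | None = None
    custom_process_input_func: str | None = None
    requires_multimodal_data: bool = False
    omni_kv_config: dict | None = None

@dataclass(frozen=True)
class PipelineConfig:
    diffusers_class_name: str
    stages: tuple[StagePipelineConfig, ...] = ()

# Two-stage GLM-Image topology: AR vision-language encoder feeding a diffusion stage.
# AR-stage values are illustrative guesses; diffusion-stage fields mirror the PR diff.
GLM_IMAGE_PIPELINE = PipelineConfig(
    diffusers_class_name="GlmImagePipeline",
    stages=(
        StagePipelineConfig(
            stage_type="ar",
            model_subdir="vision_language_encoder",
            tokenizer_subdir="vision_language_encoder",
        ),
        StagePipelineConfig(
            stage_type="diffusion",
            model_arch="GlmImagePipeline",
            final_output_type="image",
            requires_multimodal_data=True,
            custom_process_input_func=(
                "vllm_omni.model_executor.stage_input_processors.glm_image.ar2diffusion"
            ),
            omni_kv_config={"need_recv_cache": False},
        ),
    ),
)
```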
Test plan:

- `vllm serve zai-org/GLM-Image --port 8000 --host 0.0.0.0`: verify both AR and diffusion stages initialize (check logs for model loading messages from both stages).
- Try override with `--stage-overrides '{"0": {"gpu_memory_utilization": 0.65}}'`.
- `python examples/offline_inference/glm_image/end2end.py --model-path <path-to-GLM-Image> --prompt "A cat sitting on the table" --output cat.png --height 1024 --width 1024 --num-inference-steps 50 --enable-diffusion-pipeline-profiler`
Essential Elements of an Effective PR Description Checklist

- Update `supported_models.md` and `examples` for a new model. Please run `mkdocs serve` to sync the documentation editions to `./docs`.