[Model][Rebase] Add GLM-Image Model and Partial Rebase to v0.14.0 (Support AR Offline) #763
Merged: Gaohan123 merged 25 commits into vllm-project:dev/rebase_0.14.0 from tzhouam:dev/rebase-0.14.0 on Jan 14, 2026
Commits
3059e27 init and registry (JaredforReal)
c0a7684 implement glm_image_transformer.py (JaredforReal)
800cea4 update transformer (JaredforReal)
8664695 init pipeline_glm_image.py (JaredforReal)
b88b4b2 init pipeline_glm_image.py (JaredforReal)
b9108f4 remove pre process (JaredforReal)
371afd5 add check_input(), implement CFG parallel in diffuse(), align generat… (JaredforReal)
3d4f5f2 fix check_input(prompt_embed), add KVCache for Image Edit (JaredforReal)
0810dae print out vllm version
8e36c51 update model config (tzhouam)
7f704d5 update worker (tzhouam)
4afb2ff update one import in AsyncOmniLLM (not finish all, but can run) (tzhouam)
cb2e053 update Qwen3 Omni ViT init based on updated interface (the update for… (tzhouam)
e052c4a Remove unnecessary override for OmniRequestState (the update for Omni… (tzhouam)
c08dcdd update model runner dummy run (tzhouam)
166fc78 update ar scheduler (tzhouam)
4db8f0b update _preprocess, execute model and sample_tokens for AR Model Runner (tzhouam)
63a69a5 debug AR Scheduler (tzhouam)
5bcdb43 update OmniGPUModelRunner._update_states (tzhouam)
2a0f72f update the offline LLM request sorting due to changed requested id fo… (tzhouam)
f7c8af9 update Qwen3 Omni to fit with the engine core logic (tzhouam)
f12e0af Merge PR #724 (tzhouam)
e2462d2 update generation model runner (tzhouam)
d89e3c4 debug GLM-Image Model (tzhouam)
f269e0e remove deleted args from doc string (tzhouam)
With only `thinker_sampling_params` in `sampling_params_list`, the default Qwen3-Omni Instruct pipeline (three stages in `vllm_omni/model_executor/stage_configs/qwen3_omni_moe.yaml`) will raise a `ValueError`, because `Omni._run_generation` requires `len(sampling_params_list) == len(self.stage_list)` (`vllm_omni/entrypoints/omni.py`). Running the example with the default stage config therefore fails before any generation occurs; it works only if users manually supply a single-stage config (e.g., thinking-only), which isn't the default for this model.
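To make the failure mode concrete, here is a minimal standalone sketch of the length check described above. It does not import or reproduce vllm-omni code: `StageSamplingParams`, `check_sampling_params`, and the stage names `"talker"` and `"code2wav"` are hypothetical stand-ins used only for illustration, and the real `Omni._run_generation` may word its error differently.

```python
from dataclasses import dataclass


@dataclass
class StageSamplingParams:
    """Hypothetical stand-in for a per-stage sampling-params object."""
    temperature: float = 1.0
    max_tokens: int = 16


def check_sampling_params(sampling_params_list, stage_list):
    """Mirror the invariant from the review comment: one params entry per stage."""
    if len(sampling_params_list) != len(stage_list):
        raise ValueError(
            f"expected {len(stage_list)} sampling params "
            f"(one per stage), got {len(sampling_params_list)}"
        )


# Assumed three-stage layout; only "thinker" is named in the comment above.
stage_list = ["thinker", "talker", "code2wav"]
thinker_sampling_params = StageSamplingParams(temperature=0.7, max_tokens=128)

# Passing only the thinker params against three stages trips the check.
try:
    check_sampling_params([thinker_sampling_params], stage_list)
    failed = False
except ValueError:
    failed = True

# Supplying one entry per stage satisfies it.
full_list = [thinker_sampling_params, StageSamplingParams(), StageSamplingParams()]
check_sampling_params(full_list, stage_list)
```

Under these assumptions, the example either has to pass one params object per configured stage or be run against a single-stage config, which matches the two options the comment points out.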