add support for MammothModa2 model #336

Merged
hsliuustc0106 merged 135 commits into vllm-project:main from HonestDeng:add-mammoth-moda2-support
Mar 4, 2026

@HonestDeng
Contributor

@HonestDeng HonestDeng commented Dec 16, 2025

Purpose

Resolves #314: adds support for the MammothModa2 model (https://github.com/bytedance/mammothmoda).

Test Plan

Machine:

  • H200(140GB) x 1

Parallel:

  • TP: None

Image:

  • Size: 1024 x 1024
  • DiT Step: 50
  1. Image Summary

Machine:

  • H200(140GB) x 1

Parallel:

  • TP: None

Image:

  • Size: 1024 x 1024

Test Result

The image on the left was generated by the official MammothModa2 implementation; the image on the right was generated by vllm-omni:
(image)

This table compares the performance of the two implementations:

| Stage | official-impl | vllm-omni |
|-----------|---------|--------|
| AR stage | 83.529 s | 74.06 s |
| DiT stage | 10.320 s | 9.65 s |

Transfer time: 4.012 ms

vllm-omni is faster than the official implementation in both the AR stage and the DiT stage.
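
The size of the gain can be worked out directly from the stage timings reported above. This is only a small arithmetic sketch: the helper name `percent_reduction` is hypothetical, and the numbers are copied from the results as posted.

```python
def percent_reduction(baseline_s: float, new_s: float) -> float:
    """Percentage by which `new_s` is lower than `baseline_s`."""
    return (baseline_s - new_s) / baseline_s * 100.0


# Stage timings in seconds, copied from the results above:
# (official implementation, vllm-omni).
timings = {
    "AR stage": (83.529, 74.06),
    "DiT stage": (10.320, 9.65),
}

reductions = {
    stage: percent_reduction(official, omni)
    for stage, (official, omni) in timings.items()
}

for stage, pct in reductions.items():
    print(f"{stage}: {pct:.1f}% less time in vllm-omni")
```

On these numbers the AR stage takes roughly 11% less time and the DiT stage roughly 6.5% less time. These are single-run figures from one H200, so they are best read as indicative.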


Signed-off-by: HonestDeng <2958906959@qq.com>
For simplicity, most of the DiT-stage code is copied from https://github.com/bytedance/mammothmoda.
This code will be simplified and reviewed once the pipeline runs
successfully.

Signed-off-by: HonestDeng <2958906959@qq.com>
because the preview version of MammothModa2 only uses the last hidden state

Signed-off-by: HonestDeng <2958906959@qq.com>
@hsliuustc0106
Collaborator

Hi, will the model be ready before the 1230 release?

@HonestDeng
Contributor Author

HonestDeng commented Dec 20, 2025

Yes.

MammothModa2-Preview combines Qwen2.5-VL (with extra gen-experts in the MLP layers) with a DiT module for image generation. I have already implemented the Qwen2.5-VL part of MammothModa2-Preview by reusing vLLM code such as Qwen2Attention and Qwen2MLP, so it can now take text and images as input and generate text tokens.

I'm currently working on the DiT part. Hopefully I will finish it this weekend and review my code before 1230.

I'm not very familiar with adding support for new models. If there are any problems in my code, please correct me. Thanks!
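
One way to picture "extra gen-experts in MLP layers" is a per-token switch between two MLPs inside each decoder layer. The sketch below is an illustration only: routing by a generation-token mask is an assumption and is not confirmed anywhere in this thread, and `route_mlp`, `und_mlp`, `gen_mlp`, and `is_gen_token` are hypothetical names, not vLLM or MammothModa2 APIs. Plain Python lists stand in for tensors.

```python
from typing import Callable, List, Sequence

Vector = List[float]
MLP = Callable[[Vector], Vector]


def route_mlp(
    hidden_states: Sequence[Vector],
    is_gen_token: Sequence[bool],
    und_mlp: MLP,
    gen_mlp: MLP,
) -> List[Vector]:
    """Send each token through exactly one MLP, chosen by its mask entry.

    Assumption: generation tokens use the extra "gen" expert, while all
    other tokens use the original (understanding) MLP.
    """
    if len(hidden_states) != len(is_gen_token):
        raise ValueError("hidden_states and is_gen_token must have equal length")
    return [
        gen_mlp(h) if is_gen else und_mlp(h)
        for h, is_gen in zip(hidden_states, is_gen_token)
    ]


# Toy stand-ins for the two experts (real ones would be learned projections).
def und_mlp(h: Vector) -> Vector:
    return [2.0 * x for x in h]


def gen_mlp(h: Vector) -> Vector:
    return [x + 1.0 for x in h]


hidden = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mask = [False, True, False]  # only the middle token is a generation token
routed = route_mlp(hidden, mask, und_mlp, gen_mlp)
```

If the routing really works this way, the attention layers can be shared unchanged between both token types, which would be consistent with reusing Qwen2Attention as described above.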

Signed-off-by: HonestDeng <2958906959@qq.com>
@hsliuustc0106
Collaborator

The model seems quite similar in structure to Qwen-Image, with a Qwen-VL for encoding and a DiT module for image generation.

@princepride
Collaborator

python3 examples/offline_inference/mammothmodal2_preview/run_mammothmoda2_t2i.py \
  --model bytedance-research/MammothModa2-Preview \
  --stage-config ./vllm_omni/model_executor/stage_configs/mammoth_moda2.yaml \
  --prompt "A stylish woman riding a motorcycle in NYC, movie poster style" \
  --height 1024 \
  --width 1024 \
  --num-inference-steps 50 \
  --text-guidance-scale 4.0 \
  --out output.png
/proj-tango-pvc/users/zhipeng.wang/workspace/vllm-omni/vllm_omni/__init__.py:32: RuntimeWarning: Failed to import version from _version.py: No module named 'vllm_omni._version'
This typically happens in development mode before building.
Using fallback version 'dev'.
  from .version import __version__, __version_tuple__  # isort:skip
Traceback (most recent call last):
  File "/proj-tango-pvc/users/zhipeng.wang/workspace/vllm-omni/examples/offline_inference/mammothmodal2_preview/run_mammothmoda2_t2i.py", line 241, in <module>
    main()
  File "/proj-tango-pvc/users/zhipeng.wang/workspace/vllm-omni/examples/offline_inference/mammothmodal2_preview/run_mammothmoda2_t2i.py", line 186, in main
    gen_cfg = load_t2i_generation_config(args.model)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/proj-tango-pvc/users/zhipeng.wang/workspace/vllm-omni/examples/offline_inference/mammothmodal2_preview/run_mammothmoda2_t2i.py", line 57, in load_t2i_generation_config
    raise FileNotFoundError(f"Config not found: {gen_cfg_path}")
FileNotFoundError: Config not found: bytedance-research/MammothModa2-Preview/t2i_generation_config.json

I got this error when passing the Hugging Face model ID as --model, PTAL.
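
The traceback shows the model argument being joined with the config filename as if it were a local directory. A minimal sketch of a loader that accepts either a local directory or a Hub repo ID follows. It is not the fix that landed in this PR: `resolve_config_path`, `load_generation_config`, and the injectable `download` parameter are hypothetical names introduced for illustration; `hf_hub_download` is the `huggingface_hub` function for fetching a single file from a repo.

```python
import json
import os
from typing import Callable, Optional


def resolve_config_path(
    model: str,
    filename: str = "t2i_generation_config.json",
    download: Optional[Callable[[str, str], str]] = None,
) -> str:
    """Return a local path to `filename` for a local dir or a Hub repo ID."""
    # Case 1: `model` is a local directory that already contains the file.
    local_path = os.path.join(model, filename)
    if os.path.isfile(local_path):
        return local_path
    # A real directory without the file is a user error, not a repo ID.
    if os.path.isdir(model):
        raise FileNotFoundError(f"Config not found: {local_path}")
    # Case 2: treat `model` as a Hub repo ID and fetch just this one file.
    if download is None:
        # Imported lazily so that local-only use needs no extra dependency.
        from huggingface_hub import hf_hub_download

        return hf_hub_download(repo_id=model, filename=filename)
    return download(model, filename)


def load_generation_config(
    model: str,
    filename: str = "t2i_generation_config.json",
    download: Optional[Callable[[str, str], str]] = None,
) -> dict:
    """Load the JSON generation config from a local dir or a Hub repo ID."""
    path = resolve_config_path(model, filename, download)
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Keeping the downloader injectable lets the local-versus-Hub branch be exercised in tests without network access.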

Collaborator

@princepride princepride left a comment

PTAL

Comment thread tests/e2e/offline_inference/test_mammoth_moda2.py
Comment thread vllm_omni/diffusion/models/mammoth_moda2/mammothmoda2_dit_model.py Outdated
Comment thread vllm_omni/diffusion/models/mammoth_moda2/mammoth_moda2_dit.py Outdated
Comment thread vllm_omni/model_executor/models/mammoth_moda2/mammoth_moda2.py Outdated
Comment thread vllm_omni/model_executor/models/mammoth_moda2/mammoth_moda2.py Outdated
HonestDeng and others added 5 commits March 3, 2026 14:26
Signed-off-by: HonestDeng <2958906959@qq.com>
Signed-off-by: HonestDeng <2958906959@qq.com>
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Signed-off-by: HonestDeng <2958906959@qq.com>
@princepride
Collaborator

@HonestDeng pre-commit failed, PTAL

Signed-off-by: HonestDeng <2958906959@qq.com>
Signed-off-by: HonestDeng <2958906959@qq.com>
@HonestDeng
Contributor Author

Hugging Face model IDs are now supported, and pre-commit runs successfully.

@HonestDeng HonestDeng requested a review from princepride March 3, 2026 08:04
Comment thread vllm_omni/tokenizers/mammoth_moda2_tokenizer.py
Comment thread vllm_omni/worker/gpu_generation_model_runner.py
Comment thread vllm_omni/worker/gpu_model_runner.py
Signed-off-by: HonestDeng <2958906959@qq.com>
Signed-off-by: HonestDeng <2958906959@qq.com>
@HonestDeng HonestDeng force-pushed the add-mammoth-moda2-support branch from 38c123b to b92d12f Compare March 3, 2026 08:33
@HonestDeng HonestDeng requested a review from princepride March 3, 2026 08:34
Collaborator

@princepride princepride left a comment

Roughly LGTM, after testing, I will approve it.

Comment thread vllm_omni/diffusion/models/mammoth_moda2/schedulers.py Outdated
Signed-off-by: HonestDeng <2958906959@qq.com>
@HonestDeng
Contributor Author

Thanks

@hsliuustc0106 hsliuustc0106 added the ready label (label to trigger buildkite CI) Mar 4, 2026
@lishunyang12
Collaborator

Looks like good progress — active iteration with @hsliuustc0106 and @princepride. I'll defer to them on the remaining items.

@hsliuustc0106
Collaborator

Later, we can move to v2.5 and fix the remaining issues

Collaborator

@hsliuustc0106 hsliuustc0106 left a comment

lgtm

@hsliuustc0106 hsliuustc0106 merged commit 1612948 into vllm-project:main Mar 4, 2026
7 checks passed
@david6666666 david6666666 mentioned this pull request Mar 5, 2026
63 tasks
ahengljh pushed a commit to ahengljh/vllm-omni that referenced this pull request Mar 5, 2026
Signed-off-by: HonestDeng <2958906959@qq.com>
Signed-off-by: iwzbi <iwzbi@zju.edu.cn>
Signed-off-by: iwzbi <wzbi@zju.edu.cn>
Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com>
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Signed-off-by: yinpeiqi <yinpeiqi809@gmail.com>
Signed-off-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Co-authored-by: catcat <108673086+iwzbi@users.noreply.github.com>
Co-authored-by: iwzbi <iwzbi@zju.edu.cn>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Co-authored-by: dsinghvi <divyanshsinghvi@gmail.com>
Co-authored-by: Canlin Guo <canlinguosdu@gmail.com>
Co-authored-by: Peiqi Yin <60515999+yinpeiqi@users.noreply.github.com>
Co-authored-by: 汪志鹏 <wangzhipeng628@gmail.com>
clodaghwalsh17 pushed a commit to clodaghwalsh17/nm-vllm-omni-ent that referenced this pull request May 12, 2026
Development

Successfully merging this pull request may close these issues.

[New Model]: bytedance-research/MammothModa2-Preview