
[model] fix: correct GLM-4.5V inference parallelism for 46-layer model#2322

Merged
yaoyu-33 merged 2 commits into main from yuya/fix-glm45v-inference-parallelism
Feb 11, 2026
Conversation

@yaoyu-33 (Contributor) commented Feb 11, 2026

What

  • Update GLM-4.5V example inference parallelism from PP=4/EP=2 to PP=2/EP=4 across all commands.
  • Keep 8-GPU total while making the pipeline split valid for 46 layers.
  • Add temporary workaround in VLM and text generation conversion scripts to avoid inference failure.

Why

GLM-4.5V has 46 layers, which is not divisible by the previous pipeline-parallel size of 4 and causes an assertion failure during inference.
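The constraint above can be sketched as a small check. This is a minimal illustration, not the framework's actual code: `valid_splits` is a hypothetical helper that models the assertion Megatron-style pipelines apply when layers must divide evenly across pipeline stages, using the layer count and GPU total stated in this PR.

```python
NUM_LAYERS = 46   # GLM-4.5V transformer layers (from this PR)
WORLD_SIZE = 8    # total GPUs in the example commands

def valid_splits(num_layers: int, world_size: int) -> list[tuple[int, int]]:
    """Return (pp, ep) pairs where pp * ep == world_size and the
    layer count divides evenly across the pipeline stages."""
    return [
        (pp, world_size // pp)
        for pp in range(1, world_size + 1)
        if world_size % pp == 0 and num_layers % pp == 0
    ]

print(valid_splits(NUM_LAYERS, WORLD_SIZE))  # → [(1, 8), (2, 4)]
```

PP=4 is absent from the result because 46 % 4 == 2, which is exactly the assertion failure this PR avoids; PP=2 with EP=4 keeps the 8-GPU total while splitting 46 layers into two stages of 23.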

Validation

  • Pre-commit hooks (all skipped, no files to check):
    fix end of files.....................................(no files to check)Skipped
    trim trailing whitespace.............................(no files to check)Skipped
    ruff.................................................(no files to check)Skipped
    ruff-format..........................................(no files to check)Skipped
    Disallow '_' in Markdown filenames...................(no files to check)Skipped
  • Branch push successful

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Fixed model parallel layer configuration issue in Megatron model inference pipelines to prevent errors during CUDA operations when using HuggingFace-converted models.
  • Chores

    • Updated parallelism distribution settings in example inference scripts for improved resource allocation: pipeline parallelism reduced from 4 to 2, expert parallelism increased from 2 to 4.

…ound

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
@copy-pr-bot (bot) commented Feb 11, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@yaoyu-33 (Contributor, Author) commented

/ok to test ccb2456

@yaoyu-33 yaoyu-33 changed the title from "fix: correct GLM-4.5V inference parallelism for 46-layer model" to "[model] fix: correct GLM-4.5V inference parallelism for 46-layer model" Feb 11, 2026
@yaoyu-33 yaoyu-33 added the r0.3.0 Cherry-pick label for r0.3.0 release branch label Feb 11, 2026
@yaoyu-33 yaoyu-33 enabled auto-merge (squash) February 11, 2026 02:38

Labels

r0.3.0 Cherry-pick label for r0.3.0 release branch

2 participants