#2322) Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com> Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
/ok to test 0cd4f86
📝 Walkthrough

Three example and configuration files are modified to disable the mtp_num_layers parameter in sub-model configurations and to adjust pipeline/expert parallelism settings for GLM-4.5V inference. The changes add post-initialization steps that clear mtp_num_layers and update the tensor parallelism distribution parameters.
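The post-initialization step described in the walkthrough can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the classes `SubModelConfig` and `VLModelConfig`, their fields other than `mtp_num_layers`, and the choice of `None` as the "cleared" value are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SubModelConfig:
    """Hypothetical stand-in for a sub-model (e.g. language or vision) config."""
    num_layers: int = 4
    # Assumed default; the real configs may set this differently.
    mtp_num_layers: Optional[int] = 1


@dataclass
class VLModelConfig:
    """Hypothetical parent config that owns two sub-model configs."""
    language: SubModelConfig = field(default_factory=SubModelConfig)
    vision: SubModelConfig = field(default_factory=SubModelConfig)

    def __post_init__(self) -> None:
        # Clear mtp_num_layers on every sub-model after construction,
        # so the sub-models are built without that parameter set.
        # Using None as the cleared value is an assumption of this sketch.
        for sub in (self.language, self.vision):
            sub.mtp_num_layers = None


cfg = VLModelConfig()
print(cfg.language.mtp_num_layers, cfg.vision.mtp_num_layers)
```

Doing the reset in `__post_init__` means callers cannot forget it: any config constructed through the parent class has the parameter cleared, regardless of what the sub-model defaults are.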
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~5 minutes
🚥 Pre-merge checks: ✅ 3 passed | ❌ 1 failed (1 warning)
No actionable comments were generated in the recent review.
beep boop [🤖]: Hi @yaoyu-33 👋,