[build] chore: Upgrade transformers to 5.0 #2068
Conversation
Signed-off-by: root <root@pool0-00120.cm.cluster>
/ok to test 8ccb17a
📝 Walkthrough
Updated project dependencies.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 3 passed | ❌ 1 failed (warning)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@pyproject.toml`:
- Around line 117-121: Update the version constraint for
transformer-engine[pytorch] to allow the 2.11 release to match the source
override: locate the dependency constraint for transformer-engine[pytorch] (the
existing "<2.10.0" constraint) and change it to an appropriate 2.11-compatible
range (for example ">=2.11,<2.12") so it aligns with the [tool.uv.sources]
transformer-engine entry that points to rev "release_v2.11"; alternatively, if
you intended to stay on <2.10.0, change the [tool.uv.sources] transformer-engine
rev to a matching release (e.g., a 2.9/2.10 tag) so the constraint and the
source override are consistent.
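For reference, a hypothetical aligned pyproject.toml fragment (version bounds illustrative) in which the dependency constraint and the [tool.uv.sources] override agree:

```toml
[project]
dependencies = [
    # Matches the release_v2.11 source override below.
    "transformer-engine[pytorch]>=2.11.0,<2.12.0",
]

[tool.uv.sources]
transformer-engine = { git = "https://github.com/NVIDIA/TransformerEngine.git", rev = "release_v2.11" }
```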
🧹 Nitpick comments (1)
pyproject.toml (1)
68-81: Add an upper bound for the transformers major version. Line 69 allows future majors (e.g., 6.0+), which could introduce breaking changes outside this PR's scope. Transformers v5.0.0 (released January 26, 2026) contains significant breaking changes, including a PyTorch-only focus, tokenizer refactoring, and new weight-loading APIs. Pin to the v5.x line to allow patch and minor updates while preventing major-version surprises.
🔧 Proposed adjustment
- "transformers>=5.0.0",
+ "transformers>=5.0.0,<6.0.0",
pyproject.toml (Outdated)
  # uv.sources allows us to override dependencies with VCS commits.
  # Lets use this only for debugging purposes, but not for production (main).
  [tool.uv.sources]
- transformer-engine = { git = "https://github.com/NVIDIA/TransformerEngine.git", rev = "6a34b6574fa6c29d9d07fdcddf9812cbb1488878" }
+ transformer-engine = { git = "https://github.com/NVIDIA/TransformerEngine.git", rev = "release_v2.11" }
  megatron-core = { path = "3rdparty/Megatron-LM/" }
🧩 Analysis chain
🏁 Script executed:
# Find and examine pyproject.toml to verify the version constraints
find . -name "pyproject.toml" -type f | head -5

Repository: NVIDIA-NeMo/Megatron-Bridge
Length of output: 87
🏁 Script executed:
# Read the relevant lines from pyproject.toml (lines 110-125 to get full context)
sed -n '110,125p' pyproject.toml

Repository: NVIDIA-NeMo/Megatron-Bridge
Length of output: 959
🌐 Web query:
TransformerEngine release_v2.11 git tag version number package version
💡 Result:
The Git tag release_v2.11 corresponds to Transformer Engine package version 2.11.0 (i.e., “2.11” release line with patch .0). [1], [2]
Resolve the transformer-engine version mismatch between the dependency constraint and the source override.
Line 114 constrains transformer-engine[pytorch] to <2.10.0, but line 120 specifies a source override pointing to release_v2.11 (which corresponds to version 2.11.0). This creates a constraint violation that will cause resolution failures or force silent downgrades. Align the constraint with the intended 2.11 release.
Example alignment for 2.11.x
- "transformer-engine[pytorch]>=2.9.0a0,<2.10.0",
+ "transformer-engine[pytorch]>=2.11.0,<2.12.0",
Signed-off-by: root <root@pool0-00120.cm.cluster>
/ok to test 022fcef
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
/ok to test 81ad475
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
…ron-Hub into chore/transformers_5p0
Use rope_theta_from_hf compat function in hf_config_to_provider_kwargs as a fallback when CONFIG_MAPPING cannot find rope_theta as a direct attribute (transformers 5.0+ stores it in a rope_parameters dict). Fix mock configs and test assertions accordingly.

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
/ok to test 437a2ad
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
…icmethod

Remove the staticmethod indirection on MegatronModelBridge for rope_theta_from_hf, rope_local_base_freq_from_hf, and rope_scaling_factor_from_hf. All call sites now import and call the functions directly from transformers_compat. Also remove unused get_common_configs from deepseek/common.py.

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
/ok to test b5f6253
Fix config.json generation in the GPT-OSS conversion test to use model.config.to_dict() instead of raw overrides, and update various functional tests for transformers 5.0 API changes.

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
/ok to test 2e6bbe8
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

# Conflicts:
#   src/megatron/bridge/models/deepseek/common.py
/ok to test 5b9cea4
…sformers 5.0+

In transformers 5.0+, Qwen2_5_VLConfig serializes with a nested structure where text model params (hidden_size, num_attention_heads) are under text_config rather than at the top level of config.json.

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
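A sketch of a reader that copes with both layouts (text_config_value is a hypothetical helper name; values illustrative):

```python
def text_config_value(config_dict, key, default=None):
    """Look up a text-model hyperparameter in an HF VL config dict.

    transformers 5.0+ serializes Qwen2.5-VL text params under a nested
    "text_config" key; earlier versions keep them at the top level.
    """
    text_cfg = config_dict.get("text_config")
    if isinstance(text_cfg, dict) and key in text_cfg:
        return text_cfg[key]
    # Legacy flat layout, or key absent from the nested dict.
    return config_dict.get(key, default)
```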
/ok to test 05852e1
Transformers 5.0 renames rope_scaling to rope_parameters and uses rope_type instead of type. Update the Qwen3 VL bridge and all related tests to prefer rope_parameters when available, falling back to rope_scaling for backward compatibility.

Also fixes: add model_type to the LlamaNemotron test config, use a glob pattern for NemotronVL weight files, and reduce deepstack_visual_indexes to fit within PP-split layer counts.

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
/ok to test ae12a45
…L MoE bridge

Transformers <5.0 stored fused expert weights transposed as [num_experts, hidden_size, 2*intermediate_size], while transformers 5.0+ uses the standard nn.Linear convention [num_experts, 2*intermediate_size, hidden_size]. Use _align_weight_to_shape (the same pattern as the GLM MoE bridge) to auto-detect the layout and transpose only when necessary.

Signed-off-by: Yuya Morimoto <ymorimoto@nvidia.com>
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
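The layout detection can be illustrated with a shape-only sketch (the real _align_weight_to_shape operates on torch tensors and returns the aligned weight; needs_transpose is a hypothetical name used here for illustration):

```python
def needs_transpose(weight_shape, target_shape):
    """Decide whether a fused expert weight must be transposed.

    transformers < 5.0 stored fused expert weights as
    [num_experts, hidden_size, 2*intermediate_size]; 5.0+ uses the
    nn.Linear convention [num_experts, 2*intermediate_size, hidden_size].
    Auto-detect by comparing shapes, as _align_weight_to_shape does.
    """
    if tuple(weight_shape) == tuple(target_shape):
        return False  # already in the target layout
    experts, a, b = weight_shape
    if (experts, b, a) == tuple(target_shape):
        return True  # last two dims swapped: legacy layout
    raise ValueError(f"cannot align {tuple(weight_shape)} to {tuple(target_shape)}")
```

Note that when hidden_size equals 2*intermediate_size the two layouts are indistinguishable by shape alone, which is why this check is only a fallback during the migration.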
/ok to test 432173f
Remove a brittle rope_scaling assertion in the llama_nemotron test and use a glob pattern for the safetensors filename in the nemotron_vl test to handle changes in serialization shard naming.

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

# Conflicts:
#   uv.lock
Signed-off-by: root <root@pool0-01847.cm.cluster>
/ok to test c3c8be7
    return {}


def _align_weight_to_shape(weight: torch.Tensor, target_shape: torch.Size, name: str) -> torch.Tensor:
Q: the same func is also defined in src/megatron/bridge/models/glm/glm_moe_mappings.py.
Do we expect each model to have the same func in the bridge?
Also the same for a few other funcs like _uses_fused_experts.
It's only added for specific models. I don't think it's going to be used after we fully migrate to 5.0; we might just keep one path.
# Conflicts:
#   uv.lock
/ok to test 5388a3c
Upgrade transformers dependency to version 5.0