[fix] Example scripts miscellaneous enhancement #2362
Signed-off-by: Chen Cui <chcui@nvidia.com>
📝 Walkthrough
Updates documentation for multi-token prediction training with new import paths and configuration structure; moves MTP config keys under …

Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~15 minutes

Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (inconclusive)
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@examples/models/vlm/ministral3/conversion.sh`:
- Around lines 23-35: The script uses bare `python` for three invocations (the `convert_checkpoints.py import` and `convert_checkpoints.py export` commands and the `python -m torch.distributed.run ...` call), which deviates from the repo guideline to use `uv run`. Either change those three invocations to `uv run python ...` or, if there is a legitimate reason to call `python` directly (e.g., specific interpreter or virtualenv requirements for the distributed launch), add a brief inline comment above each command explaining the intentional deviation so it is not auto-corrected later.
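A minimal sketch of the `uv run` form the comment asks for, assuming the three invocations look roughly like the following (the trailing `...` arguments are illustrative placeholders, not copied from the actual script):

```shell
# Before: bare interpreter calls
#   python convert_checkpoints.py import ...
#   python convert_checkpoints.py export ...
#   python -m torch.distributed.run ...

# After: routed through uv so the project's locked environment is used
uv run python convert_checkpoints.py import ...
uv run python convert_checkpoints.py export ...
uv run python -m torch.distributed.run ...
```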
In `@examples/models/vlm/ministral3/sft.sh`:
- Line 48: Replace the direct `torchrun` call in the launch command with the project's required `uv run` form (e.g., `uv run python -m torch.distributed.run` or `uv run torchrun`), updating the command that currently invokes `torchrun` with `scripts/training/run_recipe.py`. If there is a known incompatibility between `uv run` and transformers v5, instead keep the existing `torchrun` but add a clear inline comment immediately above that command explaining the exact incompatibility and why `uv run` cannot be used, so future contributors won't revert it.
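If the incompatibility path is taken, the inline comment could be sketched like this (the reason text is a placeholder to be filled in; `NUM_GPUS` and the trailing `...` arguments are illustrative, not from the actual script):

```shell
# NOTE: intentionally NOT using `uv run` here.
# Reason: <describe the exact `uv run` / transformers v5 incompatibility>
torchrun --nproc-per-node="${NUM_GPUS:-8}" scripts/training/run_recipe.py ...

# Preferred form per repo guidelines, once the incompatibility is resolved:
# uv run torchrun --nproc-per-node="${NUM_GPUS:-8}" scripts/training/run_recipe.py ...
```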
🧹 Nitpick comments (3)
examples/models/vlm/glm_45v/slurm_sft.sh (1)

47-49: Minor duplication with existing WANDB guidance at line 96. The new comment block duplicates the `WANDB_API_KEY` guidance already present in the "Authentication tokens" section (line 96). Not a blocker for an example script, but consider consolidating to avoid drift between the two locations.

examples/models/vlm/glm_45v/slurm_peft.sh (1)

47-49: Minor duplication with existing WANDB guidance at line 96. Same as `slurm_sft.sh`: the auth tokens section already documents `WANDB_API_KEY`. Consider consolidating.

examples/models/vlm/ministral3/inference.sh (1)

19-20: Nit: Consider specifying the minimum version. The comment says "requires transformers version 5" but the pip command would install whatever latest is available. Consider making it more explicit:

```diff
 # Note: Ministral 3 requires transformers version 5
-# pip install --upgrade transformers
+# pip install "transformers>=5"
```
What does this PR do?
Use `uv run` in ministral example scripts.

Changelog
GitHub Actions CI
See the CI section in the Contributing doc for how to trigger the CI. An NVIDIA developer will need to approve and trigger the CI for external contributors.
Before your PR is "Ready for review"
Pre checks:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Additional Information
Summary by CodeRabbit
Documentation
Chores