upgrade trl==0.19.1 #2892
Walkthrough
This update introduces configuration enhancements for vLLM integration, including new fields for data parallelism and mode selection. Workflow testing is expanded to cover additional CUDA, Python, and PyTorch versions. The GRPO trainer's dataloader logic is refactored for consistency, and dependency versions are updated.
Sequence Diagram(s)
sequenceDiagram
participant User
participant CLI
participant Config
participant VllmServe
participant AxolotlScriptArguments
User->>CLI: Start vLLM serve command
CLI->>Config: Load configuration
Config-->>CLI: Return config (may include data_parallel_size)
CLI->>VllmServe: Call do_vllm_serve(config, cli_args)
VllmServe->>AxolotlScriptArguments: Instantiate with data_parallel_size
AxolotlScriptArguments-->>VllmServe: Arguments ready
VllmServe->>CLI: Start vLLM server with arguments
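The flow in the diagram can be sketched in Python as follows. This is a minimal illustration, not the code from this PR: `VllmConfig`, `ScriptArguments`, and `build_script_arguments` are hypothetical stand-ins, and the precedence rule (a CLI value wins over the config value, with a fallback of 1) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VllmConfig:
    """Hypothetical stand-in for the vLLM section of the loaded config."""
    tensor_parallel_size: Optional[int] = None
    data_parallel_size: Optional[int] = None  # may be absent from the config file


@dataclass
class ScriptArguments:
    """Hypothetical stand-in for AxolotlScriptArguments."""
    tensor_parallel_size: int
    data_parallel_size: int


def build_script_arguments(config: VllmConfig, cli_args: dict) -> ScriptArguments:
    """Merge CLI arguments and config values (assumed precedence: CLI first)."""

    def pick(name: str) -> int:
        cli_value = cli_args.get(name)
        if cli_value is not None:
            return cli_value
        config_value = getattr(config, name)
        # Assumed default of 1 when neither source sets the value.
        return config_value if config_value is not None else 1

    return ScriptArguments(
        tensor_parallel_size=pick("tensor_parallel_size"),
        data_parallel_size=pick("data_parallel_size"),
    )


# Config supplies data_parallel_size, the CLI supplies tensor_parallel_size.
args = build_script_arguments(
    VllmConfig(data_parallel_size=2), {"tensor_parallel_size": 4}
)
```

Keeping the merge in one small function makes the precedence rule easy to test independently of starting a server.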
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
requirements.txt (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: PyTest (3.11, 2.7.0)
- GitHub Check: test-axolotl-multigpu (126, 12.6.3, 3.11, 2.6.0, 2, true)
- GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
- GitHub Check: pre-commit
- GitHub Check: test-axolotl-multigpu (126, 12.6.3, 3.11, 2.7.1, 2, true)
- GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
- GitHub Check: PyTest (3.11, 2.6.0)
- GitHub Check: PyTest (3.11, 2.7.1)
- GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
- GitHub Check: pre-commit
  datasets==3.6.0
  deepspeed>=0.17.0
- trl==0.18.2
+ trl==0.19.1
💡 Verification agent
🧩 Analysis chain
Upgrade acknowledged – please double-check runtime compatibility with TRL 0.19.1
trl 0.19.x introduced several breaking-change PRs (e.g. the new AutoModelForCausalLMWithValueHead split, renamed callbacks, and stricter accelerate/transformers minimums). While the pinned transformers==4.53.1 and accelerate==1.8.1 versions in this repo satisfy the new lower-bounds, any in-tree training scripts that relied on the older APIs may now error at runtime.
Run the quick scan below to surface the most common API symbols that changed between 0.18 → 0.19 and make sure they’re not used:
Expected result: no matches.
If matches appear, those call-sites must be migrated to the new APIs before merging to avoid runtime failures.
🏁 Script executed:
#!/usr/bin/env bash
# Look for TRL identifiers that were removed/renamed in 0.19
rg -n --type py -e 'AutoModelForSequenceClassificationWithValueHead' \
-e 'create_reference_model' \
-e 'CONFIG_NAME.json' \
-e 'RewardTrainer' \
-e 'RewardConfig' \
-e 'ReinforceTrainer' \
-e 'PPOTrainer.*reference_model' \
-e 'CausalLMWithValueHead' \
-e 'DPOTrainer.*ref_model'
Length of output: 1288
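For environments without `rg`, roughly the same scan can be sketched in Python. The patterns are copied from the script above; whether each identifier was actually removed or renamed in TRL 0.19 is the reviewer's claim and is not verified here. The function name `scan_for_identifiers` is hypothetical.

```python
import re
from pathlib import Path

# Patterns copied verbatim from the ripgrep invocation in the review comment.
PATTERNS = [
    r"AutoModelForSequenceClassificationWithValueHead",
    r"create_reference_model",
    r"CONFIG_NAME.json",
    r"RewardTrainer",
    r"RewardConfig",
    r"ReinforceTrainer",
    r"PPOTrainer.*reference_model",
    r"CausalLMWithValueHead",
    r"DPOTrainer.*ref_model",
]


def scan_for_identifiers(root, patterns=PATTERNS):
    """Return (path, line number, stripped line) for every matching line in *.py files."""
    combined = re.compile("|".join(f"(?:{p})" for p in patterns))
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if combined.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling `scan_for_identifiers("src")` from the repository root would list candidate call-sites. A match only shows that a name is used, not that the name is broken under the new version.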
Breaking Change: Update In-Tree TRL API Usage for 0.19.1
Quick scan detected legacy TRL identifiers that were removed or renamed in 0.19.0–0.19.1. These call-sites must be migrated before merging to avoid runtime errors:
• src/axolotl/core/training_args.py
– Line 11: from trl import …, RewardConfig
– Line 55: class AxolotlRewardConfig(…, RewardConfig):
• src/axolotl/core/builders/causal.py
– Multiple references to AxolotlRewardTrainer and AxolotlRewardConfig (wrappers over TRL’s old APIs)
• src/axolotl/core/trainers/trl.py
– Line 11: import RewardTrainer
– Lines 107–111: class AxolotlRewardTrainer(…, RewardTrainer)
• src/axolotl/core/trainers/__init__.py
– Line 16: exports AxolotlRewardTrainer
Please replace these with the new 0.19.x equivalents (e.g. use the split AutoModelForCausalLMWithValueHead, updated config classes, and renamed trainer/callback APIs). Double-check all training scripts against the 0.19.1 changelog and run end-to-end tests to confirm compatibility.
🤖 Prompt for AI Agents
In requirements.txt at line 21, the TRL package is updated to version 0.19.1,
which introduces breaking changes. You need to update all legacy TRL API usages
in the codebase accordingly: replace imports of RewardConfig and RewardTrainer
with the new 0.19.x equivalents, update classes like AxolotlRewardConfig and
AxolotlRewardTrainer to use the new split AutoModelForCausalLMWithValueHead and
updated config classes, and revise all references in
src/axolotl/core/training_args.py, src/axolotl/core/builders/causal.py,
src/axolotl/core/trainers/trl.py, and src/axolotl/core/trainers/__init__.py to
match the new API. Finally, verify all training scripts against the 0.19.1
changelog and run end-to-end tests to ensure compatibility.
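A text match does not show that a symbol is gone from the installed package. One way to check directly is to import the module and test each name, sketched below. The helper `missing_symbols` is hypothetical, and it makes no assumption about which names TRL 0.19.1 actually exports.

```python
import importlib


def missing_symbols(module_name, symbols):
    """Return the names from `symbols` that cannot be resolved on the module."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module not installed: every requested symbol is unavailable.
        return list(symbols)
    missing = []
    for name in symbols:
        try:
            getattr(module, name)
        except Exception:
            # Lazily loaded packages may raise errors other than AttributeError.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # An empty list means both names still resolve in the installed version.
    print(missing_symbols("trl", ["RewardTrainer", "RewardConfig"]))
```

Running this in the environment built from the updated `requirements.txt` would confirm or refute the flagged imports before any migration work starts.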
Actionable comments posted: 1
📒 Files selected for processing (3)
setup.py (1 hunks)
src/axolotl/core/trainers/grpo/__init__.py (1 hunks)
src/axolotl/utils/schemas/trl.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: PyTest (3.11, 2.6.0)
- GitHub Check: PyTest (3.11, 2.7.1)
- GitHub Check: PyTest (3.11, 2.7.0)
- GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
- GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
- GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
- GitHub Check: pre-commit
- GitHub Check: test-axolotl-multigpu (126, 12.6.3, 3.11, 2.6.0, 2, true)
- GitHub Check: test-axolotl-multigpu (126, 12.6.3, 3.11, 2.7.1, 2, true)
- GitHub Check: test-axolotl-multigpu (126, 12.6.3, 3.11, 2.7.0, vllm, 2, true)
- GitHub Check: pre-commit
- GitHub Check: preview
🔇 Additional comments (2)
setup.py (1)
73-73: vLLM 0.9.2 is available and compatible with PyTorch 2.7.0
vLLM 0.9.2 is the latest release on PyPI and its requires_dist pins torch==2.7.0, confirming compatibility.
- File: setup.py
- Line: 73
- Change:
  - extras_require_map["vllm"] = ["vllm>=0.9.0"]
  + extras_require_map["vllm"] = ["vllm>=0.9.2"]
src/axolotl/core/trainers/grpo/__init__.py (1)
47-47: LGTM - Configuration propagation is correct.
The addition of vllm_mode to the training arguments is consistent with the existing pattern and correctly placed within the use_vllm conditional block. The implementation properly propagates the configuration value to the training arguments.
Note: The effectiveness of this change depends on proper validation of the vllm_mode field in the schema definition (as mentioned in the previous file review).
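The validation concern in the note can be illustrated with a small sketch. It is not the schema from this PR: `TrlSettings` and `grpo_training_kwargs` are hypothetical names, the allowed values `"server"` and `"colocate"` are an assumption based on TRL's GRPO options, and skipping an unset value is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed set of valid modes; check the TRL version in use before relying on it.
ALLOWED_VLLM_MODES = ("server", "colocate")


@dataclass
class TrlSettings:
    """Hypothetical stand-in for the TRL section of the config schema."""
    use_vllm: bool = False
    vllm_mode: Optional[str] = None

    def __post_init__(self):
        # Reject unknown modes at config-load time rather than at trainer start-up.
        if self.vllm_mode is not None and self.vllm_mode not in ALLOWED_VLLM_MODES:
            raise ValueError(
                f"vllm_mode must be one of {ALLOWED_VLLM_MODES}, got {self.vllm_mode!r}"
            )


def grpo_training_kwargs(trl: TrlSettings) -> dict:
    """Propagate vllm_mode only inside the use_vllm branch, as the comment describes."""
    kwargs = {"use_vllm": trl.use_vllm}
    if trl.use_vllm and trl.vllm_mode is not None:
        kwargs["vllm_mode"] = trl.vllm_mode
    return kwargs
```

Validating in the schema means a typo in the config fails immediately with a clear message instead of surfacing later inside the trainer.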
📖 Documentation Preview: https://6871fb18fc77479c11e7e564--resonant-treacle-0fd729.netlify.app
Deployed on Netlify from commit c963dad
Summary by CodeRabbit
New Features
Bug Fixes
Chores
huggingface_hub and trl.
Tests