drop old patches and code that are no longer needed#3007
Conversation
📝 WalkthroughWalkthroughThis change set removes all code, configuration options, documentation, and test logic related to the Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~15–20 minutes Suggested labels
Suggested reviewers
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/axolotl/utils/schemas/validation.py (2)
594-595: Update error message to reflect single module checkSince the validation now only checks for
flash_attn_fuse_mlp, consider updating the error message to be more specific:- raise ValueError("Fused modules are not supported with LoRA/QLoRA") + raise ValueError("Fused MLP module is not supported with LoRA/QLoRA")
1183-1184: Update error message for consistencySimilar to the previous comment, update this error message to be more specific:
- raise ValueError("Fused modules are not supported with ReLoRA") + raise ValueError("Fused MLP module is not supported with ReLoRA")
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
.runpod/README.md(0 hunks).runpod/src/config/config.yaml(0 hunks)examples/archived/stablelm-2/1.6b/fft.yml(0 hunks)examples/llama-2/fft_optimized.yml(0 hunks)examples/llama-2/lisa.yml(0 hunks)src/axolotl/loaders/patch_manager.py(2 hunks)src/axolotl/monkeypatch/llama_attn_hijack_flash.py(1 hunks)src/axolotl/monkeypatch/mistral_attn_hijack_flash.py(0 hunks)src/axolotl/utils/schemas/config.py(0 hunks)src/axolotl/utils/schemas/validation.py(2 hunks)tests/e2e/patched/test_fused_llama.py(0 hunks)
💤 Files with no reviewable changes (8)
- examples/archived/stablelm-2/1.6b/fft.yml
- examples/llama-2/lisa.yml
- examples/llama-2/fft_optimized.yml
- .runpod/README.md
- tests/e2e/patched/test_fused_llama.py
- .runpod/src/config/config.yaml
- src/axolotl/utils/schemas/config.py
- src/axolotl/monkeypatch/mistral_attn_hijack_flash.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
- GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
- GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
- GitHub Check: pre-commit
- GitHub Check: PyTest (3.11, 2.7.1)
- GitHub Check: PyTest (3.11, 2.6.0)
- GitHub Check: PyTest (3.11, 2.7.0)
- GitHub Check: pre-commit
- GitHub Check: preview
🔇 Additional comments (3)
src/axolotl/loaders/patch_manager.py (2)
333-350: LGTM!The simplification of
_patch_llama_flash_attentionmethod correctly removes the packed parameter and related logic while preserving the shifted-sparse attention and standard flash attention paths.
403-405: Fix reference to removed configuration optionThe condition on line 403 still references
self.cfg.flash_attn_fuse_qkvwhich has been removed from the configuration schema. This should be updated to only checkflash_attn_fuse_mlp:- if self.cfg.flash_attn_fuse_mlp and is_xformers_swiglu_available(): + if self.cfg.flash_attn_fuse_mlp and is_xformers_swiglu_available():Wait, I notice the condition already checks
flash_attn_fuse_mlp. However, based on the AI summary mentioning that "code that conditionally patched with fused QKV attention was removed entirely", this entire conditional block for MLP fusion should have been preserved. Let me verify if this is the intended behavior.src/axolotl/monkeypatch/llama_attn_hijack_flash.py (1)
118-138: Clean simplification of flash attention patchingThe removal of packed sequence handling and fused QKV support has resulted in a much cleaner implementation that focuses on the core flash attention functionality with optional shifted-sparse attention, cross entropy, and RMS norm optimizations.
|
📖 Documentation Preview: https://688e286b3a184849f54b0791--resonant-treacle-0fd729.netlify.app Deployed on Netlify from commit 2cd6874 |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
NanoCode012
left a comment
There was a problem hiding this comment.
Changes look fine. CI fail from train loss being high which is not related to this.
Description
Motivation and Context
How has this been tested?
Screenshots (if appropriate)
Types of changes
Social Handles (Optional)
Summary by CodeRabbit