feat: add lfm2 family and latest moe model #3208
📝 Walkthrough
Adds LiquidAI example docs and a LoRA training YAML for LFM2-8B; updates cut-cross-entropy install pins across docs, scripts, and a Colab notebook; extends supported model mappings for LFM2/LFM2-MoE (architectures and multipack); updates the cut-cross-entropy supported models list.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches
✅ Passed checks (3 passed)
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- examples/LiquidAI/README.md (3 hunks)
- examples/LiquidAI/lfm2-8b-a1b-lora.yaml (1 hunks)
- examples/colab-notebooks/colab-axolotl-example.ipynb (1 hunks)
- scripts/cutcrossentropy_install.py (1 hunks)
- src/axolotl/common/architectures.py (1 hunks)
- src/axolotl/integrations/cut_cross_entropy/README.md (2 hunks)
- src/axolotl/integrations/cut_cross_entropy/__init__.py (1 hunks)
- src/axolotl/monkeypatch/multipack.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: PyTest (3.11, 2.8.0)
- GitHub Check: PyTest (3.11, 2.7.1)
- GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
- GitHub Check: preview
- GitHub Check: PyTest from Source Dist (3.11, 2.8.0)
🔇 Additional comments (13)
examples/colab-notebooks/colab-axolotl-example.ipynb (1)
43-43: Pinned hash update looks good
Thanks for bumping the cut-cross-entropy commit to 49f3308 here; this keeps the Colab notebook aligned with the rest of the repo updates.
examples/LiquidAI/README.md (2)
9-9: LGTM!
Nice acknowledgment of the LiquidAI team's early access support.
36-42: Transformers commit 0c9a72e4… verified for LFM2-MoE support
The specified commit introduces all LFM2-MoE model code, configuration, documentation, and tests; the installation instructions are correct.
examples/LiquidAI/lfm2-8b-a1b-lora.yaml (4)
3-4: LGTM!
The CutCrossEntropyPlugin integration is correctly configured to reduce VRAM usage during training.
38-57: LGTM!
The training hyperparameters are well-configured:
- Reasonable learning rate (5e-5) with cosine scheduler
- Appropriate batch size and gradient accumulation
- Proper evaluation and checkpointing cadence
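To make the scheduler remark concrete, here is a minimal sketch of a cosine decay starting from the 5e-5 peak mentioned above. It assumes no warmup and a floor of zero; the function name `cosine_lr` and the `total_steps` value in the usage note are illustrative and are not taken from the YAML under review.

```python
import math


def cosine_lr(step: int, total_steps: int, peak_lr: float = 5e-5,
              min_lr: float = 0.0) -> float:
    """Cosine decay from peak_lr (step 0) down to min_lr (step == total_steps).

    Hypothetical helper for illustration only: no warmup phase is modelled.
    """
    # Clamp progress to [0, 1] so out-of-range steps stay at the endpoints.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

With `total_steps=1000` (an assumed value), the rate starts at 5e-5, passes through half of the peak at step 500, and reaches the floor at step 1000.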
30-30: Verify LoRA target modules regex
The pattern includes cross_attn (likely absent in this decoder-only model); confirm the actual layer names in LFM2-8B-A1B and adjust the regex accordingly.
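One way to act on this comment is to test the regex against the model's module names before training. The sketch below assumes the pattern is applied with a full match against each module name; the pattern and the module names shown are hypothetical stand-ins, not the values from lfm2-8b-a1b-lora.yaml or from the real model.

```python
import re


def matched_modules(pattern: str, module_names: list[str]) -> list[str]:
    """Return the module names that the regex matches in full."""
    regex = re.compile(pattern)
    return [name for name in module_names if regex.fullmatch(name)]


def unused_alternatives(alternatives: list[str], module_names: list[str]) -> list[str]:
    """Return the regex alternatives that appear in no module name at all."""
    return [alt for alt in alternatives
            if not any(alt in name for name in module_names)]


# Hypothetical pattern and module names, for illustration only.
pattern = r"model\.layers\.\d+\.(self_attn|cross_attn)\.(q|k|v)_proj"
names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.feed_forward.w1",
    "model.layers.1.self_attn.v_proj",
]
matches = matched_modules(pattern, names)
dead = unused_alternatives(["self_attn", "cross_attn"], names)
```

In practice the names would come from the loaded model, for example `[name for name, _ in model.named_modules()]` in PyTorch; an alternative reported in `dead` can likely be dropped from the pattern.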
1-1: Base model availability confirmed
LiquidAI/LFM2-8B-A1B is published on Hugging Face Hub (MoE model, ~8.3B parameters, 32,768-token context) and can be loaded via transformers.
src/axolotl/integrations/cut_cross_entropy/README.md (2)
57-63: LGTM!
The supported models list has been correctly updated to include the new LFM2 family (lfm2, lfm2_moe, lfm2_vl) and llava, which aligns with the PR objectives.
22-22: LGTM!
The installation command is updated with the new commit hash (49f3308), consistent with other files in this PR.
src/axolotl/integrations/cut_cross_entropy/__init__.py (1)
38-38: LGTM!
The installation message string is correctly updated with the new commit hash (49f3308), maintaining consistency across all installation references in the codebase.
src/axolotl/monkeypatch/multipack.py (2)
48-49: LGTM with verification needed!
Adding "lfm2" and "lfm2_moe" to the supported multipack model types is correct and enables sample packing for these models. However, this depends on proper architecture mappings being defined elsewhere (see verification request above).
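For readers unfamiliar with sample packing, the sketch below shows the general idea: several short samples are grouped into one sequence slot so that less of the context window is padding. It uses a plain first-fit-decreasing heuristic as a stand-in; it is not claimed to be the algorithm in src/axolotl/monkeypatch/multipack.py, and the function name is hypothetical.

```python
def pack_first_fit_decreasing(lengths: list[int], max_len: int) -> list[list[int]]:
    """Group sample indices into bins whose total token length fits in max_len.

    Illustrative heuristic only: longest samples are placed first, each into
    the first bin that still has room.
    """
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    bins: list[list[int]] = []   # sample indices per packed sequence
    free: list[int] = []         # remaining capacity per packed sequence
    for idx in order:
        length = lengths[idx]
        if length > max_len:
            raise ValueError(f"sample {idx} ({length} tokens) exceeds max_len={max_len}")
        for b, capacity in enumerate(free):
            if length <= capacity:
                bins[b].append(idx)
                free[b] -= length
                break
        else:
            # No existing bin has room, so open a new one.
            bins.append([idx])
            free.append(max_len - length)
    return bins
```

For sample lengths `[5, 3, 4, 2, 6]` and `max_len=8`, this groups five samples into three packed sequences instead of five padded ones.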
48-49: Confirm mapping for lfm2_moe
MOE_ARCH_BLOCK already defines "lfm2_moe": "Lfm2MoeSparseMoeBlock"; no further changes required.
scripts/cutcrossentropy_install.py (1)
32-32: Confirm LFM2 support in cut-cross-entropy update
Commit 49f3308 adds lfm2, lfm2_moe, and lfm2_vl transformer modules, verifying required LFM2 model support.
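Since the same commit pin is repeated in the notebook, the README, the install script, and the plugin's install message, a small check can confirm they agree. This is a sketch under assumptions: the regex expects pins of the form `.git@<hex>`, and the file contents and URL in the example are placeholders rather than the repository's real install commands.

```python
import re

# Matches a hex commit pin that directly follows ".git@" (assumed pin format).
PIN_RE = re.compile(r"\.git@([0-9a-f]{7,40})")


def find_pins(texts: dict[str, str]) -> dict[str, set[str]]:
    """Map each file name to the set of commit pins found in its text."""
    return {name: set(PIN_RE.findall(text)) for name, text in texts.items()}


def pins_consistent(texts: dict[str, str]) -> bool:
    """True when exactly one distinct pin appears across all texts."""
    all_pins = set().union(*find_pins(texts).values())
    return len(all_pins) == 1


# Placeholder contents; the URL is deliberately not a real repository.
example = {
    "README.md": 'pip install "pkg @ git+https://example.invalid/repo.git@49f3308"',
    "install.py": 'print("pkg @ git+https://example.invalid/repo.git@49f3308")',
}
```

Here `pins_consistent(example)` holds because both placeholder files carry the same pin; a stale hash in any one file would make it fail.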
    "qwen3_moe": "Qwen3MoeSparseMoeBlock",
    "deepseek_v2": "DeepseekV2MoE",
    "gpt_oss": "GptOssDecoderLayer",
    "lfm2_moe": "Lfm2MoeSparseMoeBlock",
🧩 Analysis chain
LGTM! Please verify the class exists.
The addition follows the established pattern and correctly extends MoE architecture support for LFM2.
Run the following script to verify that the Lfm2MoeSparseMoeBlock class is accessible:
🏁 Script executed:
#!/bin/bash
# Description: Verify that Lfm2MoeSparseMoeBlock exists in the codebase or dependencies
# Search for the class definition
rg -n "class Lfm2MoeSparseMoeBlock"
# Search for imports of this class
rg -n "Lfm2MoeSparseMoeBlock"
# Check if it's from transformers library (common pattern for these MoE blocks)
rg -n "from transformers.*Lfm2MoeSparseMoeBlock|import.*Lfm2MoeSparseMoeBlock"
Length of output: 79
Add missing Lfm2MoeSparseMoeBlock
The entry at src/axolotl/common/architectures.py:17 references Lfm2MoeSparseMoeBlock, but no definition or import exists in the codebase—add or import the class.
🤖 Prompt for AI Agents
In src/axolotl/common/architectures.py around line 17, the mapping includes
"lfm2_moe": "Lfm2MoeSparseMoeBlock" but that class is not defined or imported;
add the missing symbol by either importing Lfm2MoeSparseMoeBlock from its
defining module (add a top-level from <module_path> import
Lfm2MoeSparseMoeBlock) or implement the class in this file matching the
project's existing block API (same base class, constructor args and methods as
other sparse/moe blocks), then run tests/type checks to ensure the name resolves
and the mapping works.
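The `rg` search above only covers the repository, so a class shipped by an installed dependency would not be found by it. A complementary check is to ask Python whether the name resolves in a given module. The sketch below is a generic helper; the transformers module path in the comment is an assumption and is not stated anywhere in this review.

```python
import importlib


def class_resolves(module_path: str, class_name: str) -> bool:
    """Return True if module_path imports and exposes class_name as a class."""
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        # Module missing (e.g. dependency not installed or too old).
        return False
    return isinstance(getattr(module, class_name, None), type)


# Assumed location, to be confirmed against the installed transformers build:
# class_resolves("transformers.models.lfm2_moe.modeling_lfm2_moe",
#                "Lfm2MoeSparseMoeBlock")
```

Because the mapping stores the block name as a string, a check like this can run in a test without importing the class in architectures.py itself.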
📖 Documentation Preview: https://68e79faad724464e5ee9768d--resonant-treacle-0fd729.netlify.app
Deployed on Netlify from commit 7789bf8
Codecov Report
✅ All modified and coverable lines are covered by tests.
* feat: add lfm2 family and latest moe model
* fix: use ml-cross-entropy for lfm2 examples
(cherry picked from commit ab63b92)
Description
https://huggingface.co/LiquidAI/LFM2-8B-A1B
Motivation and Context
How has this been tested?
Ran LoRA model