feat: add lfm2 family and latest moe model#3208

Merged
winglian merged 2 commits into main from feat/lfm2-moe on Oct 9, 2025
Conversation

NanoCode012 (Collaborator) commented Oct 9, 2025

Description

https://huggingface.co/LiquidAI/LFM2-8B-A1B

How has this been tested?

Ran LoRA model

Summary by CodeRabbit

  • New Features

    • Added support for LFM2 and LFM2-MoE models, including multipack compatibility.
    • Introduced an example config for fine-tuning LFM2-8B with LoRA and 8-bit loading.
  • Documentation

    • Expanded LiquidAI README with installation and fine-tuning steps for LFM2-MoE, troubleshooting tips, consolidated Optimizations Guide, and new Related Resources (including LFM2-MoE blog).
    • Updated Cut Cross Entropy docs to list supported models (lfm2, lfm2_moe, lfm2_vl, llava).
  • Chores

    • Updated pinned commit for cut-cross-entropy dependency across guides, scripts, and Colab notebook.

coderabbitai (Bot, Contributor) commented Oct 9, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds LiquidAI example docs and a LoRA training YAML for LFM2-8B; updates cut-cross-entropy install pins across docs, scripts, and a Colab notebook; extends supported model mappings for LFM2/LFM2-MoE (architectures and multipack); updates cut-cross-entropy supported models list.

Changes

| Cohort / File(s) | Summary of changes |
|---|---|
| LiquidAI examples: examples/LiquidAI/README.md, examples/LiquidAI/lfm2-8b-a1b-lora.yaml | README updated with LFM2-MoE install/fine-tune instructions, troubleshooting note, consolidated optimization link, and resources; new YAML config for LFM2-8B LoRA SFT with dataset, training hyperparams, precision, packing, and plugins. |
| Cut Cross Entropy pin updates: examples/colab-notebooks/colab-axolotl-example.ipynb, scripts/cutcrossentropy_install.py, src/axolotl/integrations/cut_cross_entropy/README.md, src/axolotl/integrations/cut_cross_entropy/__init__.py | Bumps git commit for cut-cross-entropy[transformers] to 49f3308 in notebook, install script, README, and init guidance; README also expands Supported Models to include lfm2, lfm2_moe, lfm2_vl, and llava. |
| Model support wiring: src/axolotl/common/architectures.py, src/axolotl/monkeypatch/multipack.py | Adds "lfm2_moe": "Lfm2MoeSparseMoeBlock" to MOE_ARCH_BLOCK; adds "lfm2" and "lfm2_moe" to SUPPORTED_MULTIPACK_MODEL_TYPES. |
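
The model support wiring described in the last row can be pictured with a short sketch. It assumes `MOE_ARCH_BLOCK` is a plain dict from model type to block class name and `SUPPORTED_MULTIPACK_MODEL_TYPES` is a list of model-type strings. Only entries quoted in this PR are shown, and the two helper functions are hypothetical, not part of axolotl.

```python
from typing import Optional

# Illustrative subset only; the real tables in axolotl hold more entries.
MOE_ARCH_BLOCK = {
    "qwen3_moe": "Qwen3MoeSparseMoeBlock",
    "deepseek_v2": "DeepseekV2MoE",
    "gpt_oss": "GptOssDecoderLayer",
    "lfm2_moe": "Lfm2MoeSparseMoeBlock",  # added by this PR
}

SUPPORTED_MULTIPACK_MODEL_TYPES = [
    "lfm2",      # added by this PR
    "lfm2_moe",  # added by this PR
]


def supports_multipack(model_type: str) -> bool:
    """Hypothetical helper: is sample packing wired up for this model type?"""
    return model_type in SUPPORTED_MULTIPACK_MODEL_TYPES


def moe_block_name(model_type: str) -> Optional[str]:
    """Hypothetical helper: name of the MoE block class, or None for dense models."""
    return MOE_ARCH_BLOCK.get(model_type)


if __name__ == "__main__":
    for model_type in ("lfm2", "lfm2_moe"):
        print(model_type, supports_multipack(model_type), moe_block_name(model_type))
```

Note that the mapping stores class names as strings, so adding an entry does not by itself import anything.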

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

ready to merge

Suggested reviewers

  • winglian
  • djsaunde
  • SalmanMohammadi

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title accurately reflects the main objectives of the pull request by announcing support for the LFM2 model family and the new MoE architecture. It succinctly describes the primary additions without unnecessary detail. It is clear, specific, and aligns with the changeset that updates architecture mappings, configurations, and documentation to integrate LFM2 and its MoE variant. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

coderabbitai (Bot, Contributor) left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6f8ce02 and 9505941.

📒 Files selected for processing (8)
  • examples/LiquidAI/README.md (3 hunks)
  • examples/LiquidAI/lfm2-8b-a1b-lora.yaml (1 hunks)
  • examples/colab-notebooks/colab-axolotl-example.ipynb (1 hunks)
  • scripts/cutcrossentropy_install.py (1 hunks)
  • src/axolotl/common/architectures.py (1 hunks)
  • src/axolotl/integrations/cut_cross_entropy/README.md (2 hunks)
  • src/axolotl/integrations/cut_cross_entropy/__init__.py (1 hunks)
  • src/axolotl/monkeypatch/multipack.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: PyTest (3.11, 2.8.0)
  • GitHub Check: PyTest (3.11, 2.7.1)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
  • GitHub Check: preview
  • GitHub Check: PyTest from Source Dist (3.11, 2.8.0)
🔇 Additional comments (13)
examples/colab-notebooks/colab-axolotl-example.ipynb (1)

43-43: Pinned hash update looks good

Thanks for bumping the cut-cross-entropy commit to 49f3308 here—this keeps the Colab notebook aligned with the rest of the repo updates.

examples/LiquidAI/README.md (2)

9-9: LGTM!

Nice acknowledgment of the LiquidAI team's early access support.


36-42: Transformers commit 0c9a72e4… verified for LFM2-MoE support
The specified commit introduces all LFM2-MoE model code, configuration, documentation, and tests; the installation instructions are correct.

examples/LiquidAI/lfm2-8b-a1b-lora.yaml (4)

3-4: LGTM!

The CutCrossEntropyPlugin integration is correctly configured to reduce VRAM usage during training.


38-57: LGTM!

The training hyperparameters are well-configured:

  • Reasonable learning rate (5e-5) with cosine scheduler
  • Appropriate batch size and gradient accumulation
  • Proper evaluation and checkpointing cadence
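
As a rough illustration of the settings this thread mentions, the sketch below collects them in a Python dict. It is not the contents of lfm2-8b-a1b-lora.yaml: only the base model, LoRA, 8-bit loading, learning rate, scheduler, and the Cut Cross Entropy plugin come from this PR, the key names and the plugin import path are assumptions about axolotl's config schema, and `missing_keys` is a hypothetical helper.

```python
import json

# Values are taken from this PR thread; key names are assumed axolotl config keys.
lfm2_lora_config = {
    "base_model": "LiquidAI/LFM2-8B-A1B",  # from the PR description
    "load_in_8bit": True,                  # "8-bit loading" per the summary
    "adapter": "lora",                     # LoRA fine-tune per the summary
    "learning_rate": 5e-5,                 # quoted in the review
    "lr_scheduler": "cosine",              # quoted in the review
    "plugins": [
        # Assumed import path for the plugin named in the review.
        "axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin",
    ],
}


def missing_keys(config: dict, required: tuple) -> list:
    """Return the required keys that are absent from the config, in order."""
    return [key for key in required if key not in config]


if __name__ == "__main__":
    print(json.dumps(lfm2_lora_config, indent=2))
    print(missing_keys(lfm2_lora_config, ("base_model", "adapter", "learning_rate")))
```

Batch size, gradient accumulation, and evaluation cadence are left out because their values are not stated in this thread.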

30-30: Verify LoRA target modules regex
The pattern includes cross_attn (likely absent in this decoder-only model); confirm actual layer names in LFM2-8B-A1B and adjust the regex accordingly.
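
One way to act on this check is to run the regex against the model's module names and see which alternatives ever match. The sketch below uses a made-up pattern and a made-up list of names, since neither the actual regex nor the LFM2-8B-A1B layer names appear in this thread; with a loaded PyTorch model the names would typically come from `model.named_modules()`. It also assumes the pattern is applied with full-match semantics.

```python
import re

# Hypothetical pattern and module names, for illustration only.
lora_target_pattern = r".*\.(self_attn|cross_attn)\.(q_proj|k_proj|v_proj)$"
module_names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.feed_forward.w1",
]


def matched_modules(pattern: str, names: list) -> list:
    """Return the module names that the pattern fully matches."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


def has_component(names: list, component: str) -> bool:
    """True if any dotted module name contains the given path component."""
    return any(component in name.split(".") for name in names)


if __name__ == "__main__":
    print(matched_modules(lora_target_pattern, module_names))
    # A False here would suggest the cross_attn alternative is dead weight.
    print(has_component(module_names, "cross_attn"))
```

An alternative that never matches is harmless at runtime, but removing it keeps the config honest about which layers receive adapters.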


1-1: Base model availability confirmed
LiquidAI/LFM2-8B-A1B is published on Hugging Face Hub (MoE model, ~8.3B parameters, 32,768-token context) and can be loaded via transformers.

src/axolotl/integrations/cut_cross_entropy/README.md (2)

57-63: LGTM!

The supported models list has been correctly updated to include the new LFM2 family (lfm2, lfm2_moe, lfm2_vl) and llava, which aligns with the PR objectives.


22-22: LGTM!

The installation command is updated with the new commit hash (49f3308), consistent with other files in this PR.

src/axolotl/integrations/cut_cross_entropy/__init__.py (1)

38-38: LGTM!

The installation message string is correctly updated with the new commit hash (49f3308), maintaining consistency across all installation references in the codebase.

src/axolotl/monkeypatch/multipack.py (2)

48-49: LGTM with verification needed!

Adding "lfm2" and "lfm2_moe" to the supported multipack model types is correct and enables sample packing for these models. However, this depends on proper architecture mappings being defined elsewhere (see verification request above).


48-49: Confirm mapping for lfm2_moe
MOE_ARCH_BLOCK already defines "lfm2_moe": "Lfm2MoeSparseMoeBlock" — no further changes required.

scripts/cutcrossentropy_install.py (1)

32-32: Confirm LFM2 support in cut-cross-entropy update
Commit 49f3308 adds lfm2, lfm2_moe, and lfm2_vl transformer modules, verifying required LFM2 model support.

"qwen3_moe": "Qwen3MoeSparseMoeBlock",
"deepseek_v2": "DeepseekV2MoE",
"gpt_oss": "GptOssDecoderLayer",
"lfm2_moe": "Lfm2MoeSparseMoeBlock",

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

LGTM! Please verify the class exists.

The addition follows the established pattern and correctly extends MoE architecture support for LFM2.

Run the following script to verify that the Lfm2MoeSparseMoeBlock class is accessible:


🏁 Script executed:

#!/bin/bash
# Description: Verify that Lfm2MoeSparseMoeBlock exists in the codebase or dependencies

# Search for the class definition
rg -n "class Lfm2MoeSparseMoeBlock"

# Search for imports of this class
rg -n "Lfm2MoeSparseMoeBlock"

# Check if it's from transformers library (common pattern for these MoE blocks)
rg -n "from transformers.*Lfm2MoeSparseMoeBlock|import.*Lfm2MoeSparseMoeBlock"

Length of output: 79


Add missing Lfm2MoeSparseMoeBlock

The entry at src/axolotl/common/architectures.py:17 references Lfm2MoeSparseMoeBlock, but no definition or import exists in the codebase—add or import the class.

🤖 Prompt for AI Agents
In src/axolotl/common/architectures.py around line 17, the mapping includes
"lfm2_moe": "Lfm2MoeSparseMoeBlock" but that class is not defined or imported;
add the missing symbol by either importing Lfm2MoeSparseMoeBlock from its
defining module (add a top-level from <module_path> import
Lfm2MoeSparseMoeBlock) or implement the class in this file matching the
project's existing block API (same base class, constructor args and methods as
other sparse/moe blocks), then run tests/type checks to ensure the name resolves
and the mapping works.
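
A repository-wide text search only covers this codebase, so it cannot tell whether the class ships with an installed dependency. Because the mapping stores the class name as a string, a runtime lookup is a complementary check. The sketch below is a generic helper (`class_available` is a hypothetical name); the transformers module path in the comment is an assumption based on that library's usual layout and is not confirmed in this thread.

```python
import importlib


def class_available(module_name: str, class_name: str) -> bool:
    """Return True if the module imports cleanly and exposes the named attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, class_name)


if __name__ == "__main__":
    # Standard-library example that runs anywhere.
    print(class_available("collections", "OrderedDict"))
    # Assumed module path, to be tried in an environment with a transformers
    # build that includes LFM2-MoE:
    # class_available(
    #     "transformers.models.lfm2_moe.modeling_lfm2_moe",
    #     "Lfm2MoeSparseMoeBlock",
    # )
```

The helper returns False both when the module is missing and when the class is absent, so a False result still needs a look at which of the two happened.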

github-actions (Bot, Contributor) commented Oct 9, 2025

📖 Documentation Preview: https://68e79faad724464e5ee9768d--resonant-treacle-0fd729.netlify.app

Deployed on Netlify from commit 7789bf8

codecov (Bot) commented Oct 9, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

@winglian winglian merged commit ab63b92 into main Oct 9, 2025
16 checks passed
@winglian winglian deleted the feat/lfm2-moe branch October 9, 2025 15:21
flaviusburca pushed a commit to invergent-ai/axolotl that referenced this pull request Oct 18, 2025
* feat: add lfm2 family and latest moe model

* fix: use ml-cross-entropy for lfm2 examples

(cherry picked from commit ab63b92)
@coderabbitai coderabbitai Bot mentioned this pull request Dec 2, 2025
2 tasks
@coderabbitai coderabbitai Bot mentioned this pull request Mar 18, 2026