Merged
15 changes: 12 additions & 3 deletions examples/LiquidAI/README.md
@@ -6,6 +6,8 @@ LFM2 features a new hybrid Liquid architecture with multiplicative gates, short-

This guide shows how to fine-tune both the LFM2 and LFM2-VL models with Axolotl.

Thanks to the team at LiquidAI for giving us early access to prepare for these releases.

## Getting Started

1. Install Axolotl following the [installation guide](https://docs.axolotl.ai/docs/installation.html).
@@ -31,6 +33,14 @@ This guide shows how to fine-tune both the LFM2 and LFM2-VL models with Axolotl.
axolotl train examples/LiquidAI/lfm2-vl-lora.yaml
```

**LFM2-MoE**
```bash
pip install git+https://github.com/huggingface/transformers.git@0c9a72e4576fe4c84077f066e585129c97bfd4e6

# LoRA SFT (1x48GB @ 16.2GiB)
axolotl train examples/LiquidAI/lfm2-8b-a1b-lora.yaml
```

### TIPS

- **Installation Error**: If you encounter `ImportError: ... undefined symbol ...` or `ModuleNotFoundError: No module named 'causal_conv1d_cuda'`, the `causal-conv1d` package may have been installed incorrectly. Try uninstalling it:
@@ -45,14 +55,13 @@ This guide shows how to fine-tune both the LFM2 and LFM2-VL models with Axolotl.

## Optimization Guides

-- [Multi-GPU Training](https://docs.axolotl.ai/docs/multi-gpu.html)
-- [LoRA Optimizations](https://docs.axolotl.ai/docs/lora_optims.html)
-- [Multi-Node Training](https://docs.axolotl.ai/docs/multi-node.html)
+- [Optimizations Guide](https://docs.axolotl.ai/docs/optimizations.html)

## Related Resources

- [LFM2 Blog](https://www.liquid.ai/blog/liquid-foundation-models-v2-our-second-series-of-generative-ai-models)
- [LFM2-VL Blog](https://www.liquid.ai/blog/lfm2-vl-efficient-vision-language-models)
- [LFM2-MoE Blog](https://www.liquid.ai/blog/lfm2-8b-a1b-an-efficient-on-device-mixture-of-experts)
- [Axolotl Docs](https://docs.axolotl.ai)
- [Axolotl GitHub](https://github.com/axolotl-ai-cloud/axolotl)
- [Axolotl Discord](https://discord.gg/7m9sfhzaf3)
3 changes: 2 additions & 1 deletion examples/LiquidAI/lfm2-350m-fft.yaml
@@ -1,6 +1,7 @@
base_model: LiquidAI/LFM2-350M

-chunked_cross_entropy: true
+plugins:
+  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin

eot_tokens:
- "<|im_end|>"
59 changes: 59 additions & 0 deletions examples/LiquidAI/lfm2-8b-a1b-lora.yaml
@@ -0,0 +1,59 @@
base_model: LiquidAI/LFM2-8B-A1B

plugins:
- axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin

load_in_8bit: true

eot_tokens:
- "<|im_end|>"
datasets:
- path: mlabonne/FineTome-100k
type: chat_template
split: train[:20%]
field_messages: conversations
message_field_role: from
message_field_content: value
dataset_prepared_path: last_run_prepared
val_set_size: 0.05
output_dir: ./outputs/out

sequence_len: 4096
sample_packing: true

adapter: lora
lora_model_dir:

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules: 'model.layers.[\d]+.(mlp|cross_attn|self_attn).(up|down|gate|q|k|v|o)_proj'

wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 2
micro_batch_size: 4
num_epochs: 1
optimizer: adamw_torch_fused
lr_scheduler: cosine
learning_rate: 5e-5

bf16: true
tf32: true

gradient_checkpointing: true
resume_from_checkpoint:
logging_steps: 1
flash_attention: true

warmup_ratio: 0.1
evals_per_epoch: 2
saves_per_epoch: 1

weight_decay: 0.0

# save_first_step: true # uncomment this to validate checkpoint saving works with your config
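
The `lora_target_modules` value in this config is a single regex string rather than a list of names, so it helps to check which modules it selects before training. The sketch below assumes the string is applied as a full match against each module's dotted name (the behaviour of PEFT for string targets, as far as I know). The module names in `candidates` are hypothetical placeholders, not taken from the LFM2-MoE implementation. Substitute the output of `model.named_modules()` for a real check.

```python
import re

# Pattern copied verbatim from `lora_target_modules` in the config above.
PATTERN = r"model.layers.[\d]+.(mlp|cross_attn|self_attn).(up|down|gate|q|k|v|o)_proj"


def is_lora_target(module_name: str, pattern: str = PATTERN) -> bool:
    """Return True if the entire module name matches the pattern."""
    return re.fullmatch(pattern, module_name) is not None


# Hypothetical module names, for illustration only. Real names come from
# `for name, _ in model.named_modules(): print(name)`.
candidates = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.12.mlp.up_proj",
    "model.layers.0.self_attn.out_proj",
    "lm_head",
]

matched = [name for name in candidates if is_lora_target(name)]
for name in candidates:
    print(f"{name}: {'target' if is_lora_target(name) else 'skipped'}")
```

Under the full-match assumption, a projection named `out_proj` would not be selected, because the pattern only allows `o_proj`. If the real model uses names outside the alternatives listed in the pattern, those layers get no adapter.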
3 changes: 3 additions & 0 deletions examples/LiquidAI/lfm2-vl-lora.yaml
@@ -3,6 +3,9 @@ trust_remote_code: true
model_type: AutoModelForImageTextToText
processor_type: AutoProcessor

plugins:
- axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin

# these 3 lines are needed for now to handle vision chat templates w images
skip_prepare_dataset: true
remove_unused_columns: false
2 changes: 1 addition & 1 deletion examples/colab-notebooks/colab-axolotl-example.ipynb
@@ -40,7 +40,7 @@
"%%capture\n",
"# This step can take ~5-10 minutes to install dependencies\n",
"!pip install --no-build-isolation axolotl[flash-attn]>=0.9.1\n",
-"!pip install \"cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@147ea28\""
+"!pip install \"cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@49f3308\""
]
},
{
2 changes: 1 addition & 1 deletion scripts/cutcrossentropy_install.py
@@ -29,5 +29,5 @@

print(
UNINSTALL_PREFIX
-    + f'{UV_PREFIX}pip install "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@147ea28"'
+    + f'{UV_PREFIX}pip install "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@49f3308"'
)
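
This change updates the same commit pin in four places: the notebook, this script, the CCE README and the plugin's install message. As a rough sketch, the command string could be assembled from a single constant. The definitions of `UV_PREFIX` and `UNINSTALL_PREFIX` are not visible in this diff, so the prefix values below are assumptions, and `build_install_command` is a hypothetical helper, not part of the script.

```python
CCE_REPO = "git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git"
CCE_COMMIT = "49f3308"  # the pin this change moves to


def build_install_command(use_uv: bool = False, uninstall_first: bool = False) -> str:
    """Assemble the pip command string that the install script prints.

    Both prefixes are assumptions for illustration; the real script's
    UV_PREFIX / UNINSTALL_PREFIX logic is not shown in this diff.
    """
    uv_prefix = "uv " if use_uv else ""
    uninstall_prefix = (
        f"{uv_prefix}pip uninstall -y cut-cross-entropy && " if uninstall_first else ""
    )
    return (
        uninstall_prefix
        + f'{uv_prefix}pip install "cut-cross-entropy[transformers] @ {CCE_REPO}@{CCE_COMMIT}"'
    )


print(build_install_command())
```

With the pin in one constant, a future bump would touch one line in the script, though the README and notebook would still need manual updates.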
1 change: 1 addition & 0 deletions src/axolotl/common/architectures.py
@@ -14,4 +14,5 @@
"qwen3_moe": "Qwen3MoeSparseMoeBlock",
"deepseek_v2": "DeepseekV2MoE",
"gpt_oss": "GptOssDecoderLayer",
"lfm2_moe": "Lfm2MoeSparseMoeBlock",


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

LGTM! Please verify the class exists.

The addition follows the established pattern and correctly extends MoE architecture support for LFM2.

Run the following script to verify that the Lfm2MoeSparseMoeBlock class is accessible:


🏁 Script executed:

#!/bin/bash
# Description: Verify that Lfm2MoeSparseMoeBlock exists in the codebase or dependencies

# Search for the class definition
rg -n "class Lfm2MoeSparseMoeBlock"

# Search for imports of this class
rg -n "Lfm2MoeSparseMoeBlock"

# Check if it's from transformers library (common pattern for these MoE blocks)
rg -n "from transformers.*Lfm2MoeSparseMoeBlock|import.*Lfm2MoeSparseMoeBlock"

Length of output: 79


Add missing Lfm2MoeSparseMoeBlock

The entry at `src/axolotl/common/architectures.py:17` references `Lfm2MoeSparseMoeBlock`, but no definition or import exists in the codebase. Add or import the class.

🤖 Prompt for AI Agents
In src/axolotl/common/architectures.py around line 17, the mapping includes
"lfm2_moe": "Lfm2MoeSparseMoeBlock" but that class is not defined or imported;
add the missing symbol by either importing Lfm2MoeSparseMoeBlock from its
defining module (add a top-level from <module_path> import
Lfm2MoeSparseMoeBlock) or implement the class in this file matching the
project's existing block API (same base class, constructor args and methods as
other sparse/moe blocks), then run tests/type checks to ensure the name resolves
and the mapping works.
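
The mapping values in the hunk above are plain strings, so a repository-wide `rg` search would not find a class that lives in a dependency. A minimal sketch of checking the installed environment instead is shown below. The `transformers` module path in the comment is an assumption based on that library's usual `models/<name>/modeling_<name>.py` layout, and `class_is_importable` is a hypothetical helper.

```python
import importlib


def class_is_importable(module_path: str, class_name: str) -> bool:
    """Return True if `module_path` imports and exposes `class_name`."""
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        # Covers ModuleNotFoundError too: the module (or one of its own
        # dependencies) is not available in this environment.
        return False
    return hasattr(module, class_name)


# Assumed module path, not confirmed by this diff:
# class_is_importable(
#     "transformers.models.lfm2_moe.modeling_lfm2_moe",
#     "Lfm2MoeSparseMoeBlock",
# )
print(class_is_importable("collections", "OrderedDict"))
```

If the commented call returns `False` after installing the `transformers` commit pinned in the README hunk, either the assumed module path is wrong or the class is not in that build.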

}
6 changes: 5 additions & 1 deletion src/axolotl/integrations/cut_cross_entropy/README.md
@@ -19,7 +19,7 @@ python scripts/cutcrossentropy_install.py | sh

- If you are installing from pip
```bash
-pip3 uninstall -y cut-cross-entropy && pip3 install "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@147ea28"
+pip3 uninstall -y cut-cross-entropy && pip3 install "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@49f3308"
```

## Usage
@@ -54,9 +54,13 @@ plugins:
- granitemoehybrid
- hunyuan_v1_dense
- hunyuan_v1_moe
- lfm2
- lfm2_moe
- lfm2_vl
- llama
- llama4
- llama4_text
- llava
- mistral
- mistral3
- mixtral
2 changes: 1 addition & 1 deletion src/axolotl/integrations/cut_cross_entropy/__init__.py
@@ -35,7 +35,7 @@

_CCE_INSTALL_MESSAGE = (
"Please install Axolotl's fork of cut_cross_entropy with transformers support using "
-    '`pip install "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@147ea28"`'
+    '`pip install "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@49f3308"`'
)


2 changes: 2 additions & 0 deletions src/axolotl/monkeypatch/multipack.py
@@ -45,6 +45,8 @@
"gpt_oss",
"arcee",
"seed_oss",
"lfm2",
"lfm2_moe",
]

