Feat: add arcee by NanoCode012 · Pull Request #3028 · axolotl-ai-cloud/axolotl

NanoCode012 · 2025-08-07T04:57:29Z

Description

Arcee.ai's AFM model was trained in Axolotl. This PR adds configs for fine-tuning it.

How has this been tested?

Manual run working.

Summary by CodeRabbit

New Features
- Added documentation and configuration for fine-tuning ArceeAI's AFM 4.5B model using Axolotl, including a sample YAML config and setup instructions.
Bug Fixes
- Expanded supported model types for multipack and cut_cross_entropy integrations to include "arcee" and other models.
Documentation
- Updated and added README files with installation instructions, supported models, and resource links.
Chores
- Updated cut-cross-entropy package installation to a new commit hash in scripts, documentation, and notebooks.
- Minor formatting improvements in YAML configuration files.

coderabbitai · 2025-08-07T04:57:37Z

📝 Walkthrough

Walkthrough

This update introduces a new README and YAML configuration for fine-tuning ArceeAI's AFM-4.5B model, updates the cut-cross-entropy package commit hash in multiple locations, adds "arcee" to supported model types, and makes minor formatting corrections to Magistral YAML files. The cut-cross-entropy documentation now includes additional supported models.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Arcee Model Fine-Tuning Documentation & Config: `examples/arcee/README.md`, `examples/arcee/afm-4.5b-qlora.yaml` | Adds a README and QLoRA YAML config for fine-tuning ArceeAI's AFM-4.5B model using Axolotl, with instructions, parameters, and resource links. |
| Cut-Cross-Entropy Commit Hash Update: `examples/colab-notebooks/colab-axolotl-example.ipynb`, `scripts/cutcrossentropy_install.py`, `src/axolotl/integrations/cut_cross_entropy/README.md`, `src/axolotl/integrations/cut_cross_entropy/__init__.py` | Updates the cut-cross-entropy package install command to use commit `bb8d9f8` instead of `48b5169` in several locations; expands supported models in the README. |
| Magistral YAML Formatting: `examples/magistral/magistral-small-fsdp-qlora.yaml`, `examples/magistral/magistral-small-qlora.yaml`, `examples/magistral/magistral-small-think-qlora.yaml` | Removes single blank lines between config entries for improved formatting; no value changes. |
| Supported Model Types Update: `src/axolotl/monkeypatch/multipack.py` | Adds `"arcee"` to the list of supported multipack model types. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

chore: update cce commit to include gemma3n fixes #2881: Also updates the cut-cross-entropy package commit hash, indicating both PRs modify installation references for this package.
use latest version of cce fork for SP fix #2871: Updates cut-cross-entropy to a different commit hash, showing related changes in dependency management.
feat: upgrade cce commit to include smollm3, granite, granitemoe #2993: Updates the cut-cross-entropy commit hash and adds support for other models, similar to this PR's changes for "arcee".

Suggested labels

scheduled_release

Suggested reviewers

NanoCode012

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (6)

src/axolotl/monkeypatch/multipack.py (1)
13-40: Maintain alphabetical ordering of SUPPORTED_MULTIPACK_MODEL_TYPES to ease future diffs

"arcee" was appended at the end of the list, breaking the alphabetical (and grouped-by-family) ordering that makes the list easy to scan and keeps merge conflicts small.
Consider re-inserting the new entry in its sorted position (immediately before "deepseek_v3", as the diff below does).
@@
-    "deepseek_v3",
-    "glm",
+    "arcee",
+    "deepseek_v3",
+    "glm",
@@
-    "smollm3",
-    "arcee",
+    "smollm3",
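
One way to keep this ordering from regressing is a small sortedness check. The sketch below uses a shortened, hypothetical stand-in for `SUPPORTED_MULTIPACK_MODEL_TYPES` (the real list in `src/axolotl/monkeypatch/multipack.py` is longer and may be grouped by family rather than strictly alphabetical), and the helper names are illustrative, not part of Axolotl.

```python
import bisect

# Shortened, hypothetical stand-in for the real list in multipack.py.
MODEL_TYPES = ["deepseek_v3", "glm", "smollm3"]


def insert_sorted(model_types, new_type):
    """Return a new list with new_type inserted at its sorted position."""
    result = list(model_types)
    if new_type not in result:
        # insort assumes the input list is already sorted.
        bisect.insort(result, new_type)
    return result


def is_sorted(model_types):
    """True when the list is in ascending alphabetical order."""
    return all(a <= b for a, b in zip(model_types, model_types[1:]))


updated = insert_sorted(MODEL_TYPES, "arcee")
print(updated)  # ['arcee', 'deepseek_v3', 'glm', 'smollm3']
```

A check like `is_sorted(...)` could sit in a unit test so that appending to the end of the list fails fast.
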
src/axolotl/integrations/cut_cross_entropy/README.md (2)
22-23: Minor: escape the commit hash with back-ticks for consistency

All other code-style snippets use back-ticks around the command – the second pip3 line would read cleaner with them.
-pip3 uninstall -y cut-cross-entropy && pip3 install "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@bb8d9f8"
+pip3 uninstall -y cut-cross-entropy && \
+pip3 install "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@bb8d9f8"
34-56: Consider alphabetising the Supported Models list

The growing list is becoming hard to scan. Alphabetical ordering (or grouping by vendor) would make future additions trivial and avoid accidental duplicates.

Not blocking – informational only.
examples/arcee/README.md (2)

30-30: Fill the VRAM placeholder

The sentence `This config uses about (---) VRAM.` still contains a placeholder. Replace it with an approximate value (e.g., “≈ 24 GB” on A100-80GB) so users can gauge hardware needs.

11-11: Minor phrasing tweak for clarity

“You need to install from main as AFM is only on nightly or use our latest Docker images” is hard to parse. Consider:

“Install Axolotl from the main branch (nightly) or pull our latest Docker image, because AFM support hasn’t landed in a stable release yet.”

examples/arcee/afm-4.5b-qlora.yaml (1)

52-53: bf16: auto & tf32: false – document hardware expectations

bf16: auto silently falls back to fp16 if the GPU lacks BF16; disabling TF32 may hurt throughput on Ampere/Hopper. Add a comment so users know why TF32 is disabled and whether BF16 fallback is acceptable.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4bce713 and 8a03796.

📒 Files selected for processing (10)

examples/arcee/README.md (1 hunks)
examples/arcee/afm-4.5b-qlora.yaml (1 hunks)
examples/colab-notebooks/colab-axolotl-example.ipynb (1 hunks)
examples/magistral/magistral-small-fsdp-qlora.yaml (0 hunks)
examples/magistral/magistral-small-qlora.yaml (0 hunks)
examples/magistral/magistral-small-think-qlora.yaml (0 hunks)
scripts/cutcrossentropy_install.py (1 hunks)
src/axolotl/integrations/cut_cross_entropy/README.md (3 hunks)
src/axolotl/integrations/cut_cross_entropy/__init__.py (1 hunks)
src/axolotl/monkeypatch/multipack.py (1 hunks)

💤 Files with no reviewable changes (3)

examples/magistral/magistral-small-fsdp-qlora.yaml
examples/magistral/magistral-small-think-qlora.yaml
examples/magistral/magistral-small-qlora.yaml

🧰 Additional context used

🪛 LanguageTool

examples/arcee/README.md

[style] ~31-~31: Consider using polite language here.
Context: ...`` This config uses about (---) VRAM. Let us know how it goes. Happy finetuning! 🚀 ### ...

(INSERT_PLEASE)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)

GitHub Check: preview
GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
GitHub Check: pre-commit
GitHub Check: PyTest (3.11, 2.7.0)
GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
GitHub Check: PyTest (3.11, 2.6.0)
GitHub Check: PyTest (3.11, 2.7.1)
GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
GitHub Check: pre-commit

🔇 Additional comments (6)

src/axolotl/monkeypatch/multipack.py (1)

55-57: Double-check that “arcee” needs no special multipack patch

patch_for_multipack() currently triggers extra logic only for "mixtral" when DeepSpeed-ZeRO-3 is enabled.
If the Arcee model family requires bespoke forwarding / _get_unpad_data handling similar to Mixtral or Qwen-MOE, you may need to add a dedicated branch here (or in cut_cross_entropy) to avoid silent performance penalties or shape mismatches.

Please confirm that the stock causal-LM forward path suffices; otherwise, add an explicit patch or TODO.

src/axolotl/integrations/cut_cross_entropy/__init__.py (1)

35-38: Verify commit bb8d9f8 exists and retains Axolotl patches

Hard-pinning to a specific commit helps with reproducibility, but CI cannot find bb8d9f8 in the axolotl-ai-cloud/ml-cross-entropy repo. Please:

Confirm that commit bb8d9f8 has been pushed to GitHub under axolotl-ai-cloud/ml-cross-entropy.

Once available, verify that it still defines:

AXOLOTL_CCE_FORK = True

Exports cut_cross_entropy.transformers.patch.cce_patch

Without the correct hash and these patches in place, the installation instructions will break.
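
The two checks requested above can be scripted after installing the pinned commit. The sketch below is a minimal, hypothetical helper (`check_exports` is not part of Axolotl or cut-cross-entropy). It assumes both names are reachable from `cut_cross_entropy.transformers.patch`; this thread does not state where `AXOLOTL_CCE_FORK` is defined, so the module path may need adjusting.

```python
import importlib


def check_exports(module_name, attr_names):
    """Report which attributes a module exposes, without raising if it is absent."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module (or one of its dependencies) is not installed.
        return {name: False for name in attr_names}
    return {name: hasattr(module, name) for name in attr_names}


# Assumed module path, taken from the review comment; the location of
# AXOLOTL_CCE_FORK inside the fork is an assumption.
report = check_exports(
    "cut_cross_entropy.transformers.patch",
    ["cce_patch", "AXOLOTL_CCE_FORK"],
)
print(report)
```

Any `False` entry in the report suggests that the installed package is missing, or is not the fork the integration expects.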

scripts/cutcrossentropy_install.py (1)

30-33: Hash consistency verified – no stale references found

Ran rg --fixed-strings 48b5169 and confirmed there are no remaining occurrences of the old hash. The new hash bb8d9f8 is used consistently in scripts/cutcrossentropy_install.py, the README, and __init__.py. No further action required.
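
For environments without `rg`, the same stale-hash search can be approximated in Python. This is a rough sketch, not the command the reviewer ran; the function name and the default file suffixes are assumptions chosen to match the file types touched in this PR.

```python
from pathlib import Path


def find_stale_references(root, needle, suffixes=(".py", ".md", ".ipynb")):
    """Return sorted relative paths under root whose text contains needle."""
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        if needle in text:
            hits.append(str(path.relative_to(root)))
    return sorted(hits)
```

Calling `find_stale_references(".", "48b5169")` from the repository root should return an empty list once every reference to the old hash has been updated.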

examples/colab-notebooks/colab-axolotl-example.ipynb (1)

42-44: Pinned SHA bumped—validate downstream impact

The cut-cross-entropy install now targets commit bb8d9f8. Ensure:

That SHA still exposes the transformers extra (install will break otherwise).

Any breaking API/CLI changes between 48b5169 → bb8d9f8 are reflected in later notebook cells and docs.

If you have already run the notebook end-to-end after this bump, all good.

examples/arcee/README.md (1)

16-16: Verify the PyTorch version requirement

`# Ensure you have Pytorch installed (Pytorch 2.6.0 min)` sets a minimum of PyTorch 2.6.0. The CI check names listed in this review carry the version tags 2.6.0, 2.7.0, and 2.7.1, so confirm that 2.6.0 is the intended floor.
Link to the matching install command, or adjust the version floor if a different minimum is meant.
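
The README comment quoted above sets a floor of 2.6.0, and the CI check names in this review are tagged 2.6.0, 2.7.0, and 2.7.1 (these appear to be PyTorch versions). A version floor like this can be checked at runtime; the sketch below is a hypothetical helper that compares version strings with the standard library only, and is not code from Axolotl.

```python
import re


def parse_version(version):
    """Extract the leading numeric (major, minor, patch) tuple from a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def meets_minimum(installed, minimum="2.6.0"):
    """True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


# Local build tags such as "+cu126" are ignored by the parser.
print(meets_minimum("2.7.1+cu126"))  # True
print(meets_minimum("2.5.1"))        # False
```

In a real environment the first argument would be `torch.__version__`.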

examples/arcee/afm-4.5b-qlora.yaml (1)

17-18: val_set_size datatype

val_set_size: 0.1 is parsed as a float (10 %) in recent Axolotl versions, but older releases expected an integer sample count. Verify you’re on a commit that supports float fractions; otherwise training will crash on schema validation.
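
The two readings of `val_set_size` described above (a fraction of the dataset versus an absolute sample count) can be made concrete. The helper below is hypothetical and is not Axolotl's implementation; in particular, rounding the fractional case up is an assumption of this sketch.

```python
import math


def resolve_val_count(val_set_size, num_rows):
    """Translate a val_set_size setting into a number of held-out rows.

    Hypothetical helper: a float in (0, 1) is read as a fraction of the
    dataset, an int as an absolute row count, and 0 as "no validation split".
    """
    if isinstance(val_set_size, bool):
        raise TypeError("val_set_size must be a number, not a bool")
    if val_set_size == 0:
        return 0
    if isinstance(val_set_size, float):
        if not 0.0 < val_set_size < 1.0:
            raise ValueError("fractional val_set_size must be between 0 and 1")
        # Rounding up is an assumption made for this sketch.
        return math.ceil(val_set_size * num_rows)
    if isinstance(val_set_size, int):
        if not 0 < val_set_size < num_rows:
            raise ValueError("integer val_set_size must be smaller than the dataset")
        return val_set_size
    raise TypeError(f"unsupported val_set_size: {val_set_size!r}")


print(resolve_val_count(0.25, 8))    # fraction of the dataset
print(resolve_val_count(100, 2000))  # absolute row count
```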

coderabbitai · 2025-08-07T05:02:42Z

+lora_model_dir:
+


🛠️ Refactor suggestion

Unset lora_model_dir may override CLI resume logic

lora_model_dir: is present but empty. Axolotl interprets an empty string as “use the same directory as output_dir”, which can silently overwrite checkpoints when resuming. If that’s intentional, drop the key; otherwise set an explicit path.

-lora_model_dir:
+# lora_model_dir: ./outputs/afm-4.5b-qlora-adapter

📝 Committable suggestion

-lora_model_dir:
+lora_model_dir: ./outputs/afm-4.5b-qlora-adapter

🤖 Prompt for AI Agents

In examples/arcee/afm-4.5b-qlora.yaml around lines 21 to 22, the key `lora_model_dir` is present but set to an empty value, which Axolotl treats as the same directory as `output_dir`, potentially overwriting checkpoints during resume. To fix this, either remove the `lora_model_dir` key entirely if you want to use the default resume behavior, or set it explicitly to a different directory path to avoid accidental overwrites.

github-actions · 2025-08-07T05:02:55Z

📖 Documentation Preview: https://6895e8725f3f8d711fcf3adf--resonant-treacle-0fd729.netlify.app

Deployed on Netlify from commit eb0f860

codecov · 2025-08-07T05:14:37Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

examples/arcee/README.md (1)
3-3: Fix malformed Markdown link
The URL is wrapped in two pairs of parentheses, so Markdown renders it as plain text instead of a hyperlink.
-[Arcee Foundation Models (AFM)]((https://huggingface.co/collections/arcee-ai/afm-45b-68823397c351603014963473))
+[Arcee Foundation Models (AFM)](https://huggingface.co/collections/arcee-ai/afm-45b-68823397c351603014963473)

🧹 Nitpick comments (1)

examples/arcee/README.md (1)
1-1: Hyphenate “Fine-tune” in the header
Style nit: “Fine-tune” is normally hyphenated when used as a verb-noun compound in titles.
-# Finetune ArceeAI's AFM with Axolotl
+# Fine-tune ArceeAI's AFM with Axolotl

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 440f980 and c780f79.

📒 Files selected for processing (2)

examples/arcee/README.md (1 hunks)
examples/arcee/afm-4.5b-qlora.yaml (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

examples/arcee/afm-4.5b-qlora.yaml

🧰 Additional context used

🪛 LanguageTool

examples/arcee/README.md

[style] ~31-~31: Consider using polite language here.
Context: ...` This config uses about 7.8GiB VRAM. Let us know how it goes. Happy finetuning! 🚀 ### ...

(INSERT_PLEASE)

🔇 Additional comments (1)

examples/arcee/README.md (1)

16-17: Verify PyTorch version and fix casing
The comment sets a minimum of “PyTorch 2.6.0”; confirm that this is the intended floor. Also capitalise “PyTorch”.
-# Ensure you have Pytorch installed (Pytorch 2.6.0 min)
+# Ensure you have PyTorch installed (PyTorch 2.6.0 min)
Please confirm the minimum supported version and update the doc accordingly.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

examples/arcee/README.md (2)
1-1: Unify “fine-tune” spelling for consistency & professionalism

The doc alternates between “Finetune/finetuning” and the standard “fine-tune/fine-tuning”. Standardising improves polish and avoids search/grep misses.
-# Finetune ArceeAI's AFM with Axolotl
+# Fine-tune ArceeAI's AFM with Axolotl
@@
-This guide shows how to fine-tune it with Axolotl with multi-turn conversations and proper masking.
+This guide shows how to fine-tune it with Axolotl using multi-turn conversations and proper masking.
@@
-Let us know how it goes. Happy finetuning! 🚀
+Let us know how it goes. Happy fine-tuning! 🚀
Also applies to: 5-5, 32-32

20-21: Hard-pinning build tools may create avoidable conflicts

Locking packaging==23.2, setuptools==75.8.0, and wheel can clash with users’ existing environments and future PyPI security fixes. Unless a specific bug is being avoided, consider relaxing or dropping these pins.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c780f79 and 22b7b23.

📒 Files selected for processing (1)

examples/arcee/README.md (1 hunks)

🔇 Additional comments (2)

examples/arcee/README.md (2)

16-17: Please verify the minimum PyTorch version

The README requires PyTorch 2.6.0 or newer. Confirm that this floor matches the versions the project is tested against, so that users and automated setup scripts are not misled.

3-3: Markdown link fixed – looks good

The earlier extra bracket has been removed; the link now renders correctly.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (12)

examples/arcee/README.md (12)
1-1: Hyphenate and unify branding in the title

Minor wording polish for consistency with the rest of the doc (“Arcee.ai”) and standard hyphenation.
-# Finetune ArceeAI's AFM with Axolotl
+# Fine-tune Arcee.ai's AFM with Axolotl
3-3: Grammar: “4.5B-parameter open-weight models”

Hyphenate compound modifiers and numbers used adjectivally.
-[Arcee Foundation Models (AFM)](https://huggingface.co/collections/arcee-ai/afm-45b-68823397c351603014963473) are a family of 4.5B parameter open weight models trained by Arcee.ai.
+[Arcee Foundation Models (AFM)](https://huggingface.co/collections/arcee-ai/afm-45b-68823397c351603014963473) are a family of 4.5B-parameter, open-weight models trained by Arcee.ai.
5-5: Tighten sentence and avoid double “with”

Small readability tweak.
-This guide shows how to fine-tune it with Axolotl with multi-turn conversations and proper masking.
+This guide shows how to fine-tune AFM with Axolotl for multi-turn conversations with proper conversation masking.
7-7: Fix phrasing for supervised fine-tuning
-Thanks to the team at Arcee.ai for using Axolotl in supervised fine-tuning the AFM model.
+Thanks to the Arcee.ai team for using Axolotl for supervised fine-tuning of the AFM model.
11-11: Clarify “nightly” wording

Make it explicit that AFM support is on main/nightly builds or Docker.
-1. Install Axolotl following the [installation guide](https://docs.axolotl.ai/docs/installation.html). You need to install from main as AFM is only on nightly or use our latest [Docker images](https://docs.axolotl.ai/docs/docker.html).
+1. Install Axolotl following the [installation guide](https://docs.axolotl.ai/docs/installation.html). Install from the main branch (nightly), where AFM support is currently available, or use our latest [Docker images](https://docs.axolotl.ai/docs/docker.html).
16-22: Installation accuracy and FlashAttention build notes

PyTorch/CUDA/FlashAttention compatibility is hardware- and environment-dependent; the “2.6.0 min” claim may not be universally correct. Consider softening and pointing to the support matrix.

Pessimistic pinning of packaging==23.2 and a very specific setuptools==75.8.0 can cause avoidable conflicts. Prefer upgrading tooling without strict pins unless required.
-# Ensure you have Pytorch installed (Pytorch 2.6.0 min)
+# Ensure you have PyTorch installed (verify supported CUDA/driver versions for your GPU; see Axolotl/FlashAttention docs)
@@
-pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
+pip3 install -U pip setuptools wheel ninja
Optional follow-up additions (no diff shown):

Note: FlashAttention generally requires Ampere/Hopper GPUs and a matching CUDA toolkit. If building flash-attn fails, try installing without the flash-attn extra, or install prebuilt wheels where available.

If needed, export CUDA_HOME and ensure nvcc is on PATH.

30-30: Qualify the VRAM figure with hardware and config context

VRAM usage depends on GPU, seq length, micro-batch size, gradient checkpointing, quantization, etc. Add the measurement context (GPU model, seq_len, micro_batch_size) or rephrase as an estimate.

32-32: Hyphenate “fine-tuning”

Also addresses the stylistic hint flagged by static analysis.
-Let us know how it goes. Happy finetuning! 🚀
+Let us know how it goes. Happy fine-tuning! 🚀
34-34: Heading case

Match typical style used elsewhere in the docs.
-### TIPS
+### Tips
36-38: Wording and hyphenation; add a caution for full FT

Polish wording and suggest noting that full FT may require adjusting memory-sensitive params.
-- For inference, the official Arcee.ai team recommends `top_p: 0.95`, `temperature: 0.5`, `top_k: 50`, and `repeat_penalty: 1.1`.
-- You can run a full finetuning by removing the `adapter: qlora` and `load_in_4bit: true` from the config.
+- For inference, the Arcee.ai team recommends `top_p: 0.95`, `temperature: 0.5`, `top_k: 50`, and `repeat_penalty: 1.1`.
+- To run full fine-tuning, remove `adapter: qlora` and `load_in_4bit: true` from the config.
Follow-up (no diff): Add a note that full FT typically increases VRAM needs and may require adjusting micro_batch_size, gradient_accumulation_steps, and enabling gradient checkpointing.
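
The recommended sampling settings use the name `repeat_penalty`, which is the spelling used by llama.cpp-style runtimes; Hugging Face `generate()` calls the equivalent setting `repetition_penalty`. The sketch below shows one way to carry the README values across; the helper name is hypothetical, and enabling `do_sample` is an addition of this sketch rather than something stated in the README.

```python
# Sampling settings quoted in the README tip above.
RECOMMENDED_SAMPLING = {
    "top_p": 0.95,
    "temperature": 0.5,
    "top_k": 50,
    "repeat_penalty": 1.1,
}


def to_transformers_kwargs(settings):
    """Rename keys so the settings can be passed to a transformers generate() call."""
    renamed = {"repeat_penalty": "repetition_penalty"}
    kwargs = {renamed.get(key, key): value for key, value in settings.items()}
    # Sampling knobs only take effect when sampling is enabled.
    kwargs["do_sample"] = True
    return kwargs


print(to_transformers_kwargs(RECOMMENDED_SAMPLING))
```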

38-39: Dataset link + quick example

Consider adding a minimal OpenAI Messages JSON example for clarity.

Verify that the anchor #chat_template is correct for the linked page.

Example snippet to include:
[
  {
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Explain QLoRA in one sentence." },
      { "role": "assistant", "content": "QLoRA fine-tunes a 4-bit quantized model using low-rank adapters to reduce memory usage." }
    ]
  }
]
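
Rows in this shape are easy to sanity-check before training. The validator below is a minimal sketch under stated assumptions: the allowed role set is limited to the three roles shown in the example (real datasets may also use others, such as tool roles), and `validate_rows` is a hypothetical helper, not an Axolotl API.

```python
import json

# Assumption: only the roles that appear in the example above.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_rows(rows):
    """Return a list of human-readable problems; empty means the rows look well-formed."""
    problems = []
    for i, row in enumerate(rows):
        messages = row.get("messages") if isinstance(row, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append(f"row {i}: missing or empty 'messages' list")
            continue
        for j, msg in enumerate(messages):
            if not isinstance(msg, dict):
                problems.append(f"row {i}, message {j}: not an object")
                continue
            if msg.get("role") not in ALLOWED_ROLES:
                problems.append(f"row {i}, message {j}: unexpected role {msg.get('role')!r}")
            if not isinstance(msg.get("content"), str):
                problems.append(f"row {i}, message {j}: 'content' must be a string")
    return problems


sample = json.loads("""
[
  {
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Explain QLoRA in one sentence." },
      { "role": "assistant", "content": "QLoRA fine-tunes a 4-bit quantized model." }
    ]
  }
]
""")
print(validate_rows(sample))  # [] when every row passes
```
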
41-46: Add optimization link: Cut Cross-Entropy (memory-efficient loss)

Since the PR updates the CCE integration, surface it here to help users reduce memory use during training.
 ## Optimization Guides
@@
 - [Multi-GPU Training](https://docs.axolotl.ai/docs/multi-gpu.html)
 - [Multi-Node Training](https://docs.axolotl.ai/docs/multi-node.html)
 - [LoRA Optimizations](https://docs.axolotl.ai/docs/lora_optims.html)
+ - Cut Cross-Entropy (memory-efficient loss): see the integration README in this repo (verify link target)
+   - Example link: src/axolotl/integrations/cut_cross_entropy/README.md

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 22b7b23 and eb0f860.

📒 Files selected for processing (10)

examples/arcee/README.md (1 hunks)
examples/arcee/afm-4.5b-qlora.yaml (1 hunks)
examples/colab-notebooks/colab-axolotl-example.ipynb (1 hunks)
examples/magistral/magistral-small-fsdp-qlora.yaml (0 hunks)
examples/magistral/magistral-small-qlora.yaml (0 hunks)
examples/magistral/magistral-small-think-qlora.yaml (0 hunks)
scripts/cutcrossentropy_install.py (1 hunks)
src/axolotl/integrations/cut_cross_entropy/README.md (3 hunks)
src/axolotl/integrations/cut_cross_entropy/__init__.py (1 hunks)
src/axolotl/monkeypatch/multipack.py (1 hunks)

💤 Files with no reviewable changes (3)

examples/magistral/magistral-small-fsdp-qlora.yaml
examples/magistral/magistral-small-think-qlora.yaml
examples/magistral/magistral-small-qlora.yaml

✅ Files skipped from review due to trivial changes (3)

scripts/cutcrossentropy_install.py
src/axolotl/integrations/cut_cross_entropy/__init__.py
examples/colab-notebooks/colab-axolotl-example.ipynb

🚧 Files skipped from review as they are similar to previous changes (3)

src/axolotl/monkeypatch/multipack.py
src/axolotl/integrations/cut_cross_entropy/README.md
examples/arcee/afm-4.5b-qlora.yaml

🔇 Additional comments (1)

examples/arcee/README.md (1)

27-27: ✅ Configuration Verified

The file examples/arcee/afm-4.5b-qlora.yaml was found and successfully parsed, containing the required keys (base_model, datasets, adapter). No further changes are needed.

coderabbitai Bot reviewed Aug 7, 2025

winglian added the ready to merge label Aug 8, 2025

coderabbitai Bot reviewed Aug 8, 2025

NanoCode012 added 8 commits August 8, 2025 08:01

feat: add arcee

451fe5f

feat: add latest models supported by cce

e1a221c

feat: add arcee example config

2759ad3

chore: lint

f3c3538

fix: typo

4a26eec

feat: change to instruct

fec5230

feat: add vram usage

a2997d9

Update README.md

eb0f860

winglian force-pushed the feat/arcee branch from 22b7b23 to eb0f860 Compare August 8, 2025 12:01

coderabbitai Bot reviewed Aug 8, 2025

winglian merged commit 2974670 into main Aug 8, 2025
11 checks passed

winglian deleted the feat/arcee branch August 8, 2025 12:09

winglian removed the ready to merge label Aug 18, 2025

This was referenced Aug 25, 2025

Feat: add hunyuan v1 #3016

Merged

Feat: add seedoss #3104

Merged

coderabbitai Bot mentioned this pull request Sep 16, 2025

Feat: add qwen3-next (w packing+cce) #3150

Merged

coderabbitai Bot mentioned this pull request Sep 24, 2025

Feat: add qwen3_vl, qwen3_vl_moe, granitemoeshared, granitemoehybrid, and upgraded all cce patches #3178

Merged
