
Feat: add hunyuan v1#3016

Merged
NanoCode012 merged 13 commits into main from feat/hunyuan_v1 on Sep 10, 2025

Conversation

@NanoCode012
Collaborator

@NanoCode012 NanoCode012 commented Aug 5, 2025

Description

Requires installing the transformers PR huggingface/transformers#39606.

Includes:

  • CCE optimization for their dense models

Motivation and Context

How has this been tested?

Screenshots (if appropriate)

Types of changes

Social Handles (Optional)

Summary by CodeRabbit

  • New Features

    • Added example configuration and guidance for fine-tuning HunYuan models, including QLoRA settings and recommended inference parameters.
    • Expanded multipack support to additional model types, enabling broader training compatibility.
  • Documentation

    • New HunYuan fine-tuning README with dataset formatting examples and tuning tips.
    • Updated Devstral, Magistral, and Voxtral guides to include Cut Cross Entropy installation to reduce VRAM usage.
  • Refactor

    • Improved chat template detection to reduce unnecessary missing-template log messages.

@coderabbitai
Contributor

coderabbitai Bot commented Aug 5, 2025

📝 Walkthrough

Walkthrough

Adds HunYuan example documentation and a QLoRA config; updates multipack supported model types; tweaks tokenizer chat template logging condition; and inserts Cut Cross Entropy installation steps in multiple example READMEs.

Changes

| Cohort / File(s) | Summary |
|---|---|
| HunYuan example: `examples/hunyuan/README.md`, `examples/hunyuan/hunyuan-v1-dense-qlora.yaml` | New README for fine-tuning HunYuan with Axolotl and a QLoRA training config targeting tencent/Hunyuan-0.5B-Instruct. |
| Multipack model types update: `src/axolotl/monkeypatch/multipack.py` | Added supported model types: "granite", "granitemoe", "hunyuan_v1_dense", "hunyuan_v1_moe". |
| Tokenizer logging tweak: `src/axolotl/loaders/tokenizer.py` | Changed condition to log missing chat templates only when `tokenizer.chat_template` is None. |
| Docs: Cut Cross Entropy install step: `examples/devstral/README.md`, `examples/magistral/README.md`, `examples/voxtral/README.md` | Inserted step to install Cut Cross Entropy via `scripts/cutcrossentropy_install.py`. |
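
The tokenizer logging tweak listed above can be pictured with a minimal sketch. The helper name `warn_if_missing_chat_template` and the `SimpleNamespace` stand-in tokenizers are hypothetical and are not taken from `src/axolotl/loaders/tokenizer.py`; only the `tokenizer.chat_template is None` condition comes from the summary.

```python
import logging
from types import SimpleNamespace

LOG = logging.getLogger("chat_template_sketch")


def warn_if_missing_chat_template(tokenizer) -> bool:
    """Log only when the tokenizer has no chat template at all.

    Hypothetical helper: returns True when a message was logged.
    """
    if tokenizer.chat_template is None:
        LOG.info("No chat template found on tokenizer.")
        return True
    return False


# Stand-in objects; a real tokenizer would come from transformers.
with_template = SimpleNamespace(chat_template="{{ messages }}")
without_template = SimpleNamespace(chat_template=None)
```

With this shape, a tokenizer that already carries a template produces no log line, which is the noise reduction the summary describes.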

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

ready to merge

Suggested reviewers

  • winglian
  • djsaunde
  • SalmanMohammadi

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (2)
src/axolotl/integrations/cut_cross_entropy/__init__.py (1)

35-38: Same hard-coded SHA duplication as noted in README

This string repeats the commit SHA already mentioned elsewhere. A central constant (or env var) avoids drift.

scripts/cutcrossentropy_install.py (1)

30-33: Third occurrence of the hard-coded SHA

See earlier note about consolidating the commit hash – the install script is another place that will drift.

🧹 Nitpick comments (6)
src/axolotl/integrations/cut_cross_entropy/README.md (1)

22-23: Avoid scattering hard-coded commit hashes – centralise for maintainability

The commit SHA 71c9a83 now lives in several places (README, __init__.py, install script, notebooks). Any future upgrade will require touching each file and is easy to miss. Consider introducing a single-source constant (e.g., CCE_COMMIT_SHA in a small Python module) and interpolating it wherever the command is rendered (docs can be templated when building).
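
A minimal sketch of the single-source idea, assuming a small Python module that owns the pin. The names `CCE_REPO_URL`, `CCE_COMMIT_SHA`, and `cce_install_command` are illustrative, and the rendered pip command is an assumed format, not the command Axolotl actually prints.

```python
# Hypothetical single source of truth for the pinned commit.
CCE_REPO_URL = "https://github.com/axolotl-ai-cloud/ml-cross-entropy.git"
CCE_COMMIT_SHA = "71c9a83"


def cce_install_command(sha: str = CCE_COMMIT_SHA) -> str:
    """Render an illustrative pip command pinned to one commit.

    The exact package spec used by the project is not shown in this
    review, so this format is an assumption for illustration.
    """
    return f'pip install "git+{CCE_REPO_URL}@{sha}"'


if __name__ == "__main__":
    print(cce_install_command())
```

The README, `__init__.py`, and the install script could then import or template this one value, so an upgrade touches a single line.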

examples/colab-notebooks/colab-axolotl-example.ipynb (1)

42-43: Pin Axolotl version for reproducible notebooks
axolotl[flash-attn]>=0.9.1 allows any future release (later 0.9.x and beyond) that might introduce breaking API changes and silently break the demo. For Colab notebooks meant to be copy-pasted, a strict version pin is safer.

-!pip install --no-build-isolation axolotl[flash-attn]>=0.9.1
+!pip install --no-build-isolation axolotl[flash-attn]==0.9.1
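
To make the difference between the two specifiers concrete, here is a simplified sketch that compares plain numeric versions as tuples. It is not pip's resolver and ignores pre-releases and other PEP 440 details; the candidate version numbers are hypothetical. Note that a `>=` floor admits every later release, not only later 0.9.x ones.

```python
def parse(version: str) -> tuple:
    """Turn '0.9.1' into (0, 9, 1); handles plain numeric versions only."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, op: str, bound: str) -> bool:
    """Evaluate a simplified '>=' or '==' specifier."""
    if op == ">=":
        return parse(version) >= parse(bound)
    if op == "==":
        return parse(version) == parse(bound)
    raise ValueError(f"unsupported operator: {op}")


# Hypothetical release numbers, used only to exercise the comparison.
candidates = ["0.9.1", "0.9.5", "0.10.0", "1.0.0"]
allowed_by_floor = [v for v in candidates if satisfies(v, ">=", "0.9.1")]
allowed_by_pin = [v for v in candidates if satisfies(v, "==", "0.9.1")]
```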
src/axolotl/monkeypatch/multipack.py (1)

39-42: Keep SUPPORTED_MULTIPACK_MODEL_TYPES alphabetically sorted for readability

Historically this list has been kept in alphabetical order.
Inserting the new entries after smollm3 breaks that ordering, making future merges harder to eyeball.

-    "smollm3",
-    "granite",
-    "granitemoe",
-    "hunyuan_v1_dense",
-    "hunyuan_v1_moe",
+    "granite",
+    "granitemoe",
+    "hunyuan_v1_dense",
+    "hunyuan_v1_moe",
+    "smollm3",

(Swap smollm3 to the end or re-sort the whole block.)
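
If alphabetical order is meant to be kept, a small check could catch regressions. The sketch below uses only the entries quoted in this comment (the real list is longer), and `first_out_of_order` is a hypothetical helper, not something in the repository.

```python
from typing import List, Optional

# Only the entries named in this review, in the order the PR added them.
model_types = [
    "smollm3",
    "granite",
    "granitemoe",
    "hunyuan_v1_dense",
    "hunyuan_v1_moe",
]


def first_out_of_order(names: List[str]) -> Optional[str]:
    """Return the first entry that sorts before its predecessor, if any."""
    for prev, cur in zip(names, names[1:]):
        if cur < prev:
            return cur
    return None
```

Calling `first_out_of_order(model_types)` points at the first entry that breaks the ordering, while `first_out_of_order(sorted(model_types))` returns `None`.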

examples/hunyuan/README.md (2)

34-36: Fill in the “(---) VRAM” placeholder

The README ships with an unfinished placeholder. New users will copy-paste this guide; leaving it blank reduces trust in the doc.


60-70: Optional: move inference parameter JSON into a fenced jsonc or bash block

Rendering it as plain json is fine, but jsonc allows trailing comments if you ever want to annotate the fields.

examples/hunyuan/hunyuan-v1-dense-qlora.yaml (1)

29-37: lora_target_linear: true duplicates explicit lora_target_modules

When lora_target_linear: true is set, Axolotl automatically targets all nn.Linear layers, making the explicit module list redundant (and potentially confusing if they diverge).

Consider keeping one approach:

- lora_target_linear: true
- lora_target_modules:
-   - gate_proj
-   - down_proj
-   - up_proj
-   - q_proj
-   - v_proj
-   - k_proj
-   - o_proj
+ # Either rely on the automatic linear selection…
+ lora_target_linear: true
+ # or comment the flag out and keep the explicit list.
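
The overlap could also be flagged mechanically. This is a sketch of a hypothetical lint over an already-parsed config dict; `lora_target_conflict` is not an Axolotl function, and only the two key names come from the YAML under review.

```python
from typing import Any, Dict


def lora_target_conflict(cfg: Dict[str, Any]) -> bool:
    """True when both targeting styles are set in the same config."""
    uses_flag = bool(cfg.get("lora_target_linear"))
    uses_list = bool(cfg.get("lora_target_modules"))
    return uses_flag and uses_list


# Illustrative configs: one sets both styles, the others pick one.
both = {"lora_target_linear": True, "lora_target_modules": ["q_proj", "v_proj"]}
flag_only = {"lora_target_linear": True}
list_only = {"lora_target_modules": ["q_proj", "v_proj"]}
```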
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ab49d16 and 737315b.

📒 Files selected for processing (7)
  • examples/colab-notebooks/colab-axolotl-example.ipynb (1 hunks)
  • examples/hunyuan/README.md (1 hunks)
  • examples/hunyuan/hunyuan-v1-dense-qlora.yaml (1 hunks)
  • scripts/cutcrossentropy_install.py (1 hunks)
  • src/axolotl/integrations/cut_cross_entropy/README.md (2 hunks)
  • src/axolotl/integrations/cut_cross_entropy/__init__.py (1 hunks)
  • src/axolotl/monkeypatch/multipack.py (1 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/hunyuan/README.md

[style] ~35-~35: Consider using polite language here.
Context: ...`` This config uses about (---) VRAM. Let us know how it goes. Happy finetuning! 🚀 ### ...

(INSERT_PLEASE)

🔇 Additional comments (4)
src/axolotl/integrations/cut_cross_entropy/README.md (1)

46-47: Support for HunYuan model types verified

The strings "hunyuan_v1_dense" and "hunyuan_v1_moe" are already present in the multipack patch list, so the runtime will recognize them without error.

• In src/axolotl/monkeypatch/multipack.py (lines 41–42):

    "hunyuan_v1_dense",
    "hunyuan_v1_moe",

No further updates are needed.

examples/colab-notebooks/colab-axolotl-example.ipynb (1)

43-43: Commit hash missing upstream – manual verification required
The referenced SHA 71c9a83 did not appear in git ls-remote for axolotl-ai-cloud/ml-cross-entropy. Before merging, please ensure this commit exists and is reachable—otherwise the Colab install step will fail.

• File: examples/colab-notebooks/colab-axolotl-example.ipynb @ line 43
• Verify by running:

git ls-remote https://github.com/axolotl-ai-cloud/ml-cross-entropy.git 71c9a83
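
One caveat when reading the result: `git ls-remote` lists refs and the commits they currently point to, and trailing arguments are matched against ref names. A commit that is not the tip of any branch or tag will therefore not be listed even if it exists. The sketch below parses captured output; the sample text is made up and `tips_matching` is a hypothetical helper.

```python
from typing import List


def tips_matching(ls_remote_output: str, short_sha: str) -> List[str]:
    """Return ref names whose tip commit starts with short_sha."""
    matches = []
    for line in ls_remote_output.splitlines():
        if not line.strip():
            continue
        sha, _, ref = line.partition("\t")
        if sha.startswith(short_sha):
            matches.append(ref)
    return matches


# Made-up sample in the "<sha>\t<ref>" shape that git ls-remote prints.
sample = (
    "1111111111111111111111111111111111111111\tHEAD\n"
    "1111111111111111111111111111111111111111\trefs/heads/main\n"
    "abc1234000000000000000000000000000000000\trefs/heads/example\n"
)
```

An empty result from a check like this only says that no ref tip has that prefix; it does not prove the commit is missing from the repository.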
examples/hunyuan/README.md (1)

12-12: Verify the claimed “PyTorch 2.6.0 min” requirement

PyTorch 2.6.0 does not exist at the time of writing; the current nightly is 2.3.x.
Please confirm the minimum version actually required for HunYuan + flash-attn and adjust the doc accordingly.

examples/hunyuan/hunyuan-v1-dense-qlora.yaml (1)

58-58: Confirm HunYuan kernels actually support flash_attention: true

Not all third-party model forks expose Flash-Attention kernels.
If the underlying transformers PR or the HunYuan repo lacks them, training will fail at runtime.

@github-actions
Contributor

github-actions Bot commented Aug 5, 2025

📖 Documentation Preview: https://68abf1b2b0f0dcd1d98adc4c--resonant-treacle-0fd729.netlify.app

Deployed on Netlify from commit a6cbd43

@codecov

codecov Bot commented Aug 5, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@NanoCode012 NanoCode012 marked this pull request as draft August 5, 2025 10:01
@NanoCode012 NanoCode012 marked this pull request as ready for review August 5, 2025 10:31
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
examples/hunyuan/README.md (2)

19-21: Duplicate Cut Cross Entropy install step—remove one for brevity.

Step 1 already installs CCE (lines 19-21); Step 3 repeats the same command (lines 32-34). Keep a single reference to avoid confusion.

-3. Install [Cut Cross Entropy](https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy) to reduce training VRAM usage
-
-```bash
-python scripts/cutcrossentropy_install.py | sh
-```

Also applies to: 32-34


25-28: Hard-coding a commit hash from an open PR can break downstream builds.

If the PR is rebased or force-pushed, the hash will disappear. Prefer using the GitHub “pull/39606/head” ref or merge the PR and pin a released tag.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 737315b and c20b3cc.

📒 Files selected for processing (5)
  • examples/devstral/README.md (1 hunks)
  • examples/hunyuan/README.md (1 hunks)
  • examples/magistral/README.md (1 hunks)
  • examples/voxtral/README.md (1 hunks)
  • src/axolotl/loaders/tokenizer.py (1 hunks)
✅ Files skipped from review due to trivial changes (4)
  • examples/magistral/README.md
  • src/axolotl/loaders/tokenizer.py
  • examples/devstral/README.md
  • examples/voxtral/README.md
🧰 Additional context used
🪛 LanguageTool
examples/hunyuan/README.md

[style] ~43-~43: Consider using polite language here.
Context: ...` This config uses about 4.7 GB VRAM. Let us know how it goes. Happy finetuning! 🚀 ### ...

(INSERT_PLEASE)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
  • GitHub Check: PyTest (3.11, 2.7.0)
  • GitHub Check: PyTest (3.11, 2.6.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
  • GitHub Check: pre-commit
  • GitHub Check: PyTest (3.11, 2.7.1)
  • GitHub Check: preview
  • GitHub Check: pre-commit
🔇 Additional comments (4)
examples/hunyuan/README.md (4)

3-3: Fix wording and minor grammatical issues in model description.

“opensource” → “open-source”, and the second use of “scale” is redundant.

-Tencent released a family of opensource models called HunYuan with varying parameter scales of 0.5B, 1.8B, 4B, and 7B scale for both Pre-trained and Instruct variants.
+Tencent released a family of open-source HunYuan models with parameter sizes of 0.5 B, 1.8 B, 4 B, and 7 B for both Pre-trained and Instruct variants.

7-11: Indentation renders line 9 as a code block instead of normal text.

In Markdown, four-space indentation inside a list item is interpreted as a code fence. Remove the extra spaces or add a blank line before the intro sentence so it’s not monospace.

-    Here is an example of how to install from main for pip:
+
+Here is an example of how to install from main with pip:

16-17: Pinned setuptools==75.8.0 is likely future-dated and may stall installation.

Current stable setuptools is <75; pinning an unreleased version forces pip to fall back to the source distribution, adding build time or failing outright. Consider pinning a known good release or omitting the pin.


12-13: Incorrect warning about PyTorch 2.6.0

PyTorch 2.6.0 is available on PyPI, so the “Pytorch 2.6.0 min” recommendation in examples/hunyuan/README.md is valid and does not need to be changed.

Likely an incorrect or invalid review comment.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
examples/hunyuan/README.md (1)

36-38: Polish wording for professional tone

“Let us know how it goes. Happy finetuning! 🚀”

Consider adding “please” and a full stop for a more formal README style:

-Let us know how it goes. Happy finetuning! 🚀
+Please let us know how it goes. Happy finetuning! 🚀
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c20b3cc and 8b60aa0.

📒 Files selected for processing (1)
  • examples/hunyuan/README.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/hunyuan/README.md

[style] ~37-~37: Consider using polite language here.
Context: ...` This config uses about 4.7 GB VRAM. Let us know how it goes. Happy finetuning! 🚀 ### ...

(INSERT_PLEASE)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
  • GitHub Check: PyTest (3.11, 2.6.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
  • GitHub Check: PyTest (3.11, 2.7.0)
  • GitHub Check: PyTest (3.11, 2.7.1)
🔇 Additional comments (1)
examples/hunyuan/README.md (1)

12-18: Hard-pinning build-tool versions & unreleased PyTorch 2.6 may hurt reproducibility

Pinning packaging==23.2, setuptools==75.8.0, wheel, and requiring “Pytorch 2.6.0 min” forces users onto very specific (and in PyTorch’s case currently unreleased) versions.
This is likely unnecessary for most users, complicates downstream dependency resolution, and breaks on mirrors where those wheels are not yet available.

Consider loosening to supported major ranges (e.g. setuptools>=65) and reference the latest stable PyTorch release that is known to work instead of an unreleased 2.6 tag.

Do any Axolotl features truly depend on these exact versions? If not, please relax them; if yes, document the rationale.

Comment thread examples/hunyuan/README.md Outdated
Comment on lines +23 to +29
2. Please install HunYuan's [transformers PR](https://github.com/huggingface/transformers/pull/39606)

```bash
pip3 uninstall transformers
pip3 install git+https://github.com/huggingface/transformers@refs/pull/39606/head
```

Contributor


🛠️ Refactor suggestion

Install step pulls a moving PR HEAD – pin a commit for deterministic builds

pip install git+https://github.com/huggingface/transformers@refs/pull/39606/head
tracks the tip of the PR branch, so every re-run may yield a different binary and silently change behaviour (or break).

Pin to a specific commit SHA or wait until the PR is merged and released. Add a short note on when to update.

Example:

pip3 install git+https://github.com/huggingface/transformers@<commit_sha>
🤖 Prompt for AI Agents
In examples/hunyuan/README.md around lines 23 to 29, the installation command
uses a moving PR HEAD reference which can cause non-deterministic builds.
Replace the current pip install URL that points to the PR head with a URL pinned
to a specific commit SHA of that PR. Add a note advising to update the commit
SHA when the PR is merged or updated to ensure reproducible installs.
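
The difference between the two install targets can be sketched as follows. `transformers_requirement` is a hypothetical helper, and `<commit_sha>` is left as a placeholder because the validated SHA is not given in full here.

```python
TRANSFORMERS_REPO = "https://github.com/huggingface/transformers"


def transformers_requirement(ref: str) -> str:
    """Build a pip VCS requirement string for a git ref or commit SHA."""
    return f"git+{TRANSFORMERS_REPO}@{ref}"


# Follows the PR tip, so the installed code can change between runs.
moving = transformers_requirement("refs/pull/39606/head")

# Placeholder only: substitute a commit SHA you have validated.
pinned = transformers_requirement("<commit_sha>")
```

Either string can be passed to `pip3 install`; only the pinned form resolves to the same code on every run.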

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2c3c9e3 and 8ca931f.

📒 Files selected for processing (1)
  • examples/hunyuan/README.md (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: when a method has a todo comment indicating it's a temporary override from upstream (like "todo(wing...
Learnt from: winglian
PR: axolotl-ai-cloud/axolotl#3019
File: src/axolotl/core/trainers/base.py:610-669
Timestamp: 2025-08-07T01:12:27.272Z
Learning: When a method has a TODO comment indicating it's a temporary override from upstream (like "TODO(wing): remove once https://github.com/huggingface/transformers/pull/39866/files is merged"), extensive refactoring suggestions may not be worthwhile since the code will be removed once the upstream changes are available.

Applied to files:

  • examples/hunyuan/README.md
🪛 LanguageTool
examples/hunyuan/README.md

[style] ~37-~37: Consider using polite language here.
Context: ...` This config uses about 4.7 GB VRAM. Let us know how it goes. Happy finetuning! 🚀 ### ...

(INSERT_PLEASE)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
  • GitHub Check: PyTest (3.11, 2.6.0)
  • GitHub Check: pre-commit
  • GitHub Check: PyTest (3.11, 2.7.0)
  • GitHub Check: PyTest (3.11, 2.7.1)
  • GitHub Check: pre-commit
  • GitHub Check: preview
🔇 Additional comments (2)
examples/hunyuan/README.md (2)

12-12: PyTorch 2.6.0 minimum requirement is speculative – verify and reference an available release
At the time of writing, the latest stable PyTorch version is < 2.6. Mandating “2.6.0 min” will break the install script for anybody on current releases.

-# Ensure you have Pytorch installed (Pytorch 2.6.0 min)
+# Ensure you have PyTorch ≥ 2.3 installed

25-28: Great – transformers PR is now pinned to a commit for deterministic builds
The switch from tracking refs/pull/39606/head to a specific SHA (06b8c13…) removes nondeterminism and aligns with prior review feedback.

Comment on lines +16 to +17
pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
pip3 install --no-build-isolation -e '.[flash-attn]'
Contributor


⚠️ Potential issue

Pinning setuptools 75.8.0 is likely impossible – confirm the version exists
setuptools==75.8.0 does not (yet) exist on PyPI. The hard pin will cause resolution errors. Either:

  1. Drop the pin entirely, or
  2. Replace with the latest published version (e.g. 70.* today).
-pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
+pip3 install packaging==23.2 setuptools wheel ninja
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
-pip3 install --no-build-isolation -e '.[flash-attn]'
+pip3 install packaging==23.2 setuptools wheel ninja
+pip3 install --no-build-isolation -e '.[flash-attn]'
🤖 Prompt for AI Agents
In examples/hunyuan/README.md around lines 16 to 17, the setuptools version is
pinned to 75.8.0, which does not exist on PyPI and will cause installation
errors. Fix this by either removing the setuptools version pin entirely or
replacing it with a valid, currently published version such as 70.* to ensure
successful package installation.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
examples/hunyuan/README.md (1)

16-17: Hard-pinning setuptools to 75.8.0 breaks installs (version likely non-existent on PyPI)

This pin is very likely invalid and will cause resolution failures. Drop the hard pin or loosen it.

Apply this diff:

-pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
+pip3 install packaging==23.2 setuptools wheel ninja
🧹 Nitpick comments (8)
examples/hunyuan/README.md (8)

12-12: Confirm the PyTorch minimum version and qualify GPU/CUDA requirements

“PyTorch 2.6.0 min” may be inaccurate or overly strict. Flash-Attn and CCE have specific CUDA/compute capability constraints that typically matter more than a single PyTorch floor. Recommend qualifying this statement or linking to a compatibility matrix.

Proposed tweak:

-# Ensure you have Pytorch installed (Pytorch 2.6.0 min)
+# Ensure you have a recent PyTorch/CUDA stack compatible with FlashAttention and your GPU.
+# See Axolotl + FlashAttention compatibility notes for supported versions.

If you want, I can draft a small compatibility table and link targets to make this future-proof.


7-9: Clarify installation channel (“main” vs “nightly”) and expected availability window

“Install from main as HunYuan is only on nightly” reads contradictory. Tighten the wording so users know exactly which channel to use until a release is cut.

Suggested edit:

-1. Install Axolotl following the installation guide. You need to install from main as HunYuan is only on nightly or use our latest Docker images.
+1. Install Axolotl following the installation guide. Until a stable release includes HunYuan, use the latest main branch (or the corresponding nightly Docker image).

After merge/tag, update this to reference the first release version that includes HunYuan.


29-31: Qualify the VRAM estimate with config and hardware assumptions

“About 4.7 GB VRAM” is helpful but ambiguous. Tie it to the provided config (model, seq length, adapters, batch/accumulation) and note hardware variability.

Proposed edit:

-This config uses about 4.7 GB VRAM.
+On the provided QLoRA config (4-bit, seq_len=2048, bs=2, grad_accum=4) with the 0.5B Instruct model, expect ~4.7 GB VRAM (varies by GPU/driver/CUDA).

Confirm the exact settings in the YAML so the estimate stays accurate.


33-37: Call out the required chat template and masking explicitly

Since correct masking/chat formatting is critical, add a one-liner stating that the HunYuan chat template must be active (or how Axolotl infers it), with a pointer to the tokenizer logs/users’ action if no template is set.

Proposed addition right above the code block:

+Ensure the HunYuan chat template is active (Axolotl will use the model’s tokenizer chat_template if present). If no template is detected, set an appropriate template or update the tokenizer before training.

55-67: Consider adding max_new_tokens to the inference preset

A length setting is commonly needed for predictable generations.

Diff:

 {
   "do_sample": true,
   "top_k": 20,
   "top_p": 0.8,
   "repetition_penalty": 1.05,
-  "temperature": 0.7
+  "temperature": 0.7,
+  "max_new_tokens": 512
 }
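
As a sketch of how the preset and the suggested length cap might be combined in practice: the values below are copied from the README preset quoted in the diff, and 512 is only the reviewer's example value. The variable names are illustrative; the keys are intended to match the usual sampling arguments of a `generate()` call in transformers.

```python
import json

# Values copied from the README preset quoted in the diff.
preset = json.loads(
    """
    {
      "do_sample": true,
      "top_k": 20,
      "top_p": 0.8,
      "repetition_penalty": 1.05,
      "temperature": 0.7
    }
    """
)

# The suggested addition; 512 is the reviewer's example, not a tuned value.
generation_kwargs = {**preset, "max_new_tokens": 512}
```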

69-71: Minor style/grammar nits in bullets

Tighten phrasing and align with docs style.

Diff:

-- You can run a full finetuning by removing the `adapter: qlora` and `load_in_4bit: true` from the config.
-- Read more on how to load your own dataset at [docs](https://docs.axolotl.ai/docs/dataset_loading.html).
+- You can run full fine-tuning by removing `adapter: qlora` and `load_in_4bit: true` from the config.
+- Read more about loading your own dataset in the [docs](https://docs.axolotl.ai/docs/dataset_loading.html).

75-77: Link labels: use consistent casing and hyphenation

Nit: “LoRA Optimizations” vs “Multi-Node/ Multi-GPU” headings — keep capitalization consistent.

Diff:

-- [LoRA Optimizations](https://docs.axolotl.ai/docs/lora_optims.html)
+- [LoRA optimizations](https://docs.axolotl.ai/docs/lora_optims.html)

(Optional: also normalize “Multi-GPU”/“Multi-Node” capitalization across the list.)


23-27: Add a quick pointer to the config file and model it targets

A direct link helps readers jump to the exact config and understand which model it’s sized for.

Diff:

 ```bash
-axolotl train examples/hunyuan/hunyuan-v1-dense-qlora.yaml
+axolotl train examples/hunyuan/hunyuan-v1-dense-qlora.yaml  # targets tencent/Hunyuan-0.5B-Instruct
 ```

Consider adding a markdown link to the YAML path if this README is viewed on GitHub.

📜 Review details

Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 8ca931f1099f68495be9f5216792bc9eedc2f06a and a6cbd433716fd5931239d711aa602765df65e68b.

📒 Files selected for processing (6)
  • examples/devstral/README.md (1 hunks)
  • examples/hunyuan/README.md (1 hunks)
  • examples/magistral/README.md (1 hunks)
  • examples/voxtral/README.md (1 hunks)
  • src/axolotl/loaders/tokenizer.py (1 hunks)
  • src/axolotl/monkeypatch/multipack.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • examples/voxtral/README.md
🚧 Files skipped from review as they are similar to previous changes (4)
  • examples/magistral/README.md
  • src/axolotl/monkeypatch/multipack.py
  • examples/devstral/README.md
  • src/axolotl/loaders/tokenizer.py
🧰 Additional context used
🪛 LanguageTool
examples/hunyuan/README.md

[style] ~30-~30: Consider using polite language here.
Context: ...`  This config uses about 4.7 GB VRAM.  Let us know how it goes. Happy finetuning! 🚀  ### ...

(INSERT_PLEASE)

[grammar] ~31-~31: There might be a mistake here.
Context: ...s know how it goes. Happy finetuning! 🚀  ### Dataset  HunYuan Instruct models can choo...

(QB_NEW_EN)

[grammar] ~69-~69: There might be a mistake here.
Context: ...nd `load_in_4bit: true` from the config. - Read more on how to load your own datase...

(QB_NEW_EN)

[grammar] ~75-~75: There might be a mistake here.
Context: ...timization Guides  - [Multi-GPU Training](https://docs.axolotl.ai/docs/multi-gpu.html) - [Multi-Node Training](https://docs.axolot...

(QB_NEW_EN)

[grammar] ~76-~76: There might be a mistake here.
Context: ...s/multi-gpu.html) - [Multi-Node Training](https://docs.axolotl.ai/docs/multi-node.html) - [LoRA Optimizations](https://docs.axolotl...

(QB_NEW_EN)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: preview

Comment on lines +11 to +21
```bash
# Ensure you have Pytorch installed (Pytorch 2.6.0 min)
git clone https://github.com/axolotl-ai-cloud/axolotl.git
cd axolotl

pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
pip3 install --no-build-isolation -e '.[flash-attn]'

# Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy
python scripts/cutcrossentropy_install.py | sh
```
Contributor


🛠️ Refactor suggestion

⚠️ Potential issue

Missing Transformers PR install step required for HunYuan – add it and pin to a commit SHA

The PR description states HunYuan support requires Hugging Face transformers PR 39606. The README doesn’t instruct users to install it. Add an explicit, commit-pinned install to avoid non-deterministic builds and breakages.

Apply this diff (replace <commit_sha> with a known good commit from PR 39606):

 ```bash
 # Ensure you have Pytorch installed (Pytorch 2.6.0 min)
 git clone https://github.com/axolotl-ai-cloud/axolotl.git
 cd axolotl

-pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
+pip3 install packaging==23.2 setuptools wheel ninja
 pip3 install --no-build-isolation -e '.[flash-attn]'
 
+# Install Transformers PR 39606 (pin to a specific commit for reproducibility)
+# NOTE: replace <commit_sha> with the exact SHA you have validated with this guide.
+pip3 install --no-deps --upgrade \
+  git+https://github.com/huggingface/transformers@<commit_sha>
+
 # Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy
 python scripts/cutcrossentropy_install.py | sh
 ```

Also consider adding a short note explaining how/when to update the pinned SHA after upstream changes.

<details>
<summary>🤖 Prompt for AI Agents</summary>

In examples/hunyuan/README.md around lines 11-21, add an explicit install step
that pins the required Hugging Face transformers PR commit: insert a pip3
install --no-deps --upgrade
git+https://github.com/huggingface/transformers@<commit_sha> (replace
<commit_sha> with the validated commit SHA from PR 39606) after the existing
pip3 installs and before running the cutcrossentropy_install script; also update
the packaging/setuptools line per the suggested diff (remove the hard-set
setuptools version) and add one short note telling users how to update the
pinned SHA when upstream changes are validated.


</details>
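
For illustration, a minimal sketch of how the pinned install command could be generated and sanity-checked before it goes into the README. `pinned_install_args` is a hypothetical helper, not part of Axolotl or Transformers, and the all-zero SHA is a placeholder rather than a real commit from PR 39606. It rejects branch names and short hashes so the pin cannot silently drift:

```python
import re

# Hypothetical helper for illustration; not part of Axolotl or Transformers.
TRANSFORMERS_REPO = "https://github.com/huggingface/transformers"

# A full Git commit SHA-1 is 40 hexadecimal characters.
_FULL_SHA = re.compile(r"[0-9a-f]{40}")


def pinned_install_args(commit_sha: str) -> list[str]:
    """Return pip arguments that install Transformers at one exact commit."""
    sha = commit_sha.strip().lower()
    if not _FULL_SHA.fullmatch(sha):
        # Branch names ("main") and abbreviated hashes are not reproducible pins.
        raise ValueError(f"expected a full 40-character commit SHA, got {commit_sha!r}")
    return [
        "pip3", "install", "--no-deps", "--upgrade",
        f"git+{TRANSFORMERS_REPO}@{sha}",
    ]


if __name__ == "__main__":
    # Placeholder SHA only; substitute the commit you have actually validated.
    print(" ".join(pinned_install_args("0" * 40)))
```

Requiring the full SHA is a design choice in this sketch: a branch or tag name would install whatever the ref points to at install time, which is the non-determinism this comment is trying to avoid.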


Comment on lines +35 to +51
HunYuan Instruct models can choose to enter a slow think or fast think pattern. For best performance on fine-tuning their Instruct models, your dataset should be adjusted to match their pattern.

```python
# fast think pattern
messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "/no_think What color is the sun?" },
{"role": "assistant", "content": "<think>\n\n</think>\n<answer>\nThe sun is yellow.\n</answer>"}
]

# slow think pattern
messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "/no_think What color is the sun?" },
{"role": "assistant", "content": "<think>\nThe user is asking about the color of the sun. I need to ...\n</think>\n<answer>\nThe sun is yellow.\n</answer>"}
]
```

💡 Verification agent

🧩 Analysis chain

“Fast think” vs “Slow think” examples look inconsistent with the /no_think directive

Both examples show the user sending “/no_think”, which typically indicates “don’t think.” For the slow-think example, drop “/no_think”. For the fast-think example, consider omitting the `<think>` block entirely or keeping it empty by design; confirm the expected chat template behavior before choosing.

Apply this diff:

```diff
 # fast think pattern
 messages = [
     {"role": "system", "content": "You are a helpful assistant."},
     {"role": "user", "content": "/no_think What color is the sun?" },
-    {"role": "assistant", "content": "<think>\n\n</think>\n<answer>\nThe sun is yellow.\n</answer>"}
+    {"role": "assistant", "content": "<answer>\nThe sun is yellow.\n</answer>"}
 ]

 # slow think pattern
 messages = [
     {"role": "system", "content": "You are a helpful assistant."},
-    {"role": "user", "content": "/no_think What color is the sun?" },
+    {"role": "user", "content": "What color is the sun?" },
     {"role": "assistant", "content": "<think>\nThe user is asking about the color of the sun. I need to ...\n</think>\n<answer>\nThe sun is yellow.\n</answer>"}
 ]
```

If the tokenizer/chat template requires always emitting `<think>` tags (even when empty) for “fast think,” note that explicitly in the text above the snippet.
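
To make the two assistant formats concrete, here is a minimal sketch of a formatter that produces either pattern. `assistant_content` and its `emit_empty_think` flag are hypothetical names for illustration; whether the empty `<think>` block is required depends on the HunYuan chat template and should be confirmed against it:

```python
def assistant_content(answer: str, reasoning: str = "", emit_empty_think: bool = True) -> str:
    """Format an assistant turn in the tag layout used by the README examples.

    Hypothetical helper: `emit_empty_think` controls whether a fast-think turn
    keeps an empty <think> block or contains only the <answer> block.
    """
    answer_block = f"<answer>\n{answer}\n</answer>"
    if reasoning:
        # Slow think: reasoning goes inside <think>, followed by the answer.
        return f"<think>\n{reasoning}\n</think>\n{answer_block}"
    if emit_empty_think:
        # Fast think, keeping the empty block as in the current README example.
        return f"<think>\n\n</think>\n{answer_block}"
    # Fast think with the block omitted entirely.
    return answer_block


fast = assistant_content("The sun is yellow.")
slow = assistant_content("The sun is yellow.", reasoning="The user is asking about the color of the sun.")
```

Routing every example through one function like this keeps the tag layout identical across the dataset, so the only open question is the value of the flag.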


🏁 Script executed:

```shell
#!/bin/bash
# Locate occurrences of the /no_think directive
grep -n "/no_think" examples/hunyuan/README.md || true

echo "----- Context up through line 100 -----"
sed -n '1,100p' examples/hunyuan/README.md || true
```

Length of output: 3544


Align /no_think and <think> tags with intended reasoning modes

The /no_think prefix forces fast-thinking (skipping CoT reasoning), while omitting it (or using /think) lets the model run in slow-thinking mode (huggingface.co). In the current examples both patterns use /no_think, which contradicts the slow-think case. Please update as follows:

• File: examples/hunyuan/README.md (around lines 35–51)
• Remove /no_think from the slow-think example so the model defaults to detailed reasoning.
• For the fast-think example, either drop the <think></think> block entirely (if your chat template omits empty think tags when enable_thinking=False) or keep it empty but note that it’s optional.

Apply this diff:

```diff
 ### Dataset

 HunYuan Instruct models can choose to enter a slow think or fast think pattern. For best performance on fine-tuning their Instruct models, your dataset should be adjusted to match their pattern.

 ```python
 # fast think pattern
 messages = [
     {"role": "system", "content": "You are a helpful assistant."},
     {"role": "user",   "content": "/no_think What color is the sun?" },
-    {"role": "assistant", "content": "<think>\n\n</think>\n<answer>\nThe sun is yellow.\n</answer>"}
+    {"role": "assistant", "content": "<answer>\nThe sun is yellow.\n</answer>"}  # empty <think> is optional if your template omits empty think blocks
 ]

 # slow think pattern
 messages = [
     {"role": "system", "content": "You are a helpful assistant."},
-    {"role": "user",   "content": "/no_think What color is the sun?" },
+    {"role": "user",   "content": "What color is the sun?" },
     {"role": "assistant", "content": "<think>\nThe user is asking about the color of the sun. I need to …\n</think>\n<answer>\nThe sun is yellow.\n</answer>"}
 ]
```

If your chat template always emits <think></think> (even when thinking is disabled), please note that above the examples.
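
A mismatch like this could also be caught mechanically before training. The sketch below is a hypothetical lint, not an Axolotl feature. It assumes `/no_think` appears as a prefix of the user message and that any turn without it is expected to carry non-empty reasoning; both assumptions should be checked against the actual chat template:

```python
import re

# Captures whatever sits between the first <think> ... </think> pair.
_THINK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def check_conversation(messages: list[dict]) -> list[str]:
    """Return human-readable problems; an empty list means the turns are consistent."""
    problems = []
    last_user = None
    for i, msg in enumerate(messages):
        if msg["role"] == "user":
            last_user = msg["content"]
        elif msg["role"] == "assistant" and last_user is not None:
            match = _THINK.search(msg["content"])
            has_reasoning = bool(match and match.group(1).strip())
            # Assumption: the directive is a prefix of the user message.
            fast_requested = last_user.lstrip().startswith("/no_think")
            if fast_requested and has_reasoning:
                problems.append(f"message {i}: /no_think prompt but non-empty <think> block")
            if not fast_requested and not has_reasoning:
                problems.append(f"message {i}: no /no_think prefix but <think> is empty or missing")
    return problems


# The slow-think example as currently written in the README is flagged.
readme_slow = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "/no_think What color is the sun?"},
    {"role": "assistant", "content": "<think>\nThe user is asking about the color of the sun.\n</think>\n<answer>\nThe sun is yellow.\n</answer>"},
]
print(check_conversation(readme_slow))
```

Both an empty `<think>` block and a missing one count as "no reasoning" here, so the check stays valid whichever fast-think layout the README settles on.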

🤖 Prompt for AI Agents
examples/hunyuan/README.md around lines 35–51: the slow-think example
incorrectly uses the "/no_think" prefix (which forces fast thinking) and the
fast-think example includes an empty <think> block; remove "/no_think" from the
slow-think user message so the model will perform detailed reasoning, and for
the fast-think example remove the empty <think></think> block (or replace the
assistant content to only include the <answer>...</answer>) and add a brief note
above the examples stating that if your chat template always emits empty
<think></think> it should be considered optional/ignored.

@NanoCode012 NanoCode012 merged commit b71482c into main Sep 10, 2025
5 of 15 checks passed
@NanoCode012 NanoCode012 deleted the feat/hunyuan_v1 branch September 10, 2025 02:03