Feat: add hunyuan v1 #3016
📝 Walkthrough
Adds HunYuan example documentation and a QLoRA config; updates multipack supported model types; tweaks the tokenizer chat template logging condition; and inserts Cut Cross Entropy installation steps in multiple example READMEs.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Actionable comments posted: 0
♻️ Duplicate comments (2)
src/axolotl/integrations/cut_cross_entropy/__init__.py (1)
35-38: Same hard-coded SHA duplication as noted in README
This string repeats the commit SHA already mentioned elsewhere. A central constant (or env var) avoids drift.
scripts/cutcrossentropy_install.py (1)
30-33: Third occurrence of the hard-coded SHA
See earlier note about consolidating the commit hash; the install script is another place that will drift.
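One way to consolidate the hash, as these duplicate comments suggest, is a single constant that every caller imports. The sketch below is illustrative only: the module layout and the names `CCE_COMMIT_SHA`, `CCE_REPO_URL`, `cce_pip_target`, and `cce_install_command` are assumptions, and the exact install command Axolotl renders may differ (for example, package extras).

```python
"""Hypothetical single-source module for the Cut Cross Entropy pin."""

# Short SHA quoted in this review; the one place to edit on upgrade.
CCE_COMMIT_SHA = "71c9a83"

# Repository URL as quoted in the review's `git ls-remote` command.
CCE_REPO_URL = "https://github.com/axolotl-ai-cloud/ml-cross-entropy.git"


def cce_pip_target(sha: str = CCE_COMMIT_SHA) -> str:
    """Return a pip VCS requirement string pinned to `sha`."""
    return f"git+{CCE_REPO_URL}@{sha}"


def cce_install_command(sha: str = CCE_COMMIT_SHA) -> str:
    """Render install command text for READMEs, notebooks, and scripts."""
    return f'pip install "{cce_pip_target(sha)}"'


if __name__ == "__main__":
    print(cce_install_command())
```

With this shape, the README, `__init__.py`, and the install script would all call `cce_install_command()` instead of embedding the hash.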
🧹 Nitpick comments (6)
src/axolotl/integrations/cut_cross_entropy/README.md (1)
22-23: Avoid scattering hard-coded commit hashes; centralise for maintainability
The commit SHA `71c9a83` now lives in several places (README, `__init__.py`, install script, notebooks). Any future upgrade will require touching each file and is easy to miss. Consider introducing a single-source constant (e.g., `CCE_COMMIT_SHA` in a small Python module) and interpolating it wherever the command is rendered (docs can be templated when building).
examples/colab-notebooks/colab-axolotl-example.ipynb (1)
42-43: Pin Axolotl version for reproducible notebooks
`axolotl[flash-attn]>=0.9.1` allows future 0.9.x releases that might introduce breaking API changes and silently break the demo. For Colab notebooks meant to be copy-pasted, a strict version pin is safer.
-!pip install --no-build-isolation axolotl[flash-attn]>=0.9.1
+!pip install --no-build-isolation axolotl[flash-attn]==0.9.1
src/axolotl/monkeypatch/multipack.py (1)
39-42: Keep `SUPPORTED_MULTIPACK_MODEL_TYPES` alphabetically sorted for readability
Historically this list has been kept in alphabetical order.
Inserting the new entries after `smollm3` breaks that ordering, making future merges harder to eyeball.
-    "smollm3",
-    "granite",
-    "granitemoe",
-    "hunyuan_v1_dense",
-    "hunyuan_v1_moe",
+    "granite",
+    "granitemoe",
+    "hunyuan_v1_dense",
+    "hunyuan_v1_moe",
+    "smollm3",
(Swap `smollm3` to the end or re-sort the whole block.)
examples/hunyuan/README.md (2)
34-36: Fill in the “(---) VRAM” placeholder
The README ships with an unfinished placeholder. New users will copy-paste this guide; leaving it blank reduces trust in the doc.
60-70: Optional: move inference parameter JSON into a fenced `jsonc` or `bash` block
Rendering it as plain `json` is fine, but `jsonc` allows trailing comments if you ever want to annotate the fields.
examples/hunyuan/hunyuan-v1-dense-qlora.yaml (1)
29-37: `lora_target_linear: true` duplicates explicit `lora_target_modules`
When `lora_target_linear: true` is set, Axolotl automatically targets all `nn.Linear` layers, making the explicit module list redundant (and potentially confusing if they diverge).
Consider keeping one approach:
- lora_target_linear: true
- lora_target_modules:
-   - gate_proj
-   - down_proj
-   - up_proj
-   - q_proj
-   - v_proj
-   - k_proj
-   - o_proj
+ # Either rely on the automatic linear selection…
+ lora_target_linear: true
+ # or comment the flag out and keep the explicit list.
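The overlap flagged in the QLoRA config comment above could also be caught mechanically. The following is a minimal sketch that assumes the config has already been loaded into a plain dict; `lora_target_overlap_warning` is a hypothetical helper, not an Axolotl function, and whether Axolotl treats the two keys as redundant is the reviewer's claim rather than something verified here.

```python
from typing import Any, Mapping, Optional


def lora_target_overlap_warning(cfg: Mapping[str, Any]) -> Optional[str]:
    """Return a warning string when both targeting styles are set, else None."""
    uses_linear_flag = bool(cfg.get("lora_target_linear"))
    explicit_modules = cfg.get("lora_target_modules") or []
    if uses_linear_flag and explicit_modules:
        names = ", ".join(explicit_modules)
        return (
            "lora_target_linear is true and lora_target_modules is also set "
            f"({names}); consider keeping only one."
        )
    return None


if __name__ == "__main__":
    # Keys mirror the ones quoted in the review comment.
    cfg = {
        "lora_target_linear": True,
        "lora_target_modules": ["gate_proj", "down_proj", "up_proj"],
    }
    print(lora_target_overlap_warning(cfg))
```

Returning a message instead of raising keeps the check advisory, which matches the nitpick severity of the comment.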
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- examples/colab-notebooks/colab-axolotl-example.ipynb (1 hunks)
- examples/hunyuan/README.md (1 hunks)
- examples/hunyuan/hunyuan-v1-dense-qlora.yaml (1 hunks)
- scripts/cutcrossentropy_install.py (1 hunks)
- src/axolotl/integrations/cut_cross_entropy/README.md (2 hunks)
- src/axolotl/integrations/cut_cross_entropy/__init__.py (1 hunks)
- src/axolotl/monkeypatch/multipack.py (1 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/hunyuan/README.md
[style] ~35-~35: Consider using polite language here.
Context: ...`` This config uses about (---) VRAM. Let us know how it goes. Happy finetuning! 🚀 ### ...
(INSERT_PLEASE)
🔇 Additional comments (4)
src/axolotl/integrations/cut_cross_entropy/README.md (1)
46-47: Support for HunYuan model types verified
The strings `"hunyuan_v1_dense"` and `"hunyuan_v1_moe"` are already present in the multipack patch list, so the runtime will recognize them without error.
• In src/axolotl/monkeypatch/multipack.py (lines 41–42):
"hunyuan_v1_dense", "hunyuan_v1_moe",
No further updates are needed.
examples/colab-notebooks/colab-axolotl-example.ipynb (1)
43-43: Commit hash missing upstream; manual verification required
The referenced SHA `71c9a83` did not appear in `git ls-remote` for `axolotl-ai-cloud/ml-cross-entropy`. Before merging, please ensure this commit exists and is reachable; otherwise the Colab install step will fail.
• File: examples/colab-notebooks/colab-axolotl-example.ipynb @ line 43
• Verify by running: git ls-remote https://github.com/axolotl-ai-cloud/ml-cross-entropy.git 71c9a83
examples/hunyuan/README.md (1)
12-12: Verify the claimed “PyTorch 2.6.0 min” requirement
PyTorch 2.6.0 does not exist at the time of writing; the current nightly is 2.3.x.
Please confirm the minimum version actually required for HunYuan + flash-attn and adjust the doc accordingly.
examples/hunyuan/hunyuan-v1-dense-qlora.yaml (1)
58-58: Confirm HunYuan kernels actually support `flash_attention: true`
Not all third-party model forks expose Flash-Attention kernels.
If the underlying `transformers` PR or the HunYuan repo lacks them, training will fail at runtime.
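A caveat on the commit check suggested for the Colab notebook above: `git ls-remote <repo> <pattern>` filters by ref name, not by object ID, and it only lists ref tips. A sketch of an alternative is to fetch the unfiltered listing and scan the object-ID column. `refs_pointing_at` is a hypothetical helper and the sample object IDs below are made up; an empty result only shows that no branch or tag tip is at that commit, not that the commit is absent from history.

```python
from typing import List


def refs_pointing_at(ls_remote_output: str, sha_prefix: str) -> List[str]:
    """Return ref names whose object ID starts with `sha_prefix`."""
    prefix = sha_prefix.lower()
    matches = []
    for line in ls_remote_output.splitlines():
        parts = line.split(None, 1)  # "<object id><TAB><ref name>"
        if len(parts) != 2:
            continue
        object_id, ref_name = parts
        if object_id.lower().startswith(prefix):
            matches.append(ref_name.strip())
    return matches


if __name__ == "__main__":
    # Made-up 40-character object IDs standing in for real `git ls-remote` output.
    sample = (
        "71c9a83" + "a" * 33 + "\trefs/heads/main\n"
        + "0123456" + "b" * 33 + "\trefs/tags/v0.1\n"
    )
    print(refs_pointing_at(sample, "71c9a83"))
```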
📖 Documentation Preview: https://68abf1b2b0f0dcd1d98adc4c--resonant-treacle-0fd729.netlify.app
Deployed on Netlify from commit a6cbd43
Codecov Report
✅ All modified and coverable lines are covered by tests.
Actionable comments posted: 0
🧹 Nitpick comments (2)
examples/hunyuan/README.md (2)
19-21: Duplicate Cut Cross Entropy install step; remove one for brevity.
Step 1 already installs CCE (lines 19-21); Step 3 repeats the same command (lines 32-34). Keep a single reference to avoid confusion.
-3. Install [Cut Cross Entropy](https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy) to reduce training VRAM usage
-
-```bash
-python scripts/cutcrossentropy_install.py | sh
-```
Also applies to: 32-34
25-28: Hard-coding a commit hash from an open PR can break downstream builds.
If the PR is rebased or force-pushed, the hash will disappear. Prefer using the GitHub “pull/39606/head” ref or merge the PR and pin a released tag.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- examples/devstral/README.md (1 hunks)
- examples/hunyuan/README.md (1 hunks)
- examples/magistral/README.md (1 hunks)
- examples/voxtral/README.md (1 hunks)
- src/axolotl/loaders/tokenizer.py (1 hunks)
✅ Files skipped from review due to trivial changes (4)
- examples/magistral/README.md
- src/axolotl/loaders/tokenizer.py
- examples/devstral/README.md
- examples/voxtral/README.md
🧰 Additional context used
🪛 LanguageTool
examples/hunyuan/README.md
[style] ~43-~43: Consider using polite language here.
Context: ...` This config uses about 4.7 GB VRAM. Let us know how it goes. Happy finetuning! 🚀 ### ...
(INSERT_PLEASE)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
- GitHub Check: PyTest (3.11, 2.7.0)
- GitHub Check: PyTest (3.11, 2.6.0)
- GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
- GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
- GitHub Check: pre-commit
- GitHub Check: PyTest (3.11, 2.7.1)
- GitHub Check: preview
- GitHub Check: pre-commit
🔇 Additional comments (4)
examples/hunyuan/README.md (4)
3-3: Fix wording and minor grammatical issues in model description.
“opensource” → “open-source”, and the second use of “scale” is redundant.
-Tencent released a family of opensource models called HunYuan with varying parameter scales of 0.5B, 1.8B, 4B, and 7B scale for both Pre-trained and Instruct variants.
+Tencent released a family of open-source HunYuan models with parameter sizes of 0.5 B, 1.8 B, 4 B, and 7 B for both Pre-trained and Instruct variants.
7-11: Indentation renders line 9 as a code block instead of normal text.
In Markdown, four-space indentation inside a list item is interpreted as a code fence. Remove the extra spaces or add a blank line before the intro sentence so it’s not monospace.
-    Here is an example of how to install from main for pip:
+
+Here is an example of how to install from main with pip:
16-17: Pinned `setuptools==75.8.0` is likely future-dated and may stall installation.
Current stable `setuptools` is <75; pinning an unreleased version forces `pip` to fall back to the source distribution, adding build time or failing outright. Consider pinning a known good release or omitting the pin.
12-13: Incorrect warning about PyTorch 2.6.0
PyTorch 2.6.0 is available on PyPI, so the “Pytorch 2.6.0 min” recommendation in `examples/hunyuan/README.md` is valid and does not need to be changed.
Likely an incorrect or invalid review comment.
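Since the review went back and forth on the “Pytorch 2.6.0 min” line, a small runtime check may be clearer than prose. The sketch below is one possible approach, not part of Axolotl: `release_tuple` and `meets_minimum` are hypothetical helpers, the default floor of 2.6.0 is taken from the README line under review, and pre-release or local suffixes (such as `+cu124`) are deliberately ignored.

```python
import re
from typing import Tuple


def release_tuple(version: str) -> Tuple[int, ...]:
    """Extract the leading numeric release, e.g. '2.6.0+cu124' -> (2, 6, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"no numeric release found in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str = "2.6.0") -> bool:
    """True when the installed release is at least `minimum`."""
    have, need = release_tuple(installed), release_tuple(minimum)
    width = max(len(have), len(need))
    # Pad with zeros so '2.6' compares equal to '2.6.0'.
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


if __name__ == "__main__":
    # The versions exercised by the CI checks listed in this review.
    for candidate in ("2.6.0", "2.7.0", "2.7.1"):
        print(candidate, meets_minimum(candidate))
```

In practice the `installed` argument would be `torch.__version__`; it is passed in as a string here so the sketch runs without PyTorch installed.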
Actionable comments posted: 1
🧹 Nitpick comments (1)
examples/hunyuan/README.md (1)
36-38: Polish wording for professional tone
“Let us know how it goes. Happy finetuning! 🚀”
Consider adding “please” and a full stop for a more formal README style:
-Let us know how it goes. Happy finetuning! 🚀
+Please let us know how it goes. Happy finetuning! 🚀
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
examples/hunyuan/README.md(1 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/hunyuan/README.md
[style] ~37-~37: Consider using polite language here.
Context: ...` This config uses about 4.7 GB VRAM. Let us know how it goes. Happy finetuning! 🚀 ### ...
(INSERT_PLEASE)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
- GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
- GitHub Check: PyTest (3.11, 2.6.0)
- GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
- GitHub Check: PyTest (3.11, 2.7.0)
- GitHub Check: PyTest (3.11, 2.7.1)
🔇 Additional comments (1)
examples/hunyuan/README.md (1)
12-18: Hard-pinning build-tool versions & unreleased PyTorch 2.6 may hurt reproducibility
Pinning `packaging==23.2`, `setuptools==75.8.0`, `wheel`, and requiring “Pytorch 2.6.0 min” forces users onto very specific (and in PyTorch’s case currently unreleased) versions.
This is likely unnecessary for most users, complicates downstream dependency resolution, and breaks on mirrors where those wheels are not yet available.
Consider loosening to supported major ranges (e.g. `setuptools>=65`) and reference the latest stable PyTorch release that is known to work instead of an unreleased 2.6 tag.
Do any Axolotl features truly depend on these exact versions? If not, please relax them; if yes, document the rationale.
2. Please install HunYuan's [transformers PR](https://github.com/huggingface/transformers/pull/39606)

```bash
pip3 uninstall transformers
pip3 install git+https://github.com/huggingface/transformers@refs/pull/39606/head
```
🛠️ Refactor suggestion
Install step pulls a moving PR HEAD; pin a commit for deterministic builds
`pip install git+https://github.com/huggingface/transformers@refs/pull/39606/head` tracks the tip of the PR branch, so every re-run may yield a different binary and silently change behaviour (or break).
Pin to a specific commit SHA or wait until the PR is merged and released. Add a short note on when to update.
Example:
pip3 install git+https://github.com/huggingface/transformers@<commit_sha>
🤖 Prompt for AI Agents
In examples/hunyuan/README.md around lines 23 to 29, the installation command
uses a moving PR HEAD reference which can cause non-deterministic builds.
Replace the current pip install URL that points to the PR head with a URL pinned
to a specific commit SHA of that PR. Add a note advising to update the commit
SHA when the PR is merged or updated to ensure reproducible installs.
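The distinction this comment draws, a moving ref versus a pinned commit, can be checked automatically, for instance in a docs lint step. The sketch below is a heuristic under stated assumptions: `vcs_ref` and `is_commit_pinned` are hypothetical helpers, the SHA `0123abc` in the example is made up, and a branch or tag whose name happens to be 7 to 40 hex characters would be misclassified as pinned.

```python
import re

# A full or abbreviated hex object ID (7 to 40 characters).
_SHA_RE = re.compile(r"[0-9a-f]{7,40}")


def vcs_ref(requirement: str) -> str:
    """Return the part after the last '@' in a pip VCS URL, or '' if absent."""
    _, sep, ref = requirement.rpartition("@")
    return ref if sep else ""


def is_commit_pinned(requirement: str) -> bool:
    """True when the ref looks like a commit SHA rather than a branch or PR head."""
    return _SHA_RE.fullmatch(vcs_ref(requirement).lower()) is not None


if __name__ == "__main__":
    base = "git+https://github.com/huggingface/transformers"
    print(is_commit_pinned(base + "@refs/pull/39606/head"))  # moving PR head
    print(is_commit_pinned(base + "@0123abc"))  # made-up abbreviated SHA
```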
Actionable comments posted: 1
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
examples/hunyuan/README.md(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: when a method has a todo comment indicating it's a temporary override from upstream (like "todo(wing...
Learnt from: winglian
PR: axolotl-ai-cloud/axolotl#3019
File: src/axolotl/core/trainers/base.py:610-669
Timestamp: 2025-08-07T01:12:27.272Z
Learning: When a method has a TODO comment indicating it's a temporary override from upstream (like "TODO(wing): remove once https://github.com/huggingface/transformers/pull/39866/files is merged"), extensive refactoring suggestions may not be worthwhile since the code will be removed once the upstream changes are available.
Applied to files:
examples/hunyuan/README.md
🪛 LanguageTool
examples/hunyuan/README.md
[style] ~37-~37: Consider using polite language here.
Context: ...` This config uses about 4.7 GB VRAM. Let us know how it goes. Happy finetuning! 🚀 ### ...
(INSERT_PLEASE)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
- GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
- GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
- GitHub Check: PyTest (3.11, 2.6.0)
- GitHub Check: pre-commit
- GitHub Check: PyTest (3.11, 2.7.0)
- GitHub Check: PyTest (3.11, 2.7.1)
- GitHub Check: pre-commit
- GitHub Check: preview
🔇 Additional comments (2)
examples/hunyuan/README.md (2)
12-12: PyTorch 2.6.0 minimum requirement is speculative; verify and reference an available release
At the time of writing, the latest stable PyTorch version is < 2.6. Mandating “2.6.0 min” will break the install script for anybody on current releases.
-# Ensure you have Pytorch installed (Pytorch 2.6.0 min)
+# Ensure you have PyTorch ≥ 2.3 installed
25-28: Great; transformers PR is now pinned to a commit for deterministic builds
The switch from tracking `refs/pull/39606/head` to a specific SHA (`06b8c13…`) removes nondeterminism and aligns with prior review feedback.
pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
pip3 install --no-build-isolation -e '.[flash-attn]'
Pinning setuptools 75.8.0 is likely impossible; confirm the version exists
`setuptools==75.8.0` does not (yet) exist on PyPI. The hard pin will cause resolution errors. Either:
- Drop the pin entirely, or
- Replace with the latest published version (e.g. 70.* today).
-pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
+pip3 install packaging==23.2 setuptools wheel ninja
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
+pip3 install packaging==23.2 setuptools wheel ninja
 pip3 install --no-build-isolation -e '.[flash-attn]'
🤖 Prompt for AI Agents
In examples/hunyuan/README.md around lines 16 to 17, the setuptools version is
pinned to 75.8.0, which does not exist on PyPI and will cause installation
errors. Fix this by either removing the setuptools version pin entirely or
replacing it with a valid, currently published version such as 70.* to ensure
successful package installation.
Actionable comments posted: 2
♻️ Duplicate comments (1)
examples/hunyuan/README.md (1)
16-17: Hard-pinning setuptools to 75.8.0 breaks installs (version likely non-existent on PyPI)
This pin is very likely invalid and will cause resolution failures. Drop the hard pin or loosen it.
Apply this diff:
-pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
+pip3 install packaging==23.2 setuptools wheel ninja
🧹 Nitpick comments (8)
examples/hunyuan/README.md (8)
12-12: Confirm the PyTorch minimum version and qualify GPU/CUDA requirements
“PyTorch 2.6.0 min” may be inaccurate or overly strict. Flash-Attn and CCE have specific CUDA/compute capability constraints that typically matter more than a single PyTorch floor. Recommend qualifying this statement or linking to a compatibility matrix.
Proposed tweak:
-# Ensure you have Pytorch installed (Pytorch 2.6.0 min)
+# Ensure you have a recent PyTorch/CUDA stack compatible with FlashAttention and your GPU.
+# See Axolotl + FlashAttention compatibility notes for supported versions.
If you want, I can draft a small compatibility table and link targets to make this future-proof.
7-9: Clarify installation channel (“main” vs “nightly”) and expected availability window
“Install from main as HunYuan is only on nightly” reads contradictory. Tighten the wording so users know exactly which channel to use until a release is cut.
Suggested edit:
-1. Install Axolotl following the installation guide. You need to install from main as HunYuan is only on nightly or use our latest Docker images.
+1. Install Axolotl following the installation guide. Until a stable release includes HunYuan, use the latest main branch (or the corresponding nightly Docker image).
After merge/tag, update this to reference the first release version that includes HunYuan.
29-31: Qualify the VRAM estimate with config and hardware assumptions
“About 4.7 GB VRAM” is helpful but ambiguous. Tie it to the provided config (model, seq length, adapters, batch/accumulation) and note hardware variability.
Proposed edit:
-This config uses about 4.7 GB VRAM.
+On the provided QLoRA config (4-bit, seq_len=2048, bs=2, grad_accum=4) with the 0.5B Instruct model, expect ~4.7 GB VRAM (varies by GPU/driver/CUDA).
Confirm the exact settings in the YAML so the estimate stays accurate.
33-37: Call out the required chat template and masking explicitly
Since correct masking/chat formatting is critical, add a one-liner stating that the HunYuan chat template must be active (or how Axolotl infers it), with a pointer to the tokenizer logs/users’ action if no template is set.
Proposed addition right above the code block:
+Ensure the HunYuan chat template is active (Axolotl will use the model’s tokenizer chat_template if present). If no template is detected, set an appropriate template or update the tokenizer before training.
55-67: Consider adding max_new_tokens to the inference preset
A length setting is commonly needed for predictable generations.
Diff:
 {
   "do_sample": true,
   "top_k": 20,
   "top_p": 0.8,
   "repetition_penalty": 1.05,
-  "temperature": 0.7
+  "temperature": 0.7,
+  "max_new_tokens": 512
 }
69-71: Minor style/grammar nits in bullets
Tighten phrasing and align with docs style.
Diff:
-- You can run a full finetuning by removing the `adapter: qlora` and `load_in_4bit: true` from the config.
-- Read more on how to load your own dataset at [docs](https://docs.axolotl.ai/docs/dataset_loading.html).
+- You can run full fine-tuning by removing `adapter: qlora` and `load_in_4bit: true` from the config.
+- Read more about loading your own dataset in the [docs](https://docs.axolotl.ai/docs/dataset_loading.html).
75-77: Link labels: use consistent casing and hyphenation
Nit: “LoRA Optimizations” vs “Multi-Node/ Multi-GPU” headings; keep capitalization consistent.
Diff:
-- [LoRA Optimizations](https://docs.axolotl.ai/docs/lora_optims.html)
+- [LoRA optimizations](https://docs.axolotl.ai/docs/lora_optims.html)
(Optional: also normalize “Multi-GPU”/“Multi-Node” capitalization across the list.)
23-27: Add a quick pointer to the config file and model it targets
A direct link helps readers jump to the exact config and understand which model it’s sized for.
Diff:
-axolotl train examples/hunyuan/hunyuan-v1-dense-qlora.yaml
+axolotl train examples/hunyuan/hunyuan-v1-dense-qlora.yaml # targets tencent/Hunyuan-0.5B-Instruct
Consider adding a markdown link to the YAML path if this README is viewed on GitHub.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📥 Commits
Reviewing files that changed from the base of the PR and between 8ca931f1099f68495be9f5216792bc9eedc2f06a and a6cbd433716fd5931239d711aa602765df65e68b.
📒 Files selected for processing (6)
- examples/devstral/README.md (1 hunks)
- examples/hunyuan/README.md (1 hunks)
- examples/magistral/README.md (1 hunks)
- examples/voxtral/README.md (1 hunks)
- src/axolotl/loaders/tokenizer.py (1 hunks)
- src/axolotl/monkeypatch/multipack.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- examples/voxtral/README.md
🚧 Files skipped from review as they are similar to previous changes (4)
- examples/magistral/README.md
- src/axolotl/monkeypatch/multipack.py
- examples/devstral/README.md
- src/axolotl/loaders/tokenizer.py
🧰 Additional context used
🪛 LanguageTool
examples/hunyuan/README.md
[style] ~30-~30: Consider using polite language here.
Context: ...` This config uses about 4.7 GB VRAM. Let us know how it goes. Happy finetuning! 🚀 ### ...
(INSERT_PLEASE)
[grammar] ~31-~31: There might be a mistake here.
Context: ...s know how it goes. Happy finetuning! 🚀 ### Dataset HunYuan Instruct models can choo...
(QB_NEW_EN)
[grammar] ~69-~69: There might be a mistake here.
Context: ...nd `load_in_4bit: true` from the config. - Read more on how to load your own datase...
(QB_NEW_EN)
[grammar] ~75-~75: There might be a mistake here.
Context: ...timization Guides - [Multi-GPU Training](https://docs.axolotl.ai/docs/multi-gpu.html) - [Multi-Node Training](https://docs.axolot...
(QB_NEW_EN)
[grammar] ~76-~76: There might be a mistake here.
Context: ...s/multi-gpu.html) - [Multi-Node Training](https://docs.axolotl.ai/docs/multi-node.html) - [LoRA Optimizations](https://docs.axolotl...
(QB_NEW_EN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: preview
```bash
# Ensure you have Pytorch installed (Pytorch 2.6.0 min)
git clone https://github.com/axolotl-ai-cloud/axolotl.git
cd axolotl

pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
pip3 install --no-build-isolation -e '.[flash-attn]'

# Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy
python scripts/cutcrossentropy_install.py | sh
```
🛠️ Refactor suggestion
Missing Transformers PR install step required for HunYuan – add it and pin to a commit SHA
The PR description states HunYuan support requires Hugging Face transformers PR 39606. The README doesn’t instruct users to install it. Add an explicit, commit-pinned install to avoid non-deterministic builds and breakages.
Apply this diff (replace <commit_sha> with a known good commit from PR 39606):
```bash
# Ensure you have Pytorch installed (Pytorch 2.6.0 min)
git clone https://github.com/axolotl-ai-cloud/axolotl.git
cd axolotl
-pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
+pip3 install packaging==23.2 setuptools wheel ninja
pip3 install --no-build-isolation -e '.[flash-attn]'
+# Install Transformers PR 39606 (pin to a specific commit for reproducibility)
+# NOTE: replace <commit_sha> with the exact SHA you have validated with this guide.
+pip3 install --no-deps --upgrade \
+ git+https://github.com/huggingface/transformers@<commit_sha>
+
# Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy
python scripts/cutcrossentropy_install.py | sh
```
Also consider adding a short note explaining how/when to update the pinned SHA after upstream changes.
🤖 Prompt for AI Agents
In examples/hunyuan/README.md around lines 11-21, add an explicit install step
that pins the required Hugging Face transformers PR commit: insert a pip3
install --no-deps --upgrade
git+https://github.com/huggingface/transformers@<commit_sha> (replace
<commit_sha> with the validated commit SHA from PR 39606) after the existing
pip3 installs and before running the cutcrossentropy_install script; also update
the packaging/setuptools line per the suggested diff (remove the hard-set
setuptools version) and add one short note telling users how to update the
pinned SHA when upstream changes are validated.
HunYuan Instruct models can choose to enter a slow think or fast think pattern. For best performance on fine-tuning their Instruct models, your dataset should be adjusted to match their pattern.

```python
# fast think pattern
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "/no_think What color is the sun?" },
    {"role": "assistant", "content": "<think>\n\n</think>\n<answer>\nThe sun is yellow.\n</answer>"}
]

# slow think pattern
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "/no_think What color is the sun?" },
    {"role": "assistant", "content": "<think>\nThe user is asking about the color of the sun. I need to ...\n</think>\n<answer>\nThe sun is yellow.\n</answer>"}
]
```
💡 Verification agent
🧩 Analysis chain
“Fast think” vs “Slow think” examples look inconsistent with the /no_think directive
Both examples show the user sending “/no_think”, which typically indicates “don’t think.” For a slow-think example, drop “/no_think” (and for fast-think, consider omitting the <think> block entirely or keeping it empty by design; confirm expected chat template behavior).
Apply this diff:
 # fast think pattern
 messages = [
     {"role": "system", "content": "You are a helpful assistant."},
     {"role": "user", "content": "/no_think What color is the sun?" },
-    {"role": "assistant", "content": "<think>\n\n</think>\n<answer>\nThe sun is yellow.\n</answer>"}
+    {"role": "assistant", "content": "<answer>\nThe sun is yellow.\n</answer>"}
 ]

 # slow think pattern
 messages = [
     {"role": "system", "content": "You are a helpful assistant."},
-    {"role": "user", "content": "/no_think What color is the sun?" },
+    {"role": "user", "content": "What color is the sun?" },
     {"role": "assistant", "content": "<think>\nThe user is asking about the color of the sun. I need to ...\n</think>\n<answer>\nThe sun is yellow.\n</answer>"}
 ]
If the tokenizer/chat template requires always emitting <think> tags (even when empty) for “fast think,” note that explicitly in the text above the snippet.
🏁 Script executed:
#!/bin/bash
# Locate occurrences of the /no_think directive
grep -n "/no_think" examples/hunyuan/README.md || true
echo "----- Context up through line 100 -----"
sed -n '1,100p' examples/hunyuan/README.md || true
Length of output: 3544
Align /no_think and <think> tags with intended reasoning modes
The /no_think prefix forces fast-thinking (skipping CoT reasoning), while omitting it (or using /think) lets the model run in slow-thinking mode (huggingface.co). In the current examples both patterns use /no_think, which contradicts the slow-think case. Please update as follows:
• File: examples/hunyuan/README.md (around lines 35–51)
• Remove /no_think from the slow-think example so the model defaults to detailed reasoning.
• For the fast-think example, either drop the <think></think> block entirely (if your chat template omits empty think tags when enable_thinking=False) or keep it empty but note that it’s optional.
Apply this diff:
### Dataset
HunYuan Instruct models can choose to enter a slow think or fast think pattern. For best performance on fine-tuning their Instruct models, your dataset should be adjusted to match their pattern.
```python
# fast think pattern
messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "/no_think What color is the sun?" },
- {"role": "assistant", "content": "<think>\n\n</think>\n<answer>\nThe sun is yellow.\n</answer>"}
+ {"role": "assistant", "content": "<answer>\nThe sun is yellow.\n</answer>"} # empty <think> is optional if your template omits empty think blocks
]
# slow think pattern
messages = [
{"role": "system", "content": "You are a helpful assistant."},
- {"role": "user", "content": "/no_think What color is the sun?" },
+ {"role": "user", "content": "What color is the sun?" },
{"role": "assistant", "content": "<think>\nThe user is asking about the color of the sun. I need to …\n</think>\n<answer>\nThe sun is yellow.\n</answer>"}
]
```
If your chat template always emits <think></think> (even when thinking is disabled), please note that above the examples.
🤖 Prompt for AI Agents
examples/hunyuan/README.md around lines 35–51: the slow-think example
incorrectly uses the "/no_think" prefix (which forces fast thinking) and the
fast-think example includes an empty <think> block; remove "/no_think" from the
slow-think user message so the model will perform detailed reasoning, and for
the fast-think example remove the empty <think></think> block (or replace the
assistant content to only include the <answer>...</answer>) and add a brief note
above the examples stating that if your chat template always emits empty
<think></think> it should be considered optional/ignored.
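To keep dataset rows consistent with whichever convention is settled on, the two patterns discussed in this thread could be generated by one helper. This is a sketch only: `build_hunyuan_example` and its `emit_empty_think` flag are hypothetical, the tag layout is copied from the README snippet under review, and whether the empty `<think>` block should be kept for fast think is exactly the open question raised above, so it is left as a parameter rather than decided here.

```python
from typing import Dict, List, Optional

Message = Dict[str, str]


def build_hunyuan_example(
    question: str,
    answer: str,
    reasoning: Optional[str] = None,
    system: str = "You are a helpful assistant.",
    emit_empty_think: bool = True,
) -> List[Message]:
    """Build one chat sample in fast-think (no reasoning) or slow-think form."""
    answer_block = f"<answer>\n{answer}\n</answer>"
    if reasoning is None:
        # Fast think: prefix the user turn with /no_think.
        user = f"/no_think {question}"
        think_block = "<think>\n\n</think>\n" if emit_empty_think else ""
    else:
        # Slow think: no /no_think prefix; reasoning goes inside <think>.
        user = question
        think_block = f"<think>\n{reasoning}\n</think>\n"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
        {"role": "assistant", "content": think_block + answer_block},
    ]


if __name__ == "__main__":
    fast = build_hunyuan_example("What color is the sun?", "The sun is yellow.")
    slow = build_hunyuan_example(
        "What color is the sun?",
        "The sun is yellow.",
        reasoning="The user is asking about the color of the sun.",
    )
    print(fast[1]["content"])
    print(slow[1]["content"])
```

Deriving the `/no_think` prefix from the presence of reasoning makes it impossible to produce the mismatch the reviewer flagged, where a slow-think sample carried the fast-think directive.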
Description
Requires installing the transformers PR huggingface/transformers#39606.
Includes: