Skip to content

Feat: Add voxtral, magistral small 1.1, and misc gemma3n fixes#2979

Merged
NanoCode012 merged 34 commits into
mainfrom
feat/voxtral
Jul 30, 2025
Merged

Feat: Add voxtral, magistral small 1.1, and misc gemma3n fixes#2979
NanoCode012 merged 34 commits into
mainfrom
feat/voxtral

Conversation

@NanoCode012
Copy link
Copy Markdown
Collaborator

@NanoCode012 NanoCode012 commented Jul 24, 2025

Description

  • Added Voxtral support (both text & text+audio)
  • Added [THINK] tag support for Magistral Small 1.1 Training
  • Misc gemma3n doc fixes
  • Update CCE commit to include Voxtral
  • Refactored MistralTokenizer to subclass upstream
  • Removes Gemma3 loss patch (as is merged upstream)

Requires:

pip3 install git+https://github.com/huggingface/transformers.git@967045082faaaaf3d653bfe665080fd746b2bb60

See README in each model dir for more info

Motivation and Context

How has this been tested?

Screenshots (if appropriate)

Types of changes

Social Handles (Optional)

Summary by CodeRabbit

  • New Features

    • Added support for fine-tuning and processing the Voxtral model, including audio integration and chat template handling.
    • Introduced new configuration files and comprehensive documentation for Voxtral and Magistral models, with step-by-step guides and tips.
  • Bug Fixes

    • Improved handling of None values in processed examples to ensure cleaner data structures.
  • Documentation

    • Expanded and updated READMEs for Gemma-3n, Magistral, and Voxtral, providing detailed instructions, usage tips, and resource links.
    • Updated installation instructions and commit hashes for dependencies.
  • Refactor

    • Replaced and reorganized the Mistral tokenizer wrapper for improved compatibility and maintainability.
    • Updated processing and patching strategies to streamline model support and remove deprecated patches.
  • Tests

    • Adjusted test imports and temporarily skipped select tests pending updates for the new tokenizer wrapper.
  • Chores

    • Updated configuration files to include new plugins and reflect latest model and dataset options.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jul 24, 2025

📝 Walkthrough

Walkthrough

This change introduces full support for the Voxtral model and the mistral-common tokenizer in Axolotl, including new configuration files, processing strategies, and a HuggingFace-compatible tokenizer wrapper. It removes legacy Gemma3 patching, updates all cut-cross-entropy installation references to a new commit, and reorganizes tokenizer utilities for clarity and maintainability.

Changes

Cohort / Files Change Summary
Voxtral Model Support & Processing
src/axolotl/loaders/constants.py
src/axolotl/monkeypatch/models/voxtral/modeling.py
src/axolotl/processing_strategies.py
src/axolotl/loaders/patch_manager.py
Added Voxtral model mapping, monkeypatch for Voxtral forward method, a Voxtral-specific processing strategy, and patch management for Voxtral models. Removed Gemma3 patching logic.
Mistral-Common Tokenizer Refactor
src/axolotl/utils/mistral_tokenizer.py (deleted)
src/axolotl/utils/mistral/mistral_tokenizer.py (added)
src/axolotl/utils/mistral/__init__.py (added)
src/axolotl/loaders/tokenizer.py
src/axolotl/prompt_strategies/chat_template.py
tests/prompt_strategies/conftest.py
tests/prompt_strategies/test_chat_templates_mistral.py
Replaced the old tokenizer wrapper with a new HuggingFace-compatible HFMistralTokenizer in a reorganized module; updated all imports accordingly; skipped some tests pending fixes.
Cut-Cross-Entropy Integration Update
src/axolotl/integrations/cut_cross_entropy/README.md
src/axolotl/integrations/cut_cross_entropy/__init__.py
scripts/cutcrossentropy_install.py
examples/colab-notebooks/colab-axolotl-example.ipynb
Updated all installation instructions and references for the cut-cross-entropy package to use a new commit hash.
New/Updated Voxtral Example Files
examples/voxtral/README.md
examples/voxtral/voxtral-mini-qlora.yml
examples/voxtral/voxtral-mini-audio-qlora.yml
Added a Voxtral fine-tuning guide and two new configuration files for text-only and text+audio QLoRA training.
Magistral Example Updates
examples/magistral/README.md
examples/magistral/magistral-small-qlora.yaml
examples/magistral/magistral-small-fsdp-qlora.yaml
examples/magistral/magistral-small-think-qlora.yaml (added)
Expanded Magistral README, added new config for "thinking" mode, and updated configs to include the cut-cross-entropy plugin.
Gemma3 Example & Patch Removal
examples/gemma3n/README.md
examples/gemma3n/gemma-3n-e2b-vision-audio-qlora.yml
src/axolotl/monkeypatch/models/gemma3/modeling.py (deleted)
src/axolotl/loaders/patch_manager.py
Rewrote Gemma3n README, removed explicit data_files in config, and deleted the Gemma3 forward patch. Removed Gemma3 patch application from patch manager.
Utilities & Data Cleaning
src/axolotl/utils/dict.py
Added a utility function to recursively remove None values from dictionaries, used in processing strategies.
General Documentation
README.md
Added a "Latest Updates" entry about Voxtral and mistral-common tokenizer integration.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

  • Complexity: High, due to the introduction of a new tokenizer wrapper, Voxtral model integration, patching logic, new processing strategy, and broad refactoring across multiple modules and tests.

Possibly related PRs

  • axolotl-ai-cloud/axolotl#2780: Implements mistral-common tokenizer and Magistral model support, closely related to the tokenizer and processing strategy changes in this PR.
  • axolotl-ai-cloud/axolotl#2817: Modifies patching for Gemma3 and Voxtral models, directly related to the patch management updates here.
  • axolotl-ai-cloud/axolotl#2871: Updates cut-cross-entropy commit hash, overlapping with this PR's update to the same integration.

Suggested labels

ready to merge, scheduled_release

Suggested reviewers

  • NanoCode012

Note

⚡️ Unit Test Generation is now available in beta!

Learn more here, or try it out under "Finishing Touches" below.

✨ Finishing Touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/voxtral

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jul 24, 2025

📖 Documentation Preview: https://6882a732b2e6700e77e85dc7--resonant-treacle-0fd729.netlify.app

Deployed on Netlify from commit e3578e1

@NanoCode012 NanoCode012 mentioned this pull request Jul 24, 2025
5 tasks
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🧹 Nitpick comments (8)
scripts/cutcrossentropy_install.py (1)

30-33: Avoid string duplication of the pinned CCE commit.

The exact install command (with SHA) is now duplicated here, in README, and in __init__.py. A single authoritative constant reduces maintenance risk.

- + f'{UV_PREFIX}pip install "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@010c3ac3f1e725098961832830303eeb4142dd88"'
+ + f'{UV_PREFIX}pip install "cut-cross-entropy[transformers] @ '
+   f'git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@{PINNED_CCE_SHA}"'

Define PINNED_CCE_SHA once (e.g., at top of this script or in a small shared constants.py) and import where needed. This keeps the version bump to a one-liner next time.

tests/prompt_strategies/test_chat_templates_mistral.py (1)

311-311: Temporarily skipped tests need follow-up.

The tests are properly skipped with clear reasoning during the HF wrapper refactoring. Ensure these tests are re-enabled once the new wrapper API is stabilized.

Would you like me to help create an issue to track re-enabling these tests, or verify if there are existing plans to address them?

Also applies to: 754-754

examples/voxtral/README.md (1)

1-84: Comprehensive and well-structured documentation for Voxtral integration.

The README provides excellent coverage of:

  • Clear installation steps with specific version requirements
  • Both text-only and multimodal configuration examples
  • Helpful tips for inference settings and dataset formats
  • Proper documentation of limitations and future work

The documentation will effectively guide users through Voxtral model fine-tuning.

Minor style improvements suggested by static analysis:

  • Line 48: Consider adding "please" before "Let us know how it goes"
  • Line 68: Consider replacing "at the moment" with "currently" for conciseness
src/axolotl/monkeypatch/models/voxtral/modeling.py (1)

10-67: Solid monkeypatch implementation for Voxtral model forward method.

The patch properly addresses:

  • Leaf tensor issues through inputs_embeds.clone()
  • Dtype compatibility with audio_embeds.to(inputs_embeds.dtype)
  • Clean audio token replacement logic
  • Reversible patching with unpatch functionality

The implementation follows good practices for monkeypatching and handles edge cases appropriately.

Consider adding validation for audio token mask to ensure audio embeddings and tokens align:

 # replace text-audio token placeholders with audio embeddings
 audio_token_mask = input_ids == self.config.audio_token_id
+
+# Validate audio embeddings match masked positions
+if audio_token_mask.sum() != audio_embeds.shape[0]:
+    raise ValueError(f"Audio embeddings count ({audio_embeds.shape[0]}) doesn't match audio token positions ({audio_token_mask.sum()})")

 inputs_embeds = inputs_embeds.clone()
 inputs_embeds[audio_token_mask] = audio_embeds
examples/voxtral/voxtral-mini-audio-qlora.yml (2)

26-27: Remove outdated comment about Gemma3.

The comment references Gemma3 but this is a Voxtral configuration file. This appears to be copied from another config without updating the comment.

-# gemma3 doesn't seem to play nice with ddp
+# may need ddp find unused parameters for some models
 ddp_find_unused_parameters: true

70-71: Consider adding explanation for gradient checkpointing kwargs.

The use_reentrant: false setting is important for compatibility but may not be obvious to users.

 gradient_checkpointing: true
 gradient_checkpointing_kwargs:
+  # use_reentrant: false recommended for some model architectures
   use_reentrant: false
examples/magistral/magistral-small-think-qlora.yaml (1)

56-57: Consider consistency in precision settings.

This config uses bf16: auto and tf32: false while the Voxtral configs use bf16: true and tf32: true. Consider whether these differences are intentional or should be standardized.

examples/gemma3n/README.md (1)

16-17: Consider adding version compatibility note.

The installation instructions specify exact versions for packaging and setuptools. Consider adding a note about why these specific versions are required or if they're minimum requirements.

 pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
+# Note: Specific versions required for compatibility with Gemma3n
 pip3 install --no-build-isolation -e '.[flash-attn]'
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1407aac and 5b325d9.

📒 Files selected for processing (26)
  • README.md (1 hunks)
  • examples/colab-notebooks/colab-axolotl-example.ipynb (1 hunks)
  • examples/gemma3n/README.md (1 hunks)
  • examples/gemma3n/gemma-3n-e2b-vision-audio-qlora.yml (0 hunks)
  • examples/magistral/README.md (2 hunks)
  • examples/magistral/magistral-small-fsdp-qlora.yaml (1 hunks)
  • examples/magistral/magistral-small-qlora.yaml (1 hunks)
  • examples/magistral/magistral-small-think-qlora.yaml (1 hunks)
  • examples/voxtral/README.md (1 hunks)
  • examples/voxtral/voxtral-mini-audio-qlora.yml (1 hunks)
  • examples/voxtral/voxtral-mini-qlora.yml (1 hunks)
  • requirements.txt (1 hunks)
  • scripts/cutcrossentropy_install.py (1 hunks)
  • src/axolotl/integrations/cut_cross_entropy/README.md (1 hunks)
  • src/axolotl/integrations/cut_cross_entropy/__init__.py (1 hunks)
  • src/axolotl/loaders/constants.py (1 hunks)
  • src/axolotl/loaders/patch_manager.py (2 hunks)
  • src/axolotl/loaders/tokenizer.py (1 hunks)
  • src/axolotl/monkeypatch/models/gemma3/modeling.py (0 hunks)
  • src/axolotl/monkeypatch/models/voxtral/modeling.py (1 hunks)
  • src/axolotl/processing_strategies.py (5 hunks)
  • src/axolotl/prompt_strategies/chat_template.py (2 hunks)
  • src/axolotl/utils/mistral/mistral_tokenizer.py (1 hunks)
  • src/axolotl/utils/mistral_tokenizer.py (0 hunks)
  • tests/prompt_strategies/conftest.py (1 hunks)
  • tests/prompt_strategies/test_chat_templates_mistral.py (3 hunks)
💤 Files with no reviewable changes (3)
  • examples/gemma3n/gemma-3n-e2b-vision-audio-qlora.yml
  • src/axolotl/monkeypatch/models/gemma3/modeling.py
  • src/axolotl/utils/mistral_tokenizer.py
🧰 Additional context used
🧠 Learnings (1)
src/axolotl/processing_strategies.py (1)

Learnt from: greenhestu
PR: #2909
File: src/axolotl/prompt_strategies/chat_template.py:381-386
Timestamp: 2025-07-12T13:14:29.261Z
Learning: In src/axolotl/prompt_strategies/chat_template.py, the _remove_none_values function uses hasattr(obj, "items") instead of isinstance(obj, dict) for dictionary detection to support duck typing. This allows the function to work with any object that implements a dictionary-like interface, including custom classes, OrderedDict, defaultdict, etc.

🧬 Code Graph Analysis (5)
tests/prompt_strategies/conftest.py (2)
src/axolotl/utils/mistral/mistral_tokenizer.py (2)
  • HFMistralTokenizer (14-218)
  • from_pretrained (111-218)
tests/test_datasets.py (1)
  • tokenizer (31-35)
src/axolotl/loaders/tokenizer.py (1)
src/axolotl/utils/mistral/mistral_tokenizer.py (1)
  • HFMistralTokenizer (14-218)
src/axolotl/prompt_strategies/chat_template.py (1)
src/axolotl/utils/mistral/mistral_tokenizer.py (1)
  • HFMistralTokenizer (14-218)
src/axolotl/processing_strategies.py (1)
src/axolotl/utils/mistral/mistral_tokenizer.py (1)
  • chat_template (41-43)
src/axolotl/utils/mistral/mistral_tokenizer.py (1)
src/axolotl/utils/mistral_tokenizer.py (1)
  • HFMistralTokenizer (31-627)
🪛 LanguageTool
examples/voxtral/README.md

[style] ~48-~48: Consider using polite language here.
Context: ... These configs use about 4.8 GB VRAM. Let us know how it goes. Happy finetuning! 🚀 ### ...

(INSERT_PLEASE)


[style] ~68-~68: For conciseness, consider replacing this expression with an adverb.
Context: ...ntokenizer for Supervised Fine-tuning at the moment and fortype: chat_template` only. In...

(AT_THE_MOMENT)

examples/gemma3n/README.md

[style] ~41-~41: Consider using polite language here.
Context: ...emma-3n-e2b-vision-audio-qlora.yml ``` Let us know how it goes. Happy finetuning! 🚀 WARN...

(INSERT_PLEASE)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
  • GitHub Check: PyTest (3.11, 2.6.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
  • GitHub Check: PyTest (3.11, 2.7.1)
  • GitHub Check: PyTest (3.11, 2.7.0)
  • GitHub Check: test-axolotl-multigpu (126, 12.6.3, 3.11, 2.7.1, 2, true)
  • GitHub Check: test-axolotl-multigpu (126, 12.6.3, 3.11, 2.6.0, 2, true)
  • GitHub Check: test-axolotl-multigpu (126, 12.6.3, 3.11, 2.7.0, vllm, 2, true)
  • GitHub Check: preview
🔇 Additional comments (29)
README.md (1)

28-28: README entry looks good – no action needed.

The new Voxtral bullet is consistent with the other update entries and links to documentation using the established pattern.

src/axolotl/integrations/cut_cross_entropy/README.md (1)

22-23: Pin hash change: verify that the referenced commit is still reachable.

Hard-pinning to an exact commit is great for reproducibility, but GitHub may prune force-pushed SHAs or make them private. Consider:

  1. Verifying that 010c3ac3f1e725098961832830303eeb4142dd88 is present in the public repo and not from a fork.
  2. Mirroring/tagging the commit in your own org to avoid future 404s.

Would you like an automated script that checks the SHA’s availability and fetches the commit date?

examples/magistral/magistral-small-qlora.yaml (1)

9-11: Plugin order & duplication check.

Good to see CCE enabled. Make sure:

  1. No other plugins: blocks exist later in the file (YAML will silently override earlier keys).
  2. The plugin appears before any strategy that relies on it (order sometimes matters in Axolotl).

If both are satisfied, nothing else to change.

requirements.txt (1)

71-71: Confirm downstream compatibility of mistral-common[audio]==1.8.2

We attempted to list its extra dependencies but couldn’t retrieve metadata locally:

python - <<'PY'
import importlib.metadata as md, json
print(json.dumps(md.requires('mistral-common==1.8.2'), indent=2))
PY

This raised PackageNotFoundError. Please run the above snippet in an environment where mistral-common[audio]==1.8.2 is installed (or first pip install mistral-common[audio]==1.8.2) to verify which heavy binaries (e.g., torchaudio, ffmpeg wheels) are pulled in. If you identify extras you don’t need, consider removing [audio] in non-multimodal contexts.

• File: requirements.txt
• Line: 71

examples/magistral/magistral-small-fsdp-qlora.yaml (1)

9-11: CutCrossEntropyPlugin must be present in the runtime environment

Good to see the plugin explicitly enabled. Just make sure the training environment (Docker image, Colab, CI, etc.) actually installs the cut-cross-entropy wheel/commit that provides axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin, otherwise the run will fail at launch time with an ImportError.

If you rely on scripts/cutcrossentropy_install.py to do this, double-check that the script is executed before axolotl train … and that the commit hash matches the version expected by the code in src/axolotl/integrations/cut_cross_entropy/__init__.py.

src/axolotl/integrations/cut_cross_entropy/__init__.py (1)

37-37: LGTM: Version update for cut_cross_entropy fork.

The commit hash update aligns with the broader changes in this PR and maintains consistency across the codebase.

tests/prompt_strategies/conftest.py (1)

161-161: LGTM: Import path updates for reorganized tokenizer module.

The import path changes correctly reflect the reorganization of the HFMistralTokenizer from axolotl.utils.mistral_tokenizer to axolotl.utils.mistral.mistral_tokenizer. The fixture functionality remains unchanged.

Also applies to: 169-169, 177-177

src/axolotl/prompt_strategies/chat_template.py (2)

21-21: LGTM: Import path update for reorganized tokenizer module.

The import path change correctly reflects the reorganization of the HFMistralTokenizer module.


505-509: LGTM: Added validation for train_detail with non-string content.

The validation correctly enforces the restriction that train_detail cannot be used when content is not a string, which aligns with the documented limitations for the new tokenizer capabilities.

src/axolotl/loaders/constants.py (1)

25-31: LGTM: Conditional import pattern for Voxtral model support.

The try/except block correctly handles the optional import of VoxtralForConditionalGeneration, adding it to the model mapping only when available (transformers > 4.53.2). This defensive pattern ensures backward compatibility.

src/axolotl/loaders/tokenizer.py (1)

127-132: LGTM: Updated tokenizer loading to use reorganized HFMistralTokenizer.

The changes correctly update the import path and patching logic to use the new HFMistralTokenizer from axolotl.utils.mistral.mistral_tokenizer. The patching approach maintains compatibility with the transformers library while enabling the enhanced tokenizer functionality.

examples/magistral/README.md (3)

3-3: Good addition of multiple model version options.

Providing links to both the 2506 and 2507 versions gives users more flexibility in model selection.


34-55: Excellent documentation for the new think mode feature.

The section provides clear guidance with:

  • Comprehensive JSON example showing the expected format
  • Reference to example config file
  • Well-documented limitations to help users avoid common pitfalls

This will help users successfully implement the thinking capabilities.


58-58: Helpful tip for optimal model performance.

Recommending the use of the tuned SystemPrompt with a clear file reference will help users achieve better results.

tests/prompt_strategies/test_chat_templates_mistral.py (1)

11-11: Correct import path update.

The import path correction aligns with the tokenizer module restructuring to axolotl.utils.mistral.mistral_tokenizer.

src/axolotl/loaders/patch_manager.py (2)

69-69: Proper integration of Voxtral patches into the patch sequence.

The method call is correctly placed in the pre-model load patch sequence alongside other model-specific patches.


272-279: Clean implementation following established patterns.

The _apply_voxtral_patches method:

  • Uses conditional model type checking
  • Employs local imports to avoid unnecessary dependencies
  • Follows the same structure and naming convention as other patch methods
  • Maintains consistency with the existing codebase architecture
examples/magistral/magistral-small-think-qlora.yaml (2)

33-41: LGTM: LoRA target modules configuration.

The LoRA target modules are properly configured to target all the key projection layers for effective fine-tuning.


16-16: Confirmed correct dataset identifier

I verified that “Nanobit/text-think-2k-test” exists on Hugging Face and that “NanoBit/text-think-2k-test” does not. The current dataset path is correct—no changes needed.

examples/voxtral/voxtral-mini-qlora.yml (1)

24-30: LGTM: Dataset configuration with proper field mappings.

The dataset configuration correctly maps the SlimOrcaDedupCleaned format to the expected chat template format with proper role and content field mappings.

examples/gemma3n/README.md (2)

42-44: LGTM: Important warning about model behavior.

The warning about unusually high loss and gradient norms is valuable information for users and appropriately prominent.


48-51: LGTM: Helpful tips section with good documentation links.

The tips provide practical guidance for users, including links to relevant documentation for different use cases.

src/axolotl/processing_strategies.py (3)

111-124: LGTM: Well-implemented helper function with proper duck typing.

The _remove_none_values function correctly uses duck typing with hasattr(obj, "items") to detect dictionary-like objects, which aligns with the retrieved learning about supporting custom dictionary classes. The recursive implementation properly handles nested structures.


401-408: LGTM: Proper label masking for audio tokens.

The process_labels method correctly masks padding tokens and audio-specific tokens by setting them to -100, which is the standard approach for ignoring tokens in loss computation.


441-444: LGTM: Proper factory function integration.

The factory function correctly checks for VoxtralProcessor instance and returns the appropriate strategy. The placement after the chat_template_type checks ensures it acts as a fallback for Voxtral processors.

src/axolotl/utils/mistral/mistral_tokenizer.py (4)

1-18: LGTM!

The module structure and imports are well-organized and appropriate for the tokenizer wrapper implementation.


36-73: Well-implemented mode management with appropriate error handling.

The _set_mode method properly validates the presence of private APIs before attempting to use them, providing clear error messages if they're missing. The dummy chat template property is appropriately documented.


94-109: LGTM!

The decode method correctly handles single integer token IDs by converting them to a list before processing.


110-219: Well-implemented factory method with comprehensive documentation.

The from_pretrained method properly handles both local and remote tokenizer loading, validates input parameters, and provides clear error messages for unsupported arguments.

Comment thread examples/colab-notebooks/colab-axolotl-example.ipynb
Comment on lines +32 to +35
# sample dataset below requires downloading audio/image in advance
# wget https://huggingface.co/datasets/Nanobit/text-audio-2k-test/resolve/main/En-us-African_elephant.oga
datasets:
- path: NanoBit/text-audio-2k-test
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Verify dataset path and audio file requirements.

The dataset path "NanoBit/text-audio-2k-test" appears to have inconsistent casing with the wget command which references "Nanobit". Additionally, requiring manual audio file downloads could be problematic for users.

Please verify the correct dataset identifier and consider documenting the audio download requirements more clearly or providing an automated solution.


🏁 Script executed:

#!/bin/bash
# Verify the correct dataset identifier on HuggingFace
curl -s "https://huggingface.co/api/datasets/NanoBit/text-audio-2k-test" | jq '.id' 2>/dev/null || echo "Dataset not found with this casing"
curl -s "https://huggingface.co/api/datasets/Nanobit/text-audio-2k-test" | jq '.id' 2>/dev/null || echo "Dataset not found with this casing"

Length of output: 271


Update dataset path casing and document audio download

The dataset identifier on Hugging Face uses “Nanobit” (lowercase b), but the example config uses “NanoBit”. Please update the path and add clearer download instructions or an automated step:

– File: examples/voxtral/voxtral-mini-audio-qlora.yml
– Lines: 32

Proposed diff:

- datasets:
-   - path: NanoBit/text-audio-2k-test
+ datasets:
+   - path: Nanobit/text-audio-2k-test

Additionally, consider adding a pre_download section or a brief note above the dataset entry:

# Ensure the audio file is downloaded before running:
# wget https://huggingface.co/datasets/Nanobit/text-audio-2k-test/resolve/main/En-us-African_elephant.oga

or embed a simple script snippet to automate this step.

🤖 Prompt for AI Agents
In examples/voxtral/voxtral-mini-audio-qlora.yml around lines 32 to 35, update
the dataset path to use the correct casing "Nanobit" instead of "NanoBit".
Additionally, add a comment above the dataset entry with clear instructions to
download the required audio file using wget, or include a pre_download step or
script snippet to automate the audio file download before running the example.

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules: 'language_model.model.layers.[\d]+.(mlp|self_attn).(up|down|gate|q|k|v|o)_proj'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Verify LoRA target modules regex pattern.

The regex pattern for targeting LoRA modules is more complex than the explicit list used in the Magistral config. Ensure this pattern correctly matches the intended layers.


🏁 Script executed:

#!/bin/bash
# Search for similar LoRA target module patterns in the codebase to verify consistency
rg -A 2 -B 2 "lora_target_modules.*language_model" --type yaml

Length of output: 4780


Align LoRA target‐modules regex prefix in Voxtral examples

The regex in the Voxtral mini configs is using language_model.model.layers… while every other example uses model.language_model.layers…. Please update the two files to match the common pattern and ensure it still reflects the actual module path.

Affected files:

  • examples/voxtral/voxtral-mini-qlora.yml
  • examples/voxtral/voxtral-mini-audio-qlora.yml

Suggested diff:

-lora_target_modules: 'language_model.model.layers.[\d]+.(mlp|self_attn).(up|down|gate|q|k|v|o)_proj'
+lora_target_modules: 'model.language_model.layers.[\d]+.(mlp|self_attn).(up|down|gate|q|k|v|o)_proj'

After updating, verify that the module attribute path (model.language_model.layers…) matches your model’s actual structure.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
lora_target_modules: 'language_model.model.layers.[\d]+.(mlp|self_attn).(up|down|gate|q|k|v|o)_proj'
lora_target_modules: 'model.language_model.layers.[\d]+.(mlp|self_attn).(up|down|gate|q|k|v|o)_proj'
🤖 Prompt for AI Agents
In examples/voxtral/voxtral-mini-qlora.yml at line 39, update the
lora_target_modules regex prefix from 'language_model.model.layers' to
'model.language_model.layers' to align with other Voxtral examples. Confirm this
new prefix matches the actual module path in your model. Repeat the same update
in examples/voxtral/voxtral-mini-audio-qlora.yml to maintain consistency across
both files.

flash_attention: true

warmup_ratio: 0.1
evals_per_epoch:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix missing value for evals_per_epoch.

The evals_per_epoch field is present but has no value assigned, which could cause configuration parsing issues.

-evals_per_epoch:
+evals_per_epoch: 1
🤖 Prompt for AI Agents
In examples/voxtral/voxtral-mini-qlora.yml at line 70, the field evals_per_epoch
is declared but missing a value, which can cause configuration parsing errors.
Assign an appropriate integer value to evals_per_epoch based on how many
evaluations should run per epoch to fix the issue.

Comment thread src/axolotl/processing_strategies.py
Comment on lines +394 to +387
special_ids = (
processor.tokenizer.tokenizer.instruct_tokenizer.audio_encoder.special_ids
)

self.audio_token = special_ids.audio
self.begin_audio_token = special_ids.begin_audio

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Verify audio token extraction logic.

The code accesses audio tokens through a deeply nested path: processor.tokenizer.tokenizer.instruct_tokenizer.audio_encoder.special_ids. This brittle access pattern could break if the internal structure changes.


🏁 Script executed:

#!/bin/bash
# Search for similar audio token access patterns in the codebase
rg -A 3 -B 3 "audio_encoder\.special_ids" --type py
rg -A 3 -B 3 "instruct_tokenizer" --type py

Length of output: 1137


Refactor audio‐token lookup into a stable API

Directly digging through
processor.tokenizer.tokenizer.instruct_tokenizer.audio_encoder.special_ids
to pull out .audio and .begin_audio is brittle and prone to break if the tokenizer’s internals are refactored. Instead, encapsulate that lookup behind a well-defined method or property on your processor (or tokenizer) class and use it here.

• File: src/axolotl/processing_strategies.py (lines 394–400)
• Suggestion: in your processor class (e.g. src/axolotl/processor.py), add something like:

@property
def audio_special_ids(self):
    return (
        self.tokenizer
            .tokenizer
            .instruct_tokenizer
            .audio_encoder
            .special_ids
    )

• Then update this block to:

special_ids = processor.audio_special_ids
self.audio_token       = special_ids.audio
self.begin_audio_token = special_ids.begin_audio

Encapsulating the path in one place makes future refactors safer and the intent clearer.

🤖 Prompt for AI Agents
In src/axolotl/processing_strategies.py around lines 394 to 400, the code
directly accesses nested tokenizer internals to get audio special IDs, which is
fragile. To fix this, add a property method named audio_special_ids in the
processor class (e.g., in src/axolotl/processor.py) that returns
processor.tokenizer.tokenizer.instruct_tokenizer.audio_encoder.special_ids. Then
update the code here to use processor.audio_special_ids instead of the long
nested access, assigning self.audio_token and self.begin_audio_token from that
property. This encapsulates the lookup and improves maintainability.

Comment thread src/axolotl/utils/mistral/mistral_tokenizer.py
Comment thread src/axolotl/utils/mistral/mistral_tokenizer.py Outdated
Copy link
Copy Markdown
Collaborator

@djsaunde djsaunde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nicely done!

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice README!

Comment thread examples/magistral/README.md Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an aside, I wonder if these example-level READMEs should be propagated to our docs somewhere.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, we should have a "Model Guides" section which pulls the README for any existing models

Comment thread requirements.txt Outdated
Comment thread tests/prompt_strategies/conftest.py Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/axolotl/utils/dict.py (1)

41-51: Well-implemented utility function with good duck typing approach.

The implementation correctly uses duck typing with hasattr(obj, "items") which aligns with established patterns in the codebase and supports any dictionary-like interface. The recursive approach properly handles nested structures and the documentation clearly explains the purpose.

Consider these optional enhancements for improved robustness:

-def remove_none_values(obj):
+def remove_none_values(obj) -> any:
     """
     Remove null from a dictionary-like obj or list.
     These can appear due to Dataset loading causing schema merge.
     See https://github.com/axolotl-ai-cloud/axolotl/pull/2909
+    
+    Args:
+        obj: Dictionary-like object, list, or primitive value to process
+        
+    Returns:
+        Processed object with None values removed from nested structures
     """
     if hasattr(obj, "items"):
         return {k: remove_none_values(v) for k, v in obj.items() if v is not None}
     if isinstance(obj, list):
         return [remove_none_values(elem) for elem in obj]
+    if isinstance(obj, tuple):
+        return tuple(remove_none_values(elem) for elem in obj)
     return obj

The additions include type hints, enhanced documentation, and tuple support for completeness.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5b325d9 and 4060592.

📒 Files selected for processing (3)
  • src/axolotl/processing_strategies.py (4 hunks)
  • src/axolotl/prompt_strategies/chat_template.py (3 hunks)
  • src/axolotl/utils/dict.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/axolotl/prompt_strategies/chat_template.py
  • src/axolotl/processing_strategies.py
🧰 Additional context used
🧠 Learnings (1)
src/axolotl/utils/dict.py (1)

Learnt from: greenhestu
PR: #2909
File: src/axolotl/prompt_strategies/chat_template.py:381-386
Timestamp: 2025-07-12T13:14:29.261Z
Learning: In src/axolotl/prompt_strategies/chat_template.py, the _remove_none_values function uses hasattr(obj, "items") instead of isinstance(obj, dict) for dictionary detection to support duck typing. This allows the function to work with any object that implements a dictionary-like interface, including custom classes, OrderedDict, defaultdict, etc.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: test-axolotl-multigpu (126, 12.6.3, 3.11, 2.7.0, vllm, 2, true)
  • GitHub Check: test-axolotl-multigpu (126, 12.6.3, 3.11, 2.6.0, 2, true)
  • GitHub Check: test-axolotl-multigpu (126, 12.6.3, 3.11, 2.7.1, 2, true)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
  • GitHub Check: preview
  • GitHub Check: PyTest (3.11, 2.7.1)
  • GitHub Check: PyTest (3.11, 2.7.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
  • GitHub Check: PyTest (3.11, 2.6.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.6.0)

@NanoCode012
Copy link
Copy Markdown
Collaborator Author

Test failing with 'transformers.tokenization_mistral_common' which is expected as we wait for a transformers release

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
examples/voxtral/README.md (2)

49-49: Consider more polite language.

-Let us know how it goes. Happy finetuning! 🚀
+Please let us know how it goes. Happy finetuning! 🚀

68-68: Consider more concise language.

-We only support the `mistral-common` tokenizer for Supervised Fine-tuning at the moment and for `type: chat_template` only.
+We currently support only the `mistral-common` tokenizer for Supervised Fine-tuning and for `type: chat_template` only.
examples/gemma3n/README.md (1)

1-44: Excellent documentation improvements!

The comprehensive guide with installation steps, examples, and the important warning about high loss/gradient norms is very helpful for users.

-Let us know how it goes. Happy finetuning! 🚀
+Please let us know how it goes. Happy finetuning! 🚀
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e3578e1 and 8f2a5ac.

📒 Files selected for processing (27)
  • README.md (1 hunks)
  • examples/colab-notebooks/colab-axolotl-example.ipynb (1 hunks)
  • examples/gemma3n/README.md (1 hunks)
  • examples/gemma3n/gemma-3n-e2b-vision-audio-qlora.yml (0 hunks)
  • examples/magistral/README.md (2 hunks)
  • examples/magistral/magistral-small-fsdp-qlora.yaml (1 hunks)
  • examples/magistral/magistral-small-qlora.yaml (1 hunks)
  • examples/magistral/magistral-small-think-qlora.yaml (1 hunks)
  • examples/voxtral/README.md (1 hunks)
  • examples/voxtral/voxtral-mini-audio-qlora.yml (1 hunks)
  • examples/voxtral/voxtral-mini-qlora.yml (1 hunks)
  • scripts/cutcrossentropy_install.py (1 hunks)
  • src/axolotl/integrations/cut_cross_entropy/README.md (1 hunks)
  • src/axolotl/integrations/cut_cross_entropy/__init__.py (1 hunks)
  • src/axolotl/loaders/constants.py (1 hunks)
  • src/axolotl/loaders/patch_manager.py (2 hunks)
  • src/axolotl/loaders/tokenizer.py (1 hunks)
  • src/axolotl/monkeypatch/models/gemma3/modeling.py (0 hunks)
  • src/axolotl/monkeypatch/models/voxtral/modeling.py (1 hunks)
  • src/axolotl/processing_strategies.py (4 hunks)
  • src/axolotl/prompt_strategies/chat_template.py (3 hunks)
  • src/axolotl/utils/dict.py (1 hunks)
  • src/axolotl/utils/mistral/__init__.py (1 hunks)
  • src/axolotl/utils/mistral/mistral_tokenizer.py (1 hunks)
  • src/axolotl/utils/mistral_tokenizer.py (0 hunks)
  • tests/prompt_strategies/conftest.py (1 hunks)
  • tests/prompt_strategies/test_chat_templates_mistral.py (3 hunks)
💤 Files with no reviewable changes (3)
  • examples/gemma3n/gemma-3n-e2b-vision-audio-qlora.yml
  • src/axolotl/monkeypatch/models/gemma3/modeling.py
  • src/axolotl/utils/mistral_tokenizer.py
✅ Files skipped from review due to trivial changes (6)
  • scripts/cutcrossentropy_install.py
  • examples/magistral/magistral-small-qlora.yaml
  • src/axolotl/integrations/cut_cross_entropy/README.md
  • src/axolotl/integrations/cut_cross_entropy/init.py
  • tests/prompt_strategies/conftest.py
  • src/axolotl/prompt_strategies/chat_template.py
🚧 Files skipped from review as they are similar to previous changes (15)
  • README.md
  • examples/magistral/magistral-small-fsdp-qlora.yaml
  • src/axolotl/utils/mistral/init.py
  • src/axolotl/utils/dict.py
  • src/axolotl/loaders/constants.py
  • src/axolotl/loaders/tokenizer.py
  • examples/magistral/README.md
  • examples/voxtral/voxtral-mini-audio-qlora.yml
  • src/axolotl/monkeypatch/models/voxtral/modeling.py
  • src/axolotl/loaders/patch_manager.py
  • examples/voxtral/voxtral-mini-qlora.yml
  • tests/prompt_strategies/test_chat_templates_mistral.py
  • src/axolotl/processing_strategies.py
  • examples/magistral/magistral-small-think-qlora.yaml
  • examples/colab-notebooks/colab-axolotl-example.ipynb
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/axolotl/utils/mistral/mistral_tokenizer.py (1)
src/axolotl/utils/mistral_tokenizer.py (1)
  • HFMistralTokenizer (31-627)
🪛 LanguageTool
examples/gemma3n/README.md

[style] ~41-~41: Consider using polite language here.
Context: ...emma-3n-e2b-vision-audio-qlora.yml ``` Let us know how it goes. Happy finetuning! 🚀 WARN...

(INSERT_PLEASE)

examples/voxtral/README.md

[style] ~48-~48: Consider using polite language here.
Context: ... These configs use about 4.8 GB VRAM. Let us know how it goes. Happy finetuning! 🚀 ### ...

(INSERT_PLEASE)


[style] ~68-~68: For conciseness, consider replacing this expression with an adverb.
Context: ...ntokenizer for Supervised Fine-tuning at the moment and fortype: chat_template` only. In...

(AT_THE_MOMENT)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: PyTest (3.11, 2.7.0)
  • GitHub Check: PyTest (3.11, 2.7.1)
  • GitHub Check: PyTest (3.11, 2.6.0)
  • GitHub Check: pre-commit
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
  • GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
  • GitHub Check: pre-commit
🔇 Additional comments (6)
src/axolotl/utils/mistral/mistral_tokenizer.py (5)

20-35: Constructor implementation looks good.

The method properly handles mode initialization and the docstring correctly documents the parameters. Past review issues have been addressed.


45-73: Private API access is handled appropriately.

The method includes proper error handling and clear documentation about the dependency on private APIs. Consider monitoring for changes in the mistral-common library that might break this functionality.


74-95: Exception-safe mode handling implemented correctly.

The try-finally block ensures that the mode is always restored to finetuning, addressing the previous review comment about exception safety.


96-111: LGTM!

Simple and effective override to handle single integer token IDs.


112-221: Well-structured tokenizer loading implementation.

The method properly handles both local and remote tokenizer loading with appropriate parameter validation and cleanup of unsupported kwargs.

examples/gemma3n/README.md (1)

46-58: Helpful tips and optimization guides.

The tips section provides clear guidance for users, including how to switch from QLoRA to full fine-tuning and links to relevant documentation.

Comment thread examples/voxtral/README.md Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented Jul 27, 2025

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
examples/voxtral/README.md (4)

18-19: Remove unnecessary hard-pins on build tooling

Hard-locking packaging==23.2 and setuptools==75.8.0 needlessly blocks future bug-fixes and security patches. Unless Axolotl truly breaks with newer versions, prefer an unpinned or upper-bounded spec:

-pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
+pip3 install packaging setuptools wheel ninja

25-27: Relax runtime dependency pins to allow bug-fix upgrades

Pinning librosa==0.11.0 and mistral_common[audio]==1.8.3 prevents consumers from getting backward-compatible fixes. Use a lower bound instead:

- pip3 install librosa==0.11.0
- pip3 install 'mistral_common[audio]==1.8.3'
+ pip3 install 'librosa>=0.11.0'
+ pip3 install 'mistral_common[audio]>=1.8.3'

If you rely on exact versions, document the rationale.


42-42: Tone tweak for clarity

Add “please” for a friendlier call-to-action:

- Let us know how it goes. Happy finetuning! 🚀
+ Please let us know how it goes. Happy finetuning! 🚀

61-61: Reword to avoid duplicate “only” and improve flow

-We only support the `mistral-common` tokenizer for Supervised Fine-tuning at the moment and for `type: chat_template` only.
+Currently, only the `mistral-common` tokenizer is supported for supervised fine-tuning with `type: chat_template`.
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8f2a5ac and 6a2bd24.

📒 Files selected for processing (2)
  • examples/magistral/README.md (3 hunks)
  • examples/voxtral/README.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • examples/magistral/README.md
🧰 Additional context used
🪛 LanguageTool
examples/voxtral/README.md

[style] ~41-~41: Consider using polite language here.
Context: ... These configs use about 4.8 GB VRAM. Let us know how it goes. Happy finetuning! 🚀 ### ...

(INSERT_PLEASE)


[style] ~61-~61: For conciseness, consider replacing this expression with an adverb.
Context: ...ntokenizer for Supervised Fine-tuning at the moment and fortype: chat_template` only. In...

(AT_THE_MOMENT)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
  • GitHub Check: pre-commit
  • GitHub Check: PyTest (3.11, 2.7.0)
  • GitHub Check: PyTest (3.11, 2.7.1)
  • GitHub Check: PyTest (3.11, 2.6.0)
  • GitHub Check: pre-commit
🔇 Additional comments (1)
examples/voxtral/README.md (1)

14-15: Ignore incorrect PyTorch requirement change – 2.6.0 exists

A quick check of the official PyPI torch releases confirms that version 2.6.0 is available, so the existing line is valid and does not need to be downgraded.

Likely an incorrect or invalid review comment.

@NanoCode012 NanoCode012 merged commit 90e5598 into main Jul 30, 2025
22 of 25 checks passed
@NanoCode012 NanoCode012 deleted the feat/voxtral branch July 30, 2025 08:57
@coderabbitai coderabbitai Bot mentioned this pull request Aug 13, 2025
@coderabbitai coderabbitai Bot mentioned this pull request Aug 25, 2025
@coderabbitai coderabbitai Bot mentioned this pull request Jan 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants