Save processor in quantizer CLI#3290

Merged
salmanmohammadi merged 2 commits into main from quantize_save_processor
Dec 6, 2025

@salmanmohammadi salmanmohammadi commented Dec 1, 2025

Contributor

Summary by CodeRabbit

  • New Features
    • Quantization workflow now supports multimodal models with integrated processor management
    • Processors are automatically loaded, managed, and saved alongside quantized model artifacts
    • Processor artifacts can be pushed to model hub when hub integration is enabled, mirroring model and tokenizer handling


coderabbitai Bot commented Dec 1, 2025

Contributor

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

The quantize command module now supports multimodal model quantization by conditionally loading, saving, and syncing processors alongside models and tokenizers when multimodal configurations are present.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Multimodal processor support: src/axolotl/cli/quantize.py | Added conditional processor loading via load_processor() when cfg.is_multimodal is true. Processor is now saved to the output directory and pushed to hub alongside model and tokenizer. Imports expanded to include AutoProcessor and load_processor. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Verify processor loading logic integrates correctly with existing multimodal configuration handling
  • Confirm processor save/push operations mirror model and tokenizer handling patterns accurately
  • Check that conditional logic correctly gates processor operations only when multimodal is enabled
  • Validate that processor lifecycle (load → save → push) doesn't introduce file I/O or hub API errors

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Save processor in quantizer CLI' directly and accurately describes the main change: adding processor saving functionality to the quantize CLI module. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/axolotl/cli/quantize.py (1)

70-73: Multimodal processor loading looks correct, with a small robustness tweak possible

Conditionally initializing processor and calling load_processor(cfg, tokenizer) only when cfg.is_multimodal is true cleanly gates the new behavior and keeps the non‑multimodal path unchanged.

To avoid any edge cases with truthiness on custom processor types, you might consider using an explicit is not None check later:

-    processor = None
-    if cfg.is_multimodal:
-        processor = load_processor(cfg, tokenizer)
+    processor = None
+    if cfg.is_multimodal:
+        processor = load_processor(cfg, tokenizer)

and pair it with:

-    if processor:
+    if processor is not None:

This is optional but removes any coupling to how __len__/truthiness is implemented on processor classes.

Please confirm that cfg.is_multimodal is always defined in your quantization configs (or via defaults); otherwise you may want to guard it via getattr(cfg, "is_multimodal", False).
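
The difference between the two checks can be illustrated with a minimal sketch. The class `EmptyLenProcessor` and the bare `cfg` namespace below are hypothetical stand-ins invented for illustration; they are not part of axolotl or transformers, and real processor classes may never behave this way.

```python
from types import SimpleNamespace


class EmptyLenProcessor:
    """Hypothetical processor-like object whose __len__ returns 0."""

    def __len__(self):
        return 0


def should_save_truthy(processor):
    # Mirrors `if processor:`; depends on __bool__/__len__ of the object.
    return bool(processor)


def should_save_explicit(processor):
    # Mirrors `if processor is not None:`; depends only on identity.
    return processor is not None


proc = EmptyLenProcessor()
truthy_result = should_save_truthy(proc)      # False: the object is "falsy"
explicit_result = should_save_explicit(proc)  # True: the object exists

# A config without the attribute (assumption: plain namespace, no defaults).
cfg = SimpleNamespace()
is_multimodal = getattr(cfg, "is_multimodal", False)
```

With the truthiness check, a loaded but zero-length object would be skipped silently, whereas the identity check would still save it. The `getattr` fallback only matters if the config object can lack the attribute entirely.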

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c6ddcdd and e4b6e38.

📒 Files selected for processing (1)
  • src/axolotl/cli/quantize.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/axolotl/cli/quantize.py (1)
src/axolotl/loaders/processor.py (1)
  • load_processor (17-82)
🪛 GitHub Actions: lint
src/axolotl/cli/quantize.py

[error] 1-1: pre-commit (ruff) failed. 1 issue fixed by the hook: removed unused import 'TorchAoConfig' from the imports in quantize.py. The hook exited with code 1 to indicate the modification.

🪛 Ruff (0.14.6)
src/axolotl/cli/quantize.py

116-116: Use explicit conversion flag

Replace with conversion flag

(RUF010)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
  • GitHub Check: PyTest from Source Dist (3.11, 2.8.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.9.0)
  • GitHub Check: PyTest (3.11, 2.8.0)
  • GitHub Check: PyTest (3.11, 2.9.0)
  • GitHub Check: PyTest (3.11, 2.7.1)
🔇 Additional comments (3)
src/axolotl/cli/quantize.py (3)

8-11: Imports align with the new multimodal processor flow

Bringing load_processor into the CLI and wiring it next to load_tokenizer is consistent with how processors are handled elsewhere; nothing blocking here from a dependency/wiring standpoint.

If you haven’t already, re-run your pre-commit hooks (ruff in particular) to ensure there are no new import-related lint issues with the added symbols.


126-127: Processor push_to_hub mirrors model/tokenizer behavior appropriately

Conditionally pushing the processor to the same hub_model_id when present keeps the artifact set on the Hub in sync for multimodal models and does not affect the non‑multimodal path.

No changes needed here.

It’s worth verifying with at least one multimodal config (e.g., one that uses load_processor in axolotl.loaders.processor) that processor.push_to_hub works as expected and that all three artifacts (model, tokenizer, processor) appear under the same repo on the Hub.
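
One way to check the "all three artifacts" expectation without touching the Hub is to run the save and push steps against recording stand-ins. This is a sketch only: `RecordingArtifact` and `export_artifacts` are hypothetical names, the repo id and output path are made up, and the assumption that each artifact exposes `save_pretrained` and `push_to_hub` follows the Hugging Face convention rather than the actual body of quantize.py, which is not shown in this thread.

```python
from types import SimpleNamespace


class RecordingArtifact:
    """Hypothetical stand-in that records calls instead of doing I/O."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def save_pretrained(self, output_dir):
        self.log.append((self.name, "save", output_dir))

    def push_to_hub(self, repo_id):
        self.log.append((self.name, "push", repo_id))


def export_artifacts(cfg, model, tokenizer, processor, output_dir):
    # Skip any artifact that was never loaded (e.g. processor on text-only runs).
    artifacts = [a for a in (model, tokenizer, processor) if a is not None]
    for artifact in artifacts:
        artifact.save_pretrained(output_dir)
    hub_model_id = getattr(cfg, "hub_model_id", None)
    if hub_model_id:
        for artifact in artifacts:
            artifact.push_to_hub(hub_model_id)


cfg = SimpleNamespace(hub_model_id="user/quantized-model")  # made-up repo id

# Text-only path: no processor is loaded, so nothing is recorded for it.
text_log = []
export_artifacts(
    cfg,
    RecordingArtifact("model", text_log),
    RecordingArtifact("tokenizer", text_log),
    None,
    "./out",
)

# Multimodal path: the processor is saved and pushed with the other two.
mm_log = []
export_artifacts(
    cfg,
    RecordingArtifact("model", mm_log),
    RecordingArtifact("tokenizer", mm_log),
    RecordingArtifact("processor", mm_log),
    "./out",
)
pushed = [name for name, action, _ in mm_log if action == "push"]
```

A real verification would still need a run against an actual multimodal config, since the stand-ins say nothing about what files the processor writes.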


115-118: RUF010 is not enabled in this project's Ruff configuration

The Ruff configuration in pyproject.toml selects only ["E", "F", "W", "C90", "B", "I"] rules, which does not include the "RUF" ruleset. RUF010 (explicit-f-string-type-conversion) is part of the RUF rules and is therefore not being flagged by Ruff in CI or pre-commit hooks. The code at lines 116 and 130 using str(Path(...)) inside f-strings is not currently triggering a lint error in this project.

The suggested refactors are stylistically sound, but there is no existing lint violation to fix.
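
For reference, the pattern RUF010 targets and its suggested replacement produce the same string. The snippet below is a small illustration with a made-up path; it is not the code from quantize.py.

```python
from pathlib import Path

output_dir = Path("outputs") / "quantized"  # hypothetical path for illustration

# Pattern flagged by RUF010: calling str() inside an f-string placeholder.
with_call = f"Saved to {str(output_dir)}"

# Suggested form: the !s conversion flag applies str() to the value.
with_flag = f"Saved to {output_dir!s}"

same_output = with_call == with_flag
```

Because both forms format identically, the change is purely stylistic, which is consistent with the rule not being enforced in this project.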

Likely an incorrect or invalid review comment.

@salmanmohammadi salmanmohammadi merged commit 75b20fb into main Dec 6, 2025
15 of 16 checks passed
@salmanmohammadi salmanmohammadi deleted the quantize_save_processor branch December 6, 2025 16:27

codecov Bot commented Dec 8, 2025

Codecov Report

❌ Patch coverage is 0% with 9 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/axolotl/cli/quantize.py | 0.00% | 9 Missing ⚠️ |
