fix: patch PEFT for Gemma4ClippableLinear in loader checkpoint path (fixes export) by rolandtannous · Pull Request #4807 · unslothai/unsloth

rolandtannous · 2026-04-02T23:36:28Z

Problem

Exporting or loading a Gemma-4 LoRA checkpoint fails with:

ValueError: Target module Gemma4ClippableLinear(
  (linear): Linear(in_features=768, out_features=768, bias=False)
) is not supported. Currently, only the following modules are supported:
`torch.nn.Linear`, `torch.nn.Embedding`, `torch.nn.Conv1d`, ...

PEFT doesn't support Gemma4ClippableLinear yet — it wraps nn.Linear but doesn't subclass it, so PEFT's LoRA injection doesn't recognize it as a valid target.
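
To make the "wraps but doesn't subclass" point concrete, here is a minimal sketch using pure-Python stand-ins. `Linear`, `ClippableLinear`, and `is_supported_target` are hypothetical names for illustration only, not the real torch, transformers, or PEFT classes. It shows why a type-based allow-list rejects the wrapper but accepts the layer inside it:

```python
class Linear:
    """Stand-in for torch.nn.Linear (hypothetical, for illustration only)."""

    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features


class ClippableLinear:
    """Stand-in for a wrapper that holds a Linear but does not inherit from it."""

    def __init__(self, in_features, out_features):
        self.linear = Linear(in_features, out_features)


# Simplified model of a type-based allow-list of supported target modules.
SUPPORTED = (Linear,)


def is_supported_target(module):
    # isinstance only looks at the class hierarchy, not at what the module contains.
    return isinstance(module, SUPPORTED)


wrapper = ClippableLinear(768, 768)
wrapper_supported = is_supported_target(wrapper)       # the wrapper itself
inner_supported = is_supported_target(wrapper.linear)  # the wrapped layer
print(wrapper_supported, inner_supported)
```

Redirecting the check from the wrapper to its `.linear` attribute is the whole idea behind the workaround described below.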

Tracked upstream:

Why training works but export/inference doesn't

Training works because unsloth/models/vision.py (lines 1301-1353) already has a monkey-patch that intercepts LoraModel._create_and_replace and redirects Gemma4ClippableLinear to its inner .linear child:

# vision.py — training path (already exists)
_clippable_linear_cls = None
try:
    from transformers.models.gemma4.modeling_gemma4 import (
        Gemma4ClippableLinear as _clippable_linear_cls,
    )
except ImportError:
    pass
if _clippable_linear_cls is not None:
    from peft.tuners.lora.model import LoraModel as _LoraModel
    _original_car = _LoraModel._create_and_replace

    def _patched_car(self, peft_config, adapter_name, target,
                     target_name, parent, current_key=None, **kwargs):
        if isinstance(target, _clippable_linear_cls):
            return _original_car(
                self, peft_config, adapter_name,
                target.linear, "linear", target,  # redirect to inner nn.Linear
                current_key=current_key, **kwargs,
            )
        return _original_car(
            self, peft_config, adapter_name,
            target, target_name, parent,
            current_key=current_key, **kwargs,
        )

    _LoraModel._create_and_replace = _patched_car

model = _get_peft_model(model, lora_config)

# Restore original PEFT method
if _clippable_linear_cls is not None:
    _LoraModel._create_and_replace = _original_car

But when loading an existing checkpoint (export, inference), the code goes through loader.py → PeftModel.from_pretrained() which is a completely separate code path. The monkey-patch from vision.py is not active there.

Fix

Apply the same monkey-patch pattern from vision.py around the PeftModel.from_pretrained() call in loader.py's vision model loading path. The patch:

1. Saves the original _create_and_replace method
2. Installs a wrapper that checks whether the target is Gemma4ClippableLinear and, if so, redirects to its inner .linear attribute
3. Calls PeftModel.from_pretrained() with the patch active
4. Restores the original method immediately after

This is a temporary workaround until PEFT adds native support for Gemma4ClippableLinear.
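
The save / wrap / call / restore pattern can be sketched as a context manager. This is an illustration under stated assumptions, not the code in loader.py: `FakeLoraModel`, `Linear`, `ClippableLinear`, and `redirect_wrapped_linear` are hypothetical stand-ins so the snippet runs without peft or transformers installed. Only the `_create_and_replace` signature is copied from the vision.py excerpt above.

```python
from contextlib import contextmanager


class Linear:
    """Stand-in for torch.nn.Linear."""


class ClippableLinear:
    """Stand-in for Gemma4ClippableLinear: wraps a Linear, does not subclass it."""

    def __init__(self):
        self.linear = Linear()


class FakeLoraModel:
    """Stand-in for peft's LoraModel; only the patched method is modelled."""

    def _create_and_replace(self, peft_config, adapter_name, target,
                            target_name, parent, current_key=None, **kwargs):
        if not isinstance(target, Linear):
            raise ValueError(f"Target module {type(target).__name__} is not supported.")
        return (target_name, type(parent).__name__)


@contextmanager
def redirect_wrapped_linear(model_cls, wrapper_cls):
    original = model_cls._create_and_replace  # save the original method

    def patched(self, peft_config, adapter_name, target,
                target_name, parent, current_key=None, **kwargs):
        if isinstance(target, wrapper_cls):
            # Redirect to the inner layer; the wrapper becomes its parent.
            target, target_name, parent = target.linear, "linear", target
        return original(self, peft_config, adapter_name, target,
                        target_name, parent, current_key=current_key, **kwargs)

    model_cls._create_and_replace = patched  # install the wrapper
    try:
        yield  # the caller loads the checkpoint here
    finally:
        model_cls._create_and_replace = original  # restore, even on error


unpatched = FakeLoraModel._create_and_replace
wrapper = ClippableLinear()
with redirect_wrapped_linear(FakeLoraModel, ClippableLinear):
    result = FakeLoraModel()._create_and_replace(
        None, "default", wrapper, "q_proj", object()
    )
restored = FakeLoraModel._create_and_replace is unpatched
print(result, restored)
```

In the real loader the body of the `with` block would be the PeftModel.from_pretrained() call. The `finally` clause is what keeps the class-level patch from leaking into later model loads if that call raises.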

Test plan

- Export a Gemma-4 LoRA checkpoint (previously crashed with ValueError)
- Load a Gemma-4 LoRA checkpoint for inference
- Verify Gemma-4 training still works (unchanged path in vision.py)
- Verify non-Gemma-4 models are unaffected (patch is guarded by ImportError + isinstance check)

export_gemma4.mp4

The same Gemma4ClippableLinear monkey-patch that exists in vision.py for training is needed in loader.py for loading existing checkpoints (used by export and inference). Gemma4ClippableLinear wraps nn.Linear but does not subclass it, so PEFT's LoRA injection fails with "Target module not supported". The patch redirects PEFT to target the inner .linear child instead. Applied only to the vision model PeftModel.from_pretrained path. Temporary fix until PEFT adds native support (peft#3129).



rolandtannous · 2026-04-03T00:03:32Z

Resolved reviewer comments and performed additional end-to-end tests:

1. finetuned text model requiring transformers 4.57.6
2. finetuned gemma4
3. exported gemma4
4. exported gemma3
5. chat with gguf
6. chat with gemma4
7. chat with Phi model

BenjaminBossan · 2026-04-03T00:06:03Z

Shouldn't it be possible to target the linear submodule of Gemma4ClippableLinear by setting a proper target_modules? It might require a bit of regex acrobatics to achieve that, but it should work, e.g. to target q_proj and v_proj of the audio tower:

target_modules=r".*\.audio_tower.*\.(q_proj|v_proj)\.linear"
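
A quick way to sanity-check a pattern like this before handing it to LoraConfig is to run it over candidate module names with `re.fullmatch`. As far as I understand, that is how PEFT treats a string-valued target_modules. The module names below are hypothetical and only illustrate the shape of the match; the real names would come from `model.named_modules()`.

```python
import re

# Pattern from the comment above: q_proj / v_proj inside the audio tower,
# targeting the inner ".linear" child rather than the wrapper itself.
PATTERN = r".*\.audio_tower.*\.(q_proj|v_proj)\.linear"

# Hypothetical module names; real ones come from model.named_modules().
candidate_names = [
    "model.audio_tower.layers.0.self_attn.q_proj.linear",
    "model.audio_tower.layers.0.self_attn.q_proj",
    "model.audio_tower.layers.0.self_attn.k_proj.linear",
    "model.audio_tower.layers.3.self_attn.v_proj.linear",
    "model.vision_tower.encoder.layers.0.self_attn.v_proj.linear",
    "model.language_model.layers.0.self_attn.q_proj",
]

# Keep only names the pattern matches in full.
matched = [name for name in candidate_names if re.fullmatch(PATTERN, name)]
for name in matched:
    print(name)
```

Only the two audio-tower names ending in `q_proj.linear` and `v_proj.linear` survive. The wrapper itself (`...q_proj` without `.linear`) is not matched, which is the point: PEFT would then see a plain nn.Linear as the target.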


Extend the cross-version compat canary to catch ~80% of upstream drift before a user hits it. Static checks only (GitHub raw fetch + grep), CPU-only, runs PR-time + daily cron. 906 pass, 73 skipped.

TRL coverage extended:
- TRL_TAGS expanded from 12 to 28 (every stable release >=0.18.2, including the broken 0.19.0, plus main). Anchors: 0.22.2 / 0.27.1 / 1.0.0 marked.
- Fix `__version__` parser to handle the TRL 0.22.x pattern (`__version__ = f.read()` from sibling VERSION file).
- Fix `has_def` in _fetch.py to allow indented matches so class methods are detected (the original anchored ^def only matched module-scope definitions).
- New tests for symbols the audit found we touch but didn't check: is_conversational, sft_trainer module + neftune_post_forward_hook, dpo_trainer module + MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES, trl.trainer.utils.ConstantLengthDataset (gated), trl.models.utils.disable_gradient_checkpointing (gated >=1.0.0), trl.import_utils + _*_available cache pattern, trl.experimental.openenv.utils generators (one of two names), GRPOTrainer required methods (_prepare_inputs, _generate_and_score_completions, compute_loss; per-token-logps legacy/new dispatch), GRPOTrainer source must contain torch.inference_mode + accelerator.unwrap_model fingerprints, KTOTrainer.get_batch_logps (now lives at trl.experimental.kto on TRL 0.27+; accept either path), SFTTrainer class existence, DPOTrainer methods (informational), chat-template propagation (legacy maybe_apply_chat_template OR successor apply_chat_template + chat_template_kwargs), truncate_with_protected_tokens informational.
- Tighten test_unwrap_model_for_generation_either_path to mirror the prod fallback exactly (drop unused trl/extras/profiling.py candidate).
- Replace test_trl_generation_vllm_generation_gated symbol set with the actual unsloth dependency (VLLMGeneration class + _init_vllm / sync_weights / generate methods, not VLLMClient/etc).

PEFT coverage extended (driven by the 8 PR audit unsloth#5015, #5167, #5036, #4807 + unsloth-zoo#618, #596, #482, #430):
- VARIANT_KWARG_KEYS const (peft 0.18+; injected by zoo#430)
- ParamWrapper class + members (peft 0.18+; needed by zoo#618)
- LoraConfig.target_parameters (peft 0.19+)
- LoraModel._create_and_replace (signature pin for unsloth#4807)
- transformers_weight_conversion module + build_peft_weight_mapping (unsloth#5167 wraps this)
- integrations.dequantize_module_weight (3 callsites)
- PeftType.LORA (vllm_utils.py:2520)
- ModulesToSaveWrapper (both peft.utils.* paths)
- PeftModel.from_pretrained method exists
- peft.__version__ parseable

Transformers coverage added (driven by the 16-PR audit):
- New file test_transformers_pinned_symbols.py with 19 test categories x 12 transformers tags (4.57.6 floor + 5.0..5.8 + main). Anchors: 4.57.6 + 5.5.0.
- Trainer surface (compute_loss num_items_in_batch param, training_step grad-accum fingerprints, get_batch_samples num_items contract, inner_training_loop _tr_loss inplace v5)
- modeling_utils.checkpoint alias for unsloth-zoo#549
- PushToHubMixin._create_repo presence (unsloth-zoo#393)
- integrations.bitsandbytes module + Linear4bit reference
- quantizers.should_convert_module signature (zoo#491/#488)
- FP8Linear bias/has_bias rename (zoo#572)
- processing_utils.Unpack importable (zoo#583/584)
- gemma3 Gemma3Attention class + gpt_oss GptOssModel class
- auto_factory _LazyAutoMapping private API (unsloth#5155)
- configuration_utils PretrainedConfig/PreTrainedConfig alias
- tokenization_utils_base.apply_chat_template
- modeling_attn_mask_utils symbols
- cache_utils Cache + DynamicCache classes
- training_args.ParallelMode importable

Wire the new transformers job into version-compat-ci.yml (matrix of 5 PR-time symbol jobs + zoo-imports under spoof + daily fresh-fetch cron). Local smoke: 906 pass, 73 skipped (gated optional features) across vLLM + TRL + PEFT + ST + bnb + transformers suites.

Modal run ap-j2YVBnp42L3qIxNFC6vh4c failed at _attach_lora with 'Target module Gemma4ClippableLinear is not supported'. Root cause: Gemma 4 wraps every nn.Linear projection in a Gemma4ClippableLinear for activation clamping. The wrapper does NOT subclass nn.Linear (transformers PR #45388 was closed, not merged), so PEFT's exact-class-match for LoRA targets rejects it. Fix: lift the recipe from unslothai/unsloth#4807 — monkey-patch LoraModel._create_and_replace to recurse into target.linear whenever the target is a Gemma4ClippableLinear instance. Restored after get_peft_model returns. Bug A mirror confirmed working on the failed run: final_logit_softcapping=30.0 in stdout. Version pins also confirmed: trl=1.1.0, transformers=5.9.0, vllm=0.20.2.

rolandtannous requested review from danielhanchen and mmathew23 as code owners April 2, 2026 23:36

[pre-commit.ci] auto fixes from pre-commit.com hooks

6b1542a

for more information, see https://pre-commit.ci

chatgpt-codex-connector Bot reviewed Apr 2, 2026


Comment thread unsloth/models/loader.py Outdated

fix: wrap ClippableLinear patch in try/finally to always restore

9755e6e

Ensures _create_and_replace is restored even if PeftModel.from_pretrained raises, preventing leaked global state across subsequent model loads.

rolandtannous changed the title ~~fix: patch PEFT for Gemma4ClippableLinear in loader checkpoint path~~ fix: patch PEFT for Gemma4ClippableLinear in loader checkpoint path (fixes export) Apr 2, 2026

rolandtannous merged commit 6644a77 into main Apr 3, 2026
5 checks passed

rolandtannous deleted the fix/gemma4-export-clippable-linear branch April 3, 2026 00:03


rolandtannous merged 3 commits into main from fix/gemma4-export-clippable-linear
