fix: patch PEFT for Gemma4ClippableLinear in loader checkpoint path (fixes export)#4807

Merged
rolandtannous merged 3 commits into
mainfrom
fix/gemma4-export-clippable-linear
Apr 3, 2026

@rolandtannous
Collaborator

@rolandtannous rolandtannous commented Apr 2, 2026

Problem

Exporting or loading a Gemma-4 LoRA checkpoint fails with:

ValueError: Target module Gemma4ClippableLinear(
  (linear): Linear(in_features=768, out_features=768, bias=False)
) is not supported. Currently, only the following modules are supported:
`torch.nn.Linear`, `torch.nn.Embedding`, `torch.nn.Conv1d`, ...

PEFT doesn't support Gemma4ClippableLinear yet — it wraps nn.Linear but doesn't subclass it, so PEFT's LoRA injection doesn't recognize it as a valid target.
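The difference between wrapping and subclassing can be illustrated with a minimal sketch. The classes below are hypothetical stand-ins (they are not torch or transformers classes), and the support check is a simplified model of a type-based test rather than PEFT's actual dispatch code:

```python
class Linear:
    """Stand-in for torch.nn.Linear (hypothetical, illustration only)."""

    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features


class ClippableLinear:
    """Stand-in for Gemma4ClippableLinear: holds a Linear, does not inherit from it."""

    def __init__(self, in_features, out_features):
        self.linear = Linear(in_features, out_features)


# Simplified model of a type-based "is this module supported?" check.
SUPPORTED_TYPES = (Linear,)


def is_supported(module):
    return isinstance(module, SUPPORTED_TYPES)


wrapped = ClippableLinear(768, 768)
outer_ok = is_supported(wrapped)         # the wrapper itself fails the check
inner_ok = is_supported(wrapped.linear)  # the inner child passes it
```

Because the wrapper is not a `Linear`, only its `.linear` child is a valid target, which is why the workaround redirects to that attribute.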

Tracked upstream:

Why training works but export/inference doesn't

Training works because unsloth/models/vision.py (lines 1301-1353) already has a monkey-patch that intercepts LoraModel._create_and_replace and redirects Gemma4ClippableLinear to its inner .linear child:

# vision.py — training path (already exists)
_clippable_linear_cls = None
try:
    from transformers.models.gemma4.modeling_gemma4 import (
        Gemma4ClippableLinear as _clippable_linear_cls,
    )
except ImportError:
    pass
if _clippable_linear_cls is not None:
    from peft.tuners.lora.model import LoraModel as _LoraModel
    _original_car = _LoraModel._create_and_replace

    def _patched_car(self, peft_config, adapter_name, target,
                     target_name, parent, current_key=None, **kwargs):
        if isinstance(target, _clippable_linear_cls):
            return _original_car(
                self, peft_config, adapter_name,
                target.linear, "linear", target,  # redirect to inner nn.Linear
                current_key=current_key, **kwargs,
            )
        return _original_car(
            self, peft_config, adapter_name,
            target, target_name, parent,
            current_key=current_key, **kwargs,
        )

    _LoraModel._create_and_replace = _patched_car

model = _get_peft_model(model, lora_config)

# Restore original PEFT method
if _clippable_linear_cls is not None:
    _LoraModel._create_and_replace = _original_car

But when loading an existing checkpoint (export, inference), the code goes through loader.py, which calls PeftModel.from_pretrained(). That is a completely separate code path, so the monkey-patch from vision.py is not active there.

Fix

Apply the same monkey-patch pattern from vision.py around the PeftModel.from_pretrained() call in loader.py's vision model loading path. The patch:

  1. Saves the original _create_and_replace method
  2. Installs a wrapper that checks if the target is Gemma4ClippableLinear — if so, redirects to its inner .linear attribute
  3. Calls PeftModel.from_pretrained() with the patch active
  4. Restores the original method immediately after

This is a temporary workaround until PEFT adds native support for Gemma4ClippableLinear.
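The four steps above can be sketched as a context manager. This is a minimal illustration under stated assumptions, not the code merged in loader.py: `FakeLoraModel`, `Linear`, `ClippableLinear`, and `redirect_to_inner_linear` are hypothetical names, and `FakeLoraModel` only mimics the `_create_and_replace` signature shown in the vision.py excerpt.

```python
from contextlib import contextmanager


class Linear:
    """Stand-in for torch.nn.Linear (hypothetical)."""


class ClippableLinear:
    """Stand-in for Gemma4ClippableLinear (hypothetical)."""

    def __init__(self):
        self.linear = Linear()


class FakeLoraModel:
    """Stand-in for peft's LoraModel; only the patched method is modelled."""

    def _create_and_replace(self, peft_config, adapter_name, target,
                            target_name, parent, current_key=None, **kwargs):
        if not isinstance(target, Linear):
            raise ValueError(f"Target module {type(target).__name__} is not supported.")
        return (target_name, type(parent).__name__)


@contextmanager
def redirect_to_inner_linear(model_cls, wrapper_cls):
    original = model_cls._create_and_replace              # 1. save the original

    def patched(self, peft_config, adapter_name, target,
                target_name, parent, current_key=None, **kwargs):
        if isinstance(target, wrapper_cls):
            # Redirect: the inner Linear becomes the target, the wrapper its parent.
            return original(self, peft_config, adapter_name,
                            target.linear, "linear", target,
                            current_key=current_key, **kwargs)
        return original(self, peft_config, adapter_name,
                        target, target_name, parent,
                        current_key=current_key, **kwargs)

    model_cls._create_and_replace = patched               # 2. install the wrapper
    try:
        yield                                             # 3. caller loads the checkpoint here
    finally:
        model_cls._create_and_replace = original          # 4. restore, even on error


pristine = FakeLoraModel._create_and_replace
with redirect_to_inner_linear(FakeLoraModel, ClippableLinear):
    result = FakeLoraModel()._create_and_replace(
        None, "default", ClippableLinear(), "q_proj", object()
    )
restored = FakeLoraModel._create_and_replace is pristine
```

In the real loader the body of the `with` block would be the `PeftModel.from_pretrained()` call; the stand-in call here only shows that a wrapped target is redirected instead of raising.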

Test plan

  • Export a Gemma-4 LoRA checkpoint (previously crashed with ValueError)
  • Load a Gemma-4 LoRA checkpoint for inference
  • Verify Gemma-4 training still works (unchanged path in vision.py)
  • Verify non-Gemma-4 models are unaffected (patch is guarded by ImportError + isinstance check)
export_gemma4.mp4

The same Gemma4ClippableLinear monkey-patch that exists in vision.py
for training is needed in loader.py for loading existing checkpoints
(used by export and inference).

Gemma4ClippableLinear wraps nn.Linear but does not subclass it, so
PEFT's LoRA injection fails with "Target module not supported".
The patch redirects PEFT to target the inner .linear child instead.

Applied only to the vision model PeftModel.from_pretrained path.
Temporary fix until PEFT adds native support (peft#3129).
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5847fbe2b5

Comment thread unsloth/models/loader.py Outdated
Ensures _create_and_replace is restored even if PeftModel.from_pretrained
raises, preventing leaked global state across subsequent model loads.
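The failure mode this commit guards against can be reproduced with a small, self-contained sketch. `Registry`, `failing_load`, and the two helper functions are hypothetical names used only for illustration; `failing_load` plays the role of a `PeftModel.from_pretrained()` call that raises.

```python
class Registry:
    """Hypothetical class whose method gets monkey-patched."""

    def load(self):
        return "original"


def failing_load():
    raise RuntimeError("simulated from_pretrained failure")


def patch_without_finally(action):
    original = Registry.load
    Registry.load = lambda self: "patched"
    action()                      # if this raises, the restore below never runs
    Registry.load = original


def patch_with_finally(action):
    original = Registry.load
    Registry.load = lambda self: "patched"
    try:
        action()
    finally:
        Registry.load = original  # runs on success and on exception


pristine = Registry.load

try:
    patch_without_finally(failing_load)
except RuntimeError:
    pass
leaked = Registry().load()        # the patch is still installed
Registry.load = pristine          # manual cleanup so the second demo starts clean

try:
    patch_with_finally(failing_load)
except RuntimeError:
    pass
clean = Registry().load()         # the original method is back
```

Without `try`/`finally`, a single failed load would leave the patched method on the class for every later model load in the same process.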
@rolandtannous rolandtannous changed the title fix: patch PEFT for Gemma4ClippableLinear in loader checkpoint path fix: patch PEFT for Gemma4ClippableLinear in loader checkpoint path (fixes export) Apr 2, 2026
@rolandtannous
Collaborator Author

Resolved reviewer comments and performed additional end-to-end tests:

  1. Fine-tuned a text model requiring transformers 4.57.6
  2. Fine-tuned Gemma 4
  3. Exported Gemma 4
  4. Exported Gemma 3
  5. Chatted with a GGUF model
  6. Chatted with Gemma 4
  7. Chatted with a Phi model

@rolandtannous rolandtannous merged commit 6644a77 into main Apr 3, 2026
5 checks passed
@rolandtannous rolandtannous deleted the fix/gemma4-export-clippable-linear branch April 3, 2026 00:03
@BenjaminBossan

BenjaminBossan commented Apr 3, 2026

Shouldn't it be possible to target the linear submodule of Gemma4ClippableLinear by setting a proper target_modules? It might require a bit of regex acrobatics to achieve that, but it should work, e.g. to target q_proj and v_proj of the audio tower:

target_modules=r".*\.audio_tower.*\.(q_proj|v_proj)\.linear"
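How the suggested pattern selects modules can be checked with plain `re.fullmatch`, which is how a string `target_modules` is generally understood to be applied to full module names in PEFT. The module names below are hypothetical and chosen only to exercise the pattern; real Gemma 4 module paths may differ.

```python
import re

# Pattern from the comment above.
pattern = r".*\.audio_tower.*\.(q_proj|v_proj)\.linear"

# Hypothetical fully qualified module names.
module_names = [
    "model.audio_tower.layers.0.self_attn.q_proj",          # the wrapper itself
    "model.audio_tower.layers.0.self_attn.q_proj.linear",   # inner nn.Linear
    "model.audio_tower.layers.0.self_attn.v_proj.linear",   # inner nn.Linear
    "model.audio_tower.layers.0.self_attn.k_proj.linear",   # not q_proj or v_proj
    "model.vision_tower.layers.0.self_attn.q_proj.linear",  # different tower
]

# fullmatch requires the whole name to match, so the trailing ".linear"
# excludes the wrapper and selects only its inner child.
matched = [name for name in module_names if re.fullmatch(pattern, name)]
```

Only the inner `.linear` children of `q_proj` and `v_proj` under `audio_tower` are selected, so PEFT would never be asked to wrap the unsupported outer module.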

shibizhao pushed a commit to shibizhao/unsloth-npu that referenced this pull request Apr 7, 2026
…fixes export) (unslothai#4807)

* fix: patch PEFT for Gemma4ClippableLinear in loader checkpoint path

The same Gemma4ClippableLinear monkey-patch that exists in vision.py
for training is needed in loader.py for loading existing checkpoints
(used by export and inference).

Gemma4ClippableLinear wraps nn.Linear but does not subclass it, so
PEFT's LoRA injection fails with "Target module not supported".
The patch redirects PEFT to target the inner .linear child instead.

Applied only to the vision model PeftModel.from_pretrained path.
Temporary fix until PEFT adds native support (peft#3129).

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix: wrap ClippableLinear patch in try/finally to always restore

Ensures _create_and_replace is restored even if PeftModel.from_pretrained
raises, preventing leaked global state across subsequent model loads.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
danielhanchen added a commit that referenced this pull request May 9, 2026
Extend the cross-version compat canary to catch ~80% of upstream
drift before a user hits it. Static checks only (GitHub raw fetch +
grep), CPU-only, runs PR-time + daily cron. 906 pass, 73 skipped.

TRL coverage extended:
- TRL_TAGS expanded from 12 to 28 (every stable release >=0.18.2,
  including the broken 0.19.0, plus main). Anchors: 0.22.2 / 0.27.1
  / 1.0.0 marked.
- Fix `__version__` parser to handle the TRL 0.22.x pattern
  (`__version__ = f.read()` from sibling VERSION file).
- Fix `has_def` in _fetch.py to allow indented matches so class
  methods are detected (the original anchored ^def only matched
  module-scope definitions).
- New tests for symbols the audit found we touch but didn't check:
  is_conversational, sft_trainer module + neftune_post_forward_hook,
  dpo_trainer module + MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES,
  trl.trainer.utils.ConstantLengthDataset (gated),
  trl.models.utils.disable_gradient_checkpointing (gated >=1.0.0),
  trl.import_utils + _*_available cache pattern,
  trl.experimental.openenv.utils generators (one of two names),
  GRPOTrainer required methods (_prepare_inputs,
  _generate_and_score_completions, compute_loss; per-token-logps
  legacy/new dispatch), GRPOTrainer source must contain
  torch.inference_mode + accelerator.unwrap_model fingerprints,
  KTOTrainer.get_batch_logps (now lives at trl.experimental.kto
  on TRL 0.27+ — accept either path),
  SFTTrainer class existence, DPOTrainer methods (informational),
  chat-template propagation (legacy maybe_apply_chat_template OR
  successor apply_chat_template + chat_template_kwargs),
  truncate_with_protected_tokens informational.
- Tighten test_unwrap_model_for_generation_either_path to mirror
  the prod fallback exactly (drop unused trl/extras/profiling.py
  candidate).
- Replace test_trl_generation_vllm_generation_gated symbol set with
  the actual unsloth dependency (VLLMGeneration class + _init_vllm
  / sync_weights / generate methods, not VLLMClient/etc).

PEFT coverage extended (driven by the 8 PR audit unsloth#5015,
#5167, #5036, #4807 + unsloth-zoo#618, #596, #482, #430):
- VARIANT_KWARG_KEYS const (peft 0.18+; injected by zoo#430)
- ParamWrapper class + members (peft 0.18+; needed by zoo#618)
- LoraConfig.target_parameters (peft 0.19+)
- LoraModel._create_and_replace (signature pin for unsloth#4807)
- transformers_weight_conversion module + build_peft_weight_mapping
  (unsloth#5167 wraps this)
- integrations.dequantize_module_weight (3 callsites)
- PeftType.LORA (vllm_utils.py:2520)
- ModulesToSaveWrapper (both peft.utils.* paths)
- PeftModel.from_pretrained method exists
- peft.__version__ parseable

Transformers coverage added (driven by the 16-PR audit):
- New file test_transformers_pinned_symbols.py with 19 test
  categories x 12 transformers tags (4.57.6 floor + 5.0..5.8 + main).
  Anchors: 4.57.6 + 5.5.0.
- Trainer surface (compute_loss num_items_in_batch param,
  training_step grad-accum fingerprints, get_batch_samples
  num_items contract, inner_training_loop _tr_loss inplace v5)
- modeling_utils.checkpoint alias for unsloth-zoo#549
- PushToHubMixin._create_repo presence (unsloth-zoo#393)
- integrations.bitsandbytes module + Linear4bit reference
- quantizers.should_convert_module signature (zoo#491/#488)
- FP8Linear bias/has_bias rename (zoo#572)
- processing_utils.Unpack importable (zoo#583/584)
- gemma3 Gemma3Attention class + gpt_oss GptOssModel class
- auto_factory _LazyAutoMapping private API (unsloth#5155)
- configuration_utils PretrainedConfig/PreTrainedConfig alias
- tokenization_utils_base.apply_chat_template
- modeling_attn_mask_utils symbols
- cache_utils Cache + DynamicCache classes
- training_args.ParallelMode importable

Wire the new transformers job into version-compat-ci.yml (matrix
of 5 PR-time symbol jobs + zoo-imports under spoof + daily fresh-
fetch cron).

Local smoke: 906 pass, 73 skipped (gated optional features) across
vLLM + TRL + PEFT + ST + bnb + transformers suites.
jayshah5696 added a commit to jayshah5696/humanize-rl that referenced this pull request May 25, 2026
Modal run ap-j2YVBnp42L3qIxNFC6vh4c failed at _attach_lora with
'Target module Gemma4ClippableLinear is not supported'.

Root cause: Gemma 4 wraps every nn.Linear projection in a
Gemma4ClippableLinear for activation clamping. The wrapper does NOT
subclass nn.Linear (transformers PR #45388 was closed, not merged),
so PEFT's exact-class-match for LoRA targets rejects it.

Fix: lift the recipe from unslothai/unsloth#4807 — monkey-patch
LoraModel._create_and_replace to recurse into target.linear whenever
the target is a Gemma4ClippableLinear instance. Restored after
get_peft_model returns.

Bug A mirror confirmed working on the failed run:
  final_logit_softcapping=30.0 in stdout.
Version pins also confirmed: trl=1.1.0, transformers=5.9.0, vllm=0.20.2.