[tests] Review tests for PR #679 by danielhanchen · Pull Request #24 · shimmyshimmer/unsloth-zoo-staging-2

danielhanchen · 2026-05-24T14:56:03Z

Automated test files from review process

Fixes a "The size of tensor a (N) must match the size of tensor b (M)" RuntimeError in gpt-oss inference when generation crosses the sliding window (e.g. short prompt with max_new_tokens that pushes the sequence past 128 tokens) on transformers 5.x running below its required torch (< 2.11). In that case the KV cache returns more positions for a full-attention layer than the causal mask covers (pre-allocated cache slots), so the eager path's `attn_weights += attention_mask` crashes. The surplus key positions are masked out anyway, so attend only the overlap: trim key/value (or the mask) to the shorter length in both eager attention variants. This keeps inference correct and shape consistent across transformers 4.57.x / 5.5.x and torch 2.9 - 2.11. Also drop the dead singular `past_key_value` cache_position-free forward variant: the singular naming predates transformers dropping `cache_position`, so that signature combination does not exist in any release. Only the `past_key_values` variant is reachable. Verified on gpt-oss-20b across the full matrix (transformers {4.57.6, 5.5.0} x torch {2.9.1, 2.10.0, 2.11.0}): all reasoning efforts generate coherent output, and greedy continuations match across torch versions for a given transformers version.

…metadata

Reload + metadata follow-ups from review feedback: 1. loader._apply_lora_at_paths() now also recreates LoRASwitchLinear for SwitchLinear / QuantizedSwitchLinear and LoRAEmbedding for Embedding / QuantizedEmbedding. The previous version saved switch rank metadata but the reload helper only wrapped Linear, so MoE adapter weights were silently dropped at load time. Switch/embedding imports are wrapped in try/except for older mlx-lm. 2. utils._get_mlx_dropout_probability() now reads MLX _p_1 keep-prob first and falls back to .p. Compat shims that expose both (.p=0.0 default plus a real _p_1) previously wrote stale dropout=0.0. 3. utils._infer_mlx_lora_rank() returns None instead of falling back to lora_a.shape[-1] when lora_a and lora_b disagree on the rank dimension. The previous fallback wrote the input dim as rank for partially materialized or non-LoRA shapes. 4. utils._enrich_mlx_adapter_config() only infers rank/scale/dropout from modules that appear in the caller-provided unsloth_mlx_lora_module_paths set. Previously an unrelated earlier LoRA in named_modules() would write the wrong language-tower params when the caller selected a vision/projector path. Also distinguishes "caller passed nothing" from "caller passed [] or None" so explicit empty values are preserved. 5. utils._enrich_mlx_adapter_config() now fills num_layers (derived via _get_transformer_layers) and fine_tune_type when callers invoke save_lora_adapters() without a trainer-built config. mlx-lm load_adapters() dereferences config.num_layers and previously crashed with AttributeError on these direct-save artifacts.

The 3D rank-consistency check in _infer_mlx_lora_rank() used lora_b_shape[-2] which is the out_features axis. mlx-lm's LoRASwitchLinear stores lora_b as (num_experts, out_dims, rank) so rank lives on lora_b_shape[-1]. Using [-2] caused valid MoE adapters to be rejected as contradicted and rank to be left unrecorded in adapter_config.json.

- _enrich_mlx_adapter_config: an explicit empty unsloth_mlx_lora_module_paths list now preserves the caller topology but no longer suppresses live rank/scale/dropout inference (treat empty as no filter, with a fallback module captured before filtering). - MLXTrainer.save_model: skip lora_a-bearing modules whose rank cannot be inferred instead of silently falling back to rank 8 while still copying the bad module scale and dropout. - _apply_lora_at_paths: when from_base does not accept scale/dropout kwargs on older mlx-lm, retry with r=rank alone and restore the saved scale on the wrapped layer (scale is a Python attribute, not loaded by load_weights).

- _enrich_mlx_adapter_config: capture the fallback rank/scale/dropout only from modules that pass the explicit_path_set filter, so an explicit caller-supplied path list whose modules are not inferable no longer borrows metadata from unrelated LoRA layers. Auto-discovery and empty explicit-list paths still benefit from the fallback unchanged. - _apply_lora_at_paths: when from_base() cannot accept the dropout kwarg on older mlx-lm, patch wrapped.dropout._p_1 (or .p on shims) after the fallback so the saved adapter dropout is faithfully restored alongside the scale that was already being patched.

- _infer_mlx_lora_rank: replace getattr(lora_a, 'shape', ()) with explicit None and hasattr guards so a None tensor attribute can no longer raise AttributeError on callers that share this helper outside the existing hasattr-gated sites. - _enrich_mlx_adapter_config: remove fallback_rank/scale/dropout variables and the post-loop rescue. After the explicit-path filter was moved ahead of fallback capture, both fallback_* and lora_rank are assigned from the same filtered module pool, making the rescue unreachable. The scoping decision is intentional: an explicit-filtered save must not borrow metadata from unselected modules.

danielhanchen · 2026-05-24T15:45:20Z

Fixes pushed to unslothai#679.

Lyxot and others added 14 commits May 20, 2026 01:31

fix: persist MLX LoRA adapter metadata

9f155fc

fix: preserve MLX LoRA dropout metadata

a0bce35

fix: sync MLX LoRA adapter config fields

f0dd900

fix: prefer live MLX LoRA metadata

fc12979

fix: handle MLX adapter reload edge cases

9f3d11c

Merge remote-tracking branch 'origin/main' into fix/mlx-lora-adapter-…

aa78390

…metadata

Scrub .github/workflows for staging push (matches staging base)

d5e4436

Add MLX LoRA adapter metadata persistence test coverage

470d5e9

danielhanchen closed this May 24, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[tests] Review tests for PR #679#24

[tests] Review tests for PR #679#24
danielhanchen wants to merge 14 commits into
mainfrom
pr-679-tests

danielhanchen commented May 24, 2026

Uh oh!

danielhanchen commented May 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

danielhanchen commented May 24, 2026

Uh oh!

danielhanchen commented May 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants