fix: 3 patch_* helpers - fast_lora import, sft_trainer Union, openenv OSError #5319
patch_fast_lora has referenced an unbound `fast_lora_forward` since ddf118a (2024-11-21). The function is defined at unsloth/kernels/fast_lora.py:652 and re-exported through unsloth/kernels/__init__.py:45, but it was never imported into unsloth/models/_utils.py, so calling patch_fast_lora() raises NameError: name 'fast_lora_forward' is not defined. The bug went unnoticed because no production code path calls patch_fast_lora() unconditionally. Surfaced by a new CPU-CI check that invokes every zero-arg patch_* helper across unsloth + unsloth_zoo (consolidated-tests-ci.yml on PR #5312). Importing inside the function (rather than at module top) keeps the import surface narrow and avoids a circular-import risk if unsloth.kernels.fast_lora ever needs to import from unsloth.models._utils.
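The failure mode and the function-local import fix described above can be reproduced in isolation. This is a minimal sketch, not the unsloth code: `math.isqrt` stands in for `fast_lora_forward`, and `patch_broken`, `patch_fixed`, and `call_safely` are hypothetical names introduced only for illustration.

```python
def patch_broken():
    # `isqrt` is never imported in this module. Python only looks the name up
    # when the function body runs, so the bug stays hidden until the first call.
    return isqrt(16)


def patch_fixed():
    # Function-local import (stand-in for
    # `from ..kernels.fast_lora import fast_lora_forward`): it is resolved on
    # the first call, not when the enclosing module is imported.
    from math import isqrt
    return isqrt(16)


def call_safely(fn):
    """Call a zero-arg helper and report either its result or the NameError."""
    try:
        return ("ok", fn())
    except NameError as exc:
        return ("NameError", str(exc))


broken_result = call_safely(patch_broken)
fixed_result = call_safely(patch_fixed)
print(broken_result)
print(fixed_result)
```

Because the lookup happens at call time, a check that merely imports the module would not catch this class of bug; only invoking the helper does, which matches how the CI check described above found it.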
…mespace patch_sft_trainer_tokenizer rewrites the source of TRL's SFTTrainer methods (_prepare_non_packed_dataloader, _prepare_dataset) and re-execs them. With TRL 1.x, those methods carry `Union[...]` type hints in their signatures. The current rewrite only injects identifiers found by `dir(trl.trainer.sft_trainer)` into the exec namespace, which does not include `Union`, so exec(function, ...) raises NameError: name 'Union' is not defined. Fix: import Union, Optional, List, Any, Callable, Tuple, Dict, Iterator inside the function. exec receives `locals()` as its globals dict, so those names are visible to the executed source body. Same pattern as unsloth/models/_utils.py:patch_linear_scaling, which already injects `from typing import Union, Optional, List, Any, Callable, Tuple` into its own exec_code. Surfaced by the consolidated CPU-CI runtime patch_* check on PR #5312 in the matrix cell `transformers>=5,<6 + trl>=1,<2`.
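A small stand-alone sketch of why the exec namespace matters, assuming nothing about TRL itself: `SOURCE`, `prepare`, `define_and_resolve`, and the two wrapper functions are hypothetical stand-ins for the rewritten method source. The sketch copies `locals()` into a fresh dict rather than passing it directly, purely to keep the example independent of interpreter version.

```python
SOURCE = '''
def prepare(dataset: Union[list, tuple], limit: Optional[int] = None):
    return list(dataset)[:limit]
'''


def define_and_resolve(src, namespace):
    """exec the source, then touch the annotations so that the type hints are
    resolved even on interpreters that evaluate annotations lazily."""
    exec(src, namespace)
    fn = namespace["prepare"]
    fn.__annotations__
    return fn


def exec_without_typing(src):
    # Namespace has no typing names, so `Union` in the signature is unresolved.
    try:
        define_and_resolve(src, {})
    except NameError as exc:
        return str(exc)
    return None


def exec_with_typing(src):
    # Importing inside the function puts the typing names into the function's
    # locals, which are then handed to exec as the globals of the new code.
    from typing import Union, Optional, List, Any, Callable, Tuple, Dict, Iterator
    namespace = dict(locals())
    return define_and_resolve(src, namespace)


error_message = exec_without_typing(SOURCE)
prepare = exec_with_typing(SOURCE)
print(error_message)
print(prepare([1, 2, 3], 2))
```

Any name that can appear in a re-exec'd signature has to be present in that namespace, which is why the fix imports a broader set of `typing` names than the one that actually failed.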
…etsource
TRL 0.29.1 and the 1.x line ship some openenv helpers as compiled
bytecode without accessible source on disk. inspect.getsource(patch_target)
raises OSError("could not get source code") in that case, which surfaces
as a hard failure in patch_trl_openenv() and aborts the rest of the
RL_ADDITIONAL_FUNCTIONS["openenv"] iteration.
Wrap the getsource call in a try/except OSError and log a warning
instead. The wake_up(tags=...) rewrite is the only thing skipped; the
core weight-reload patch path stays functional.
Surfaced by the consolidated CPU-CI runtime patch_* check on PR #5312
in matrix cells running TRL 0.29.1 (latest <1.0.0) and TRL 1.3.0
(latest 1.x). The pyproject pin (TRL 0.18.2-0.24.0) still gets source
for this function so the original code path runs unchanged there.
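The guard described above can be sketched as follows. This is an illustration under stated assumptions, not the code in `rl_replacements.py`: `get_source_or_none`, `make_sourceless_function`, and the logger name are hypothetical, and a function created through `exec` is used as a stand-in for a helper that ships without source on disk.

```python
import inspect
import logging

logger = logging.getLogger("openenv_patch_sketch")  # hypothetical logger name


def get_source_or_none(patch_target):
    """Return the source of patch_target, or None if it cannot be retrieved."""
    try:
        return inspect.getsource(patch_target)
    except OSError as exc:
        # Source is unavailable: warn and let the caller skip the
        # source-rewrite step instead of aborting the whole patch loop.
        logger.warning("Skipping source rewrite for %r: %s", patch_target, exc)
        return None


def make_sourceless_function():
    # A function defined via exec has no backing file, so inspect.getsource
    # raises OSError for it.
    namespace = {"__name__": "sourceless_demo"}
    exec("def helper():\n    return 'reloaded'\n", namespace)
    return namespace["helper"]


sourceless = make_sourceless_function()
source_text = get_source_or_none(sourceless)
print(source_text)    # None: the rewrite would be skipped
print(sourceless())   # the function itself is still callable
```

Returning `None` rather than re-raising lets the caller decide which dependent steps to skip, so one unreadable helper does not stop the remaining entries from being processed.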
Code Review
This pull request introduces several fixes to improve compatibility with newer versions of TRL. Key changes include adding a missing import for fast_lora_forward, implementing error handling for inspect.getsource to prevent crashes when source code is unavailable in compiled bytecode, and including necessary typing imports to avoid NameError during dynamic code execution in patch_sft_trainer_tokenizer. I have no feedback to provide.
… job-level continue-on-error
Two cleanups derived from review of the matrix output:
1. Skip false-positive zero-arg patches in the runtime ledger.
   Three patches have all-defaulted signatures but require either
   runtime args or real CUDA, so calling them in isolation produces
   a meaningless failure:
   - patch_linear_scaling: defaults are None placeholders;
     body starts with `assert rope_module is not None` etc.
   - patch_llama_rope_scaling: same shape.
   - patch_unsloth_smart_gradient_checkpointing: legitimately
     allocates CUDA tensors via aten::empty.memory_format inside
     initialize_unsloth_gradient_checkpointing(); the torch.cuda.*
     Python spoof can't intercept that at the dispatcher level.
   Add NEEDS_PRECONDITION = {...} to the shim and skip those by name.
   Symbol presence is still verified via REQUIRED.
2. Drop the job-level `continue-on-error: true`.
   Previously the cell reported SUCCESS even when steps failed, which
   made the PR check UI lie. Real failures now turn the cell red.
   Per-step `continue-on-error: true` stays so a single failed step
   does not cascade and skip the rest of the ledger.
Three other failures the matrix surfaced are addressed by separate PRs
to source:
- #5319 (patch_fast_lora missing import,
patch_sft_trainer_tokenizer Union NameError, openenv OSError)
- unslothai/unsloth-zoo#628 (skip MoE coverage on older transformers)
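The ledger behaviour described in this commit (verify presence via REQUIRED, skip NEEDS_PRECONDITION by name, record everything else that raises) might look roughly like the sketch below. The real shim is not shown in this PR, so `run_patch_ledger`, the result layout, and the stand-in helper bodies are all assumptions made for illustration.

```python
def run_patch_ledger(helpers, required, needs_precondition):
    """Call every zero-arg helper, skipping those that need real arguments
    or hardware, and report which names are missing, ok, skipped, or failed."""
    missing = sorted(set(required) - set(helpers))
    ok, skipped, fail = [], [], []
    for name in sorted(helpers):
        if name in needs_precondition:
            skipped.append(name)  # presence is still covered by `required`
            continue
        try:
            helpers[name]()
        except Exception as exc:
            fail.append((name, f"{type(exc).__name__}: {exc}"))
        else:
            ok.append(name)
    return {"missing": missing, "ok": ok, "skipped": skipped, "fail": fail}


# Hypothetical stand-ins for real patch_* helpers.
def patch_noop():
    return None


def patch_linear_scaling(rope_module=None):
    # All-defaulted signature, but the default is only a placeholder.
    assert rope_module is not None


def patch_unbound_name():
    return undefined_helper  # raises NameError when called


HELPERS = {
    "patch_noop": patch_noop,
    "patch_linear_scaling": patch_linear_scaling,
    "patch_unbound_name": patch_unbound_name,
}
REQUIRED = {"patch_noop", "patch_linear_scaling"}
NEEDS_PRECONDITION = {"patch_linear_scaling"}

ledger = run_patch_ledger(HELPERS, REQUIRED, NEEDS_PRECONDITION)
print(ledger)
```

Keeping the skip list separate from the presence check means a skipped helper that gets renamed or deleted would still show up under `missing`.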
Code Review
This pull request improves compatibility with newer TRL versions by adding a missing import in the fast LoRA patch, handling cases where source code is unavailable for inspection in RL replacements, and including necessary typing imports to prevent NameErrors during dynamic execution in tokenizer utilities. I have no feedback to provide.
…dated CI cells exercise the fixed patches under continue-on-error=false.
Now that the upstream patch fixes have landed (#5319 for the three patch_* helpers, unsloth-zoo#628 for the MoE coverage canary), every observed cell-level red is accounted for by one of those two causes. Both are fixed, so re-run the matrix in strict mode:
- Removed every per-step `continue-on-error: true`. A failing test step now fails the cell, so cells no longer report green while printing failures.
- Runtime patch ledger: previously `assert REQUIRED helpers exist by name` (an inventory walk). Now also `assert len(fail) == 0`, so any zero-arg patch that raises is treated as a real regression. NEEDS_PRECONDITION still skips the three patches that legitimately need real CUDA / runtime args.
- patch_tiled_mlp shim: bumped seq_len from 4 to 192 with hidden=64 so divmod(192, 64) = (3, 0) and the tiled path actually runs 3 shards instead of degenerating to n_shards=1 (which is bit-exact and only confirms that patching installed something). Added an explicit pre-assertion that the test exercises the multi-shard path.
- openenv graceful-skip warning: the previous text said "Weight reload still functional", which over-promised. Replaced with the literal consequence: duplicate `collective_rpc("reload_weights")` is not stripped and `wake_up(tags=["kv_cache"])` is not retagged. Most users are unaffected; openenv GRPO users on this TRL build may see redundant reload_weights or partial wake_up.
Includes a merge of main into this branch so the consolidated cells pip-install the post-#5319 unsloth tree.
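The shard arithmetic behind the seq_len change can be checked with a short sketch. It assumes the shard count is derived as `max(1, seq_len // hidden)`, which is inferred from the `divmod` reasoning above and not confirmed against the patch_tiled_mlp source; `shard_plan`, `tiled_apply`, and `toy_mlp` are hypothetical names.

```python
def shard_plan(seq_len, hidden):
    """Assumed rule: one shard per full `hidden`-sized slice, never fewer than one."""
    n_full, remainder = divmod(seq_len, hidden)
    return max(1, n_full), remainder


def tiled_apply(fn, rows, n_shards):
    """Apply fn shard by shard and concatenate the results."""
    shard_size = -(-len(rows) // n_shards)  # ceiling division
    out = []
    for start in range(0, len(rows), shard_size):
        out.extend(fn(rows[start:start + shard_size]))
    return out


def toy_mlp(chunk):
    # Row-wise stand-in for an MLP: each output depends only on its own row.
    return [2 * x + 1 for x in chunk]


old_plan = shard_plan(4, 64)     # degenerate case: a single shard
new_plan = shard_plan(192, 64)   # three full shards, no remainder

n_shards, _ = new_plan
assert n_shards > 1, "test must exercise the multi-shard path"

rows = list(range(192))
tiled = tiled_apply(toy_mlp, rows, n_shards)
untiled = toy_mlp(rows)
print(old_plan, new_plan, tiled == untiled)
```

With a single shard the tiled and untiled calls are the same computation, so equality proves little; the pre-assertion on `n_shards` is what makes the comparison meaningful.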
Summary
Three independent fixes for `patch_*` helpers, all surfaced by the new consolidated CPU-CI matrix on #5312 that invokes every zero-arg `patch_*` across `unsloth` + `unsloth_zoo@main` against three `(transformers, trl)` combinations.

1. `patch_fast_lora`: `NameError: name 'fast_lora_forward' is not defined`

`unsloth/models/_utils.py:patch_fast_lora` references `fast_lora_forward` as if it were imported, but the import was never added. Bug present since `ddf118a8f` (2024-11-21). Function is defined at `unsloth/kernels/fast_lora.py:652` and re-exported through `unsloth/kernels/__init__.py:45`.

Fix: `from ..kernels.fast_lora import fast_lora_forward` inside the function body. Importing inside (rather than at module top) keeps the import surface narrow and avoids a circular-import risk.

2. `patch_sft_trainer_tokenizer`: `NameError: name 'Union' is not defined`

`unsloth/tokenizer_utils.py:patch_sft_trainer_tokenizer` rewrites the source of TRL's `SFTTrainer._prepare_non_packed_dataloader` and `_prepare_dataset` methods and re-execs them. With TRL 1.x those methods carry `Union[...]` type hints in their signatures. The current rewrite only injects identifiers found by `dir(trl.trainer.sft_trainer)`, which does not include `Union`, so `exec(function, ...)` raises `NameError`.

Fix: import `Union, Optional, List, Any, Callable, Tuple, Dict, Iterator` inside the function. `exec` receives `locals()` as its globals, so those names are visible to the executed source body. Same pattern as `unsloth/models/_utils.py:patch_linear_scaling`'s `exec_code`.

3. `openenv_vllm_reload_weights`: `OSError: could not get source code`

`unsloth/models/rl_replacements.py:openenv_vllm_reload_weights` calls `inspect.getsource(patch_target)` on a TRL openenv helper. TRL 0.29.1+ and the 1.x line ship some openenv helpers as compiled bytecode without accessible source on disk; `inspect.getsource` raises OSError. The unguarded call surfaces as a hard failure inside `patch_trl_openenv()` and aborts the rest of the `RL_ADDITIONAL_FUNCTIONS["openenv"]` iteration.

Fix: wrap the `getsource` call in a try/except OSError and log a warning. Only the `wake_up(tags=...)` rewrite is skipped; the core weight-reload patch path stays functional. With TRL 0.18.2-0.24.0 (the range pyproject.toml currently pins) source is still accessible, so the original code path runs unchanged.
Test plan
- `patch_fast_lora()` → returns without exception (import resolves).
- `patch_sft_trainer_tokenizer()` → returns without `NameError`.
- `openenv_vllm_reload_weights()` on TRL 0.29.1 → logs the warning and returns gracefully instead of raising.
- `Consolidated CPU tests` matrix cells on #5312 (CI: scope GITHUB_TOKEN permissions, add MLX CI, unblock ~60 skipped tests) should drop these three from the failure ledger.