tests: contain security-conftest network block; fix stale mlx paths; skip GPU import in trainer-exec-marker by danielhanchen · Pull Request #648 · unslothai/unsloth-zoo

danielhanchen · 2026-05-15T03:32:01Z

Summary

Three CI harness bugs surfaced once the unsloth-side matrix started running zoo's full pytest tree on every PR (the unsloth_zoo @ main - full pytest (CPU) step on unslothai/unsloth PR #5428). They account for ~50 of the 61 failures on that step. Pure harness fixes, no production code touched.

1. `tests/security/conftest.py` - contain the network-block scope

The network_blocker fixture was scope="session" + autouse=True and monkey-patched the module-global socket.socket. Once any security test ran, the patch stayed live for every subsequent test in the same pytest session, silently breaking every non-security test that needed the network. test_upstream_pinned_symbols_transformers alone has ~40 parametrised cases that fetch HF modeling source over HTTPS; each one failed with network access blocked by tests/security/conftest.py.

Drop the scope to function: setup/teardown happens per-test, cost is ~10us, and the blast radius is contained to the security tree only.

2. `tests/test_upstream_pinned_symbols_accelerator.py` - fix stale MLX paths

Two tests still read unsloth_zoo/mlx_trainer.py and unsloth_zoo/mlx_loader.py as flat modules. They were promoted to the mlx/ subpackage in e6d8f7f (mlx/trainer.py, mlx/loader.py); the tests have been failing with FileNotFoundError ever since. Accept either layout so the test survives the rename and still pins the deprecated-API regression both originally guarded.

3. `tests/test_compiler_rewriter_exhaustive.py::test_unsloth_trainer_exec_marker` - handle no-accelerator import

Did pytest.importorskip("unsloth") then a plain import unsloth.trainer. On a CPU-only CI runner without zoo's GPU-spoof harness applied, import unsloth raises NotImplementedError("Unsloth cannot find any torch accelerator? You need a GPU.") -- neither importorskip nor the existing ImportError branch handle that. Convert that exact exception to pytest.skip; the GPU cell still exercises the real import.

Verification

Single pytest session interleaving security + previously-blocked transformers tests:

python -m pytest tests/security \
  'tests/test_upstream_pinned_symbols_transformers.py::test_gemma3_modeling_required_classes[v5.5.0]' \
  'tests/test_upstream_pinned_symbols_transformers.py::test_ministral_attention_module_present[v5.5.0]' \
  -v
# 17 passed in 0.90s

Pre-fix the two transformers tests failed with the leaked network block; post-fix they pass.

Out of scope

This PR does not touch the remaining ~10 failures in the same CI step that are legitimate DRIFT DETECTED signals from test_temporary_patches_exhaustive.py and test_compiler_rewriter_exhaustive.py firing on transformers 5.5.0 (the pyproject's <=5.5.0 cap floats default to 5.5.0). Those need either zoo-side patch updates for 5.5.0 compat or a narrower pyproject upper bound -- the drift signal itself is working as designed and shouldn't be muted.

Test plan

python -m pytest tests/security -q -- 15 passed
python -m pytest tests/test_upstream_pinned_symbols_accelerator.py::test_mlx_trainer_uses_modern_memory_apis_only tests/test_upstream_pinned_symbols_accelerator.py::test_get_existing_mlx_quantization_detects_both_keys -v -- 2 passed
Single-session interleave above -- 17 passed

…+ skip GPU import in trainer-exec-marker Three CI harness bugs surfaced once the unsloth-side matrix started running zoo's full pytest tree on every PR (the "unsloth_zoo @ main - full pytest (CPU)" step on unslothai/unsloth PR #5428). 1. tests/security/conftest.py The network_blocker fixture was scope="session" + autouse=True and monkey-patched the module-global socket.socket. Once any security test ran, the patch stayed live for every subsequent test in the same pytest session - silently breaking every non-security test that needed the network (test_upstream_pinned_symbols_transformers alone has ~40 parametrised cases that fetch HF modeling source over HTTPS). Drop the scope to "function" so setup/teardown happens per-test; cost is ~10us and blast radius is contained to the security tree only. 2. tests/test_upstream_pinned_symbols_accelerator.py Two tests read unsloth_zoo/mlx_trainer.py and unsloth_zoo/mlx_loader.py as flat modules. They were promoted to the mlx/ subpackage in e6d8f7f (mlx/trainer.py, mlx/loader.py); the tests have been failing with FileNotFoundError ever since. Accept either layout so the test survives the rename and still pins the deprecated-API regression both originally guarded. 3. tests/test_compiler_rewriter_exhaustive.py::test_unsloth_trainer_exec_marker Did pytest.importorskip("unsloth") then a plain import unsloth.trainer. On a CPU-only CI runner without zoo's GPU-spoof harness applied, import unsloth raises NotImplementedError("Unsloth cannot find any torch accelerator? You need a GPU.") - which neither importorskip nor the existing ImportError branch handle. Convert that exact exception to a pytest.skip; the GPU cell still exercises the real import. Local validation (single pytest session, proves the leak is gone): python -m pytest tests/security \ 'tests/test_upstream_pinned_symbols_transformers.py::test_gemma3_modeling_required_classes[v5.5.0]' \ 'tests/test_upstream_pinned_symbols_transformers.py::test_ministral_attention_module_present[v5.5.0]' \ -v -> 17 passed in 0.90s

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2eb71f12b5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-15T03:33:16Z

+    except NotImplementedError as e:
+        if "accelerator" in str(e) or "GPU" in str(e):
+            pytest.skip(f"No accelerator visible to unsloth import: {e}")


Move no-accelerator catch around importorskip

In the CPU-only CI case described here, import unsloth is executed by pytest.importorskip("unsloth") before this try block, and pytest only converts import failures into skips. When unsloth is installed but raises NotImplementedError because no accelerator is visible, the exception escapes before this new handler runs, so the test still fails instead of skipping; wrap the importorskip call or perform the initial import inside this handler.

Useful? React with 👍 / 👎.

gemini-code-assist

Code Review

This pull request fixes a regression where session-scoped network blockers interfered with other tests and updates MLX tests to support a new subpackage layout. It also introduces logic to skip unsloth-related tests on CPU-only runners. Feedback was provided regarding the placement of pytest.importorskip in tests/test_compiler_rewriter_exhaustive.py, as it currently sits outside the try block and will cause the test to crash before the skip logic can execute.

gemini-code-assist · 2026-05-15T03:37:54Z

    pytest.importorskip("unsloth")
    try:
        import unsloth.trainer as trainer_mod


The pytest.importorskip("unsloth") call at line 2045 will trigger the top-level import of the unsloth package. On CPU-only environments without a GPU-spoofing harness, this import raises the NotImplementedError that this PR intends to handle. Since this call is currently outside the try...except block, the test will still crash with an unhandled exception before reaching the skip logic. Moving the importorskip call inside the try block ensures the exception is caught and the test is skipped as intended.

Suggested change

pytest.importorskip("unsloth")

try:

import unsloth.trainer as trainer_mod

try:

pytest.importorskip("unsloth")

import unsloth.trainer as trainer_mod

…ortorskip path Two follow-up fixes after the first push: 1. .github/workflows/consolidated-tests-ci.yml The Core matrix step installed torch but not torchvision. transformers chains torchvision in at module top for the qwen2_vl / qwen2_5_vl image processors, so the drift-existence probe in test_zoo_source_upstream_refs fired RED on "ModuleNotFoundError: No module named 'torchvision'". The drift message itself says "Either install the dep in CI or remove the zoo reference" -- installing is the right move since zoo legitimately references the image processor. CPU build only. 2. tests/test_compiler_rewriter_exhaustive.py::test_unsloth_trainer_exec_marker First pass wrapped the post-importorskip `import unsloth.trainer` call. But unsloth.__init__ raises NotImplementedError at line 144 (`from ._gpu_init import *`), which means importorskip itself propagates the exception -- it only converts ImportError to skip. Move both `import unsloth` and `import unsloth.trainer` inside the try-block so the NotImplementedError branch catches both. Local verification: python -m pytest tests/test_compiler_rewriter_exhaustive.py::test_unsloth_trainer_exec_marker -v -> 1 passed

* fix CI fallout from MLX subpackage refactor (#634) Three small fixes triggered by tests on PRs #645 / unsloth#5426: 1) tests/test_upstream_pinned_symbols_accelerator.py read the legacy flat unsloth_zoo/mlx_trainer.py and mlx_loader.py paths directly. #634 moved those to unsloth_zoo/mlx/trainer.py and mlx/loader.py; point the tests at the new paths so they exercise the post-refactor sources. 2) unsloth_zoo/vision_utils.py:68 did an unconditional top-level import torchvision while torchvision is not a declared dependency. On a CPU-only install that lacks torchvision (e.g. zoo's Core CI matrix), every import of unsloth_zoo.vision_utils raises ModuleNotFoundError. Guard the import so the module loads; existing _read_video_torchvision call sites already only fire when video reading is requested. 3) tests/test_zoo_source_upstream_refs.py tries to resolve transformers qwen2_vl / qwen2_5_vl image-processing modules, both of which transitively `import torchvision`. Skip via pytest.importorskip when torchvision is not installed instead of failing the run. Each fix is independent and minimal; no behavior change for installs that have torchvision (the eager import path is identical), and no behavior change for the MLX trainer / loader (the tests still pin the exact same source-level invariants on the new file paths). * tests/conftest: also stub unsloth.device_type on CPU-only runners The post-#634 conftest stubs ``unsloth_zoo.device_type`` so importing the package on a CPU-only CI runner doesn't trip ``get_device_type()``. ``test_unsloth_trainer_exec_marker`` (and any future test that does ``import unsloth.*``) goes one level up: it imports ``unsloth.trainer``, which traverses ``unsloth/__init__.py`` → ``_gpu_init`` → ``unsloth/ device_type.py:get_device_type()``, raising the same NotImplementedError zoo's stub was meant to suppress. Parameterise ``_preload_real_device_type(package, prereqs)`` so the same harness works for both packages, factor the fallback into ``_install_device_type_stub(name)``, and call both. unsloth's ``get_device_type()`` consumes ``unsloth_zoo`` helpers but has no intra-unsloth prereqs, hence ``prereqs=()`` there. * tests/conftest: don't stub unsloth.device_type after all The previous commit (196a78b) stubbed unsloth.device_type so test_unsloth_trainer_exec_marker could complete its ``import unsloth.trainer`` smoke. That makes ``import unsloth`` succeed on CPU-only CI, which then runs ``unsloth/_gpu_init.py:_patch_trl_trainer()`` and rebinds ``trl.trainer.sft_trainer.SFTTrainer`` / ``transformers.models.ministral.MinistralAttention`` to Unsloth's compiled wrappers. ``inspect.getsource(...)`` on those classes follows the wrapper's ``__init__.__code__.co_filename`` and returns the rewritten source -- which doesn't contain the literals the drift detectors anchor on (``self._signature_columns``, ``self._prepare_dataset(``, ``class SFTTrainer``, named ``hidden_states``/``attention_mask``). The HF=4.57.6 job went from -1 pre-existing failure (``test_unsloth_trainer_exec_marker``, also failing on main) to -4 new failures across ``test_MinistralAttention_forward_signature`` and the three ``test_unsloth_rl_trainer_*`` checks. That's a regression. Keep parameterised ``_preload_real_device_type`` and ``_install_device_type_stub`` -- they're harmless infrastructure and useful if a future test needs the unsloth.device_type stub specifically. Just don't fire the second preload at session setup. ``test_unsloth_trainer_exec_marker`` will continue to fail on CPU-only CI as it does on main; that's a separate pre-existing issue with ``unsloth/device_type.py`` raising on accelerator-less hosts and should be fixed upstream in unsloth itself, not papered over in zoo's conftest. * device_type: mirror UNSLOTH_ALLOW_CPU=1 gate from unsloth #5429 unsloth/_gpu_init.py:125 imports DEVICE_TYPE from unsloth_zoo, not local unsloth/device_type.py, so the env-var gate has to live here too for `import unsloth.trainer` to succeed on CPU-only CI. The companion unsloth PR landed as cb15a7a (#5429). Pairs with the conftest change in this branch that sets UNSLOTH_ALLOW_CPU=1 before triggering `import unsloth` from _apply_upstream_import_fixes_for_tests. @functools.cache on get_device_type plus the module-level DEVICE_TYPE = get_device_type() assignment mean the os.environ.get runs exactly once per process. Production hosts with a real accelerator hit the `cuda`/`xpu`/`hip` branches above and never reach the env-var short-circuit. * prod: graceful no-op for two transformers 5.x runtime bugs Two upstream changes broke zoo at runtime (not just CI drift detection): 1. transformers 5.x removed transformers.cache_utils.HybridCache. zoo's utils.py falls back to `HybridCache = typing.Any`, which made `gemma.py:260 isinstance(past_key_values, HybridCache)` raise TypeError ("isinstance() arg 2 must be a type, a tuple of types, or a union") at runtime on the gemma mask path. Fix: expose HAS_HYBRID_CACHE alongside the existing HybridCache fallback in utils.py, and gate the isinstance call in gemma.py:260 on the flag. When HybridCache is genuinely missing, the elif chain falls through to the dynamic-cache branch -- same observable behavior as the pre-5.x path that never matched the type. 2. transformers 5.x rewrote PreTrainedModel.save_pretrained. The source-string anchors zoo's LoRA-merge-on-save optimization patches in `merge_and_dequantize_lora` (state_dict_split assignment, `state_dict[tensor].contiguous()`, `def save_pretrained`, and the filename_to_tensors for-loop on the push_to_hub path) are all gone. Each downstream `raise RuntimeError("Failed to find ...")` would fire, killing `merge_and_dequantize_lora(push_to_hub=True)` on 5.x. Fix: scan all required anchors upfront in saving_utils.py. If any are missing on installed transformers, emit a one-time warnings.warn and fall back to vanilla `model.save_pretrained(...)`. The end-user sees the warning, no crash, and is told to call `model.merge_and_unload()` (or equivalent) before saving if they want merged LoRA weights on disk. The per-anchor RuntimeError raises downstream are kept as defense-in-depth for partial drift (one anchor renamed, others intact) that the upfront check might miss. The two corresponding drift detectors in tests/ are rewritten in the companion commit as positive assertions: 4.57.6 keeps the strict existence check; 5.x asserts the anchor is gone AND zoo's fallback covers it (`HAS_HYBRID_CACHE` flag / `_required_anchors` list). * tests: version-gate transformers-5.x drift detectors Sixteen drift detectors in the Core HF=default / HF=latest matrix slices were pointing at upstream symbols / source strings that transformers 5.x intentionally removed: * 9 `temporary_patches/*_for_causal_lm_forward_named_params` -- cache_position moved from named forward param into **kwargs: Unpack[TransformersKwargs] (DeepseekV3, GptOss, CsmDepth, Csm, Qwen3Moe, Qwen3Next, Qwen3VLMoe, MinistralModel, GraniteMoeHybrid). * 4 compiler/source rewriter probes -- output_attentions branching removed, MoE routing_weights cast refactored, Trainer is_torch_tpu_available/is_torch_xla_available both removed, _update_causal_mask hasattr probe all-False. * 1 routing_weights.to substring probe in compiler_rewriter_exhaustive. * 1 GptOssConfig dedent-compare -- rope_theta/initial_context_length/ rope_scaling collapsed into rope_parameters dict. * 1 GptOssConfig kwarg signature -- same rope_theta rename. For all 16, the zoo runtime patch already gracefully no-ops on 5.x (verified by the Explore audit: try/except + relaxed patch_function + hasattr guards + str.replace's silent no-op-on-miss). Skip the detector on 5.x with a clear "<symbol> removed on transformers {version}" message; keep it strict on 4.57.6 where real drift could still surface. Two more drift detectors -- test_hybrid_cache_class_present and the two saving_utils.py pinned-string tests -- become positive assertions: on 5.x they confirm zoo's prod fix correctly identifies the missing anchor (HAS_HYBRID_CACHE is False / `_required_anchors` covers the missing strings) and the graceful fallback is wired up. Conftest also sets UNSLOTH_ALLOW_CPU=1 before the existing `import unsloth` trigger, which unblocks `test_unsloth_trainer_exec_marker` on CPU-only CI runners (the companion unsloth PR #5429 lets that import succeed without an accelerator). * tests: also gate test_MinistralAttention_forward_signature on 5.x Missed this one in 18c9f62. Same root cause as MinistralModel: transformers 5.x reflowed MinistralAttention.forward, zoo's ``patch_function(..., match_level='relaxed')`` falls back to a ``(self, *args, **kwargs)`` wrapper, the named-param probe sees the wrapper signature and drift-fails. Runtime call still works because the wrapper forwards via kwargs. Skip on 5.x, keep strict on 4.57.6. * tests: probe Ministral stash + skip relaxed-wrapper case Zoo's ministral.py:94-96 wraps the actual implementation in a generic ``def forward(self, *args, **kwargs): return _full_forward(...)`` adapter before calling patch_function -- this lets `check_args_kwargs` accept params transformers 5.x removed (cache_position) by routing them through **kwargs. The side effect: `inspect.signature( MinistralAttention.forward)` after patching shows the wrapper, not the real named params. This test used to pass only because the test runner couldn't fully trigger zoo's TEMPORARY_PATCHES loop (`unsloth/models/_utils.py:580` runs at import time but unsloth import was being silently aborted). With UNSLOTH_ALLOW_CPU=1 now making `import unsloth` succeed, the patch fires correctly and the live signature is the wrapper. Fix: probe the `_original_modeling_ministral_MinistralAttention_forward` stash that `patch_function(store_original=True)` writes, falling back to the live attribute. If the live attr is the relaxed wrapper and there's no stash, the named-param probe isn't meaningful -- the runtime kwargs contract is enforced elsewhere. Skip with a clear message. 4.57.6: stash is present and the named-param probe still runs strict; real drift in the upstream signature still surfaces. * tests: gate qwen2_5_vl image-processor drift on 5.x transformers 5.x dropped the slow image processors entirely (image_processing_qwen2_5_vl.py and image_processing_qwen2_5_vl_fast.py are both gone in v5.5.0). Qwen2.5-VL now reuses Qwen2VLImageProcessor directly. zoo's misc.py:1500-1506 patch site for Qwen2_5_VLImageProcessor is try/except ImportError-wrapped, so it silently no-ops on 5.x and the runtime shim still fires via the Qwen2VLImageProcessor patch at misc.py:1485-1498 (the class Qwen2.5-VL inherits at runtime). The Qwen2.5-VL image-processor path was previously hidden by ``pytest.importorskip("torchvision")`` -- PR #648 added torchvision to the upstream-regression matrix install, which unmasked the drift. Runtime is unaffected. Add the same 5.x skip pattern used by the other 16 gated detectors so the dead-code patch site doesn't block CI. 4.57.6 keeps the strict path-existence check.

chatgpt-codex-connector Bot reviewed May 15, 2026

View reviewed changes

gemini-code-assist Bot reviewed May 15, 2026

View reviewed changes

danielhanchen merged commit 9fe4396 into main May 15, 2026
17 of 19 checks passed

danielhanchen deleted the sec/zoo-ci-harness-fixes branch May 15, 2026 04:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

tests: contain security-conftest network block; fix stale mlx paths; skip GPU import in trainer-exec-marker#648

tests: contain security-conftest network block; fix stale mlx paths; skip GPU import in trainer-exec-marker#648
danielhanchen merged 2 commits into
mainfrom
sec/zoo-ci-harness-fixes

danielhanchen commented May 15, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 15, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

danielhanchen commented May 15, 2026

Summary

1. tests/security/conftest.py - contain the network-block scope

2. tests/test_upstream_pinned_symbols_accelerator.py - fix stale MLX paths

3. tests/test_compiler_rewriter_exhaustive.py::test_unsloth_trainer_exec_marker - handle no-accelerator import

Verification

Out of scope

Test plan

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

1. `tests/security/conftest.py` - contain the network-block scope

2. `tests/test_upstream_pinned_symbols_accelerator.py` - fix stale MLX paths

3. `tests/test_compiler_rewriter_exhaustive.py::test_unsloth_trainer_exec_marker` - handle no-accelerator import