Add unit tests for loader glob skip guard (from PR #4852)#4854
Conversation
Tests verifying that HfFileSystem().glob() is correctly skipped when is_model or is_peft is False, matching the guard added in PR #4852.
There was a problem hiding this comment.
Code Review
This pull request introduces a new test suite, tests/test_loader_glob_skip.py, designed to verify that HfFileSystem().glob() calls are skipped in loader.py when certain conditions are not met. However, several issues were identified: the tests are expected to fail in CI because the required implementation changes in loader.py are missing from this PR; the tests currently validate a simulation of the logic rather than the actual code; and the method of verifying source code via string matching is brittle and should be replaced with functional testing or a more robust approach.
| if "SUPPORTS_LLAMA32" in line and "if " in line and "is_model" in line | ||
| ] | ||
| # There should be exactly 2 guarded checks (one per from_pretrained method) | ||
| self.assertEqual( |
There was a problem hiding this comment.
This assertion will fail because the current version of unsloth/models/loader.py (as seen in lines 511 and 1285) does not yet include the is_model and is_peft variables in the if SUPPORTS_LLAMA32: guard. Since this PR only adds the tests without the corresponding implementation changes in the loader, the test suite will break in CI. You should include the loader changes in this PR or ensure they are merged simultaneously.
| def _run_both_exist_block( | ||
| self, is_model, is_peft, supports_llama32, model_name, is_local_dir = False | ||
| ): | ||
| """Simulate the both_exist detection block from loader.py. | ||
|
|
||
| This mirrors the exact logic at lines 500-517 / 1276-1292 of loader.py. | ||
| Returns (both_exist, glob_called). | ||
| """ | ||
| from unittest.mock import MagicMock | ||
|
|
||
| both_exist = (is_model and is_peft) and not supports_llama32 | ||
| glob_mock = MagicMock( | ||
| return_value = [ | ||
| f"{model_name}/config.json", | ||
| f"{model_name}/adapter_config.json", | ||
| ] | ||
| ) | ||
|
|
||
| # This mirrors the guarded block in loader.py | ||
| if supports_llama32 and is_model and is_peft: | ||
| if is_local_dir: | ||
| # Local path branch — would use os.path.exists in real code | ||
| both_exist = True # simulate both files present locally | ||
| else: | ||
| files = glob_mock(f"{model_name}/*.json") | ||
| files = list(os.path.split(x)[-1] for x in files) | ||
| if ( | ||
| sum(x == "adapter_config.json" or x == "config.json" for x in files) | ||
| >= 2 | ||
| ): | ||
| both_exist = True | ||
|
|
||
| return both_exist, glob_mock.called |
There was a problem hiding this comment.
The _run_both_exist_block helper re-implements the logic from loader.py instead of testing the actual code. This creates a "tautological test" where you are testing your own simulation of the logic rather than the actual implementation in unsloth/models/loader.py. If the real code diverges from this simulation, the tests will still pass while the bug remains. It is recommended to test the actual FastLanguageModel.from_pretrained method by mocking its external dependencies (like HfFileSystem and AutoConfig).
| guard_lines = [ | ||
| line.strip() | ||
| for line in lines | ||
| if "SUPPORTS_LLAMA32" in line and "if " in line and "is_model" in line |
There was a problem hiding this comment.
Checking the source code via string matching is brittle and the current filter is incomplete as it only checks for is_model. A more robust approach would be to use the ast module to parse the source, or preferably, rely on functional tests that verify the behavior by mocking HfFileSystem.
| if "SUPPORTS_LLAMA32" in line and "if " in line and "is_model" in line | |
| if all(x in line for x in ["if ", "SUPPORTS_LLAMA32", "is_model", "is_peft"]) |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f6f201ea7d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| self.assertEqual( | ||
| len(guard_lines), | ||
| 2, | ||
| f"Expected 2 guarded SUPPORTS_LLAMA32 checks with is_model/is_peft, found {len(guard_lines)}: {guard_lines}", |
There was a problem hiding this comment.
Drop hard-coded guarded-line count assertion
This assertion requires loader.py to contain exactly two SUPPORTS_LLAMA32 condition lines that already include is_model/is_peft, but the parent branch (6100867) still has both checks as plain if SUPPORTS_LLAMA32:. As a result, guard_lines is always empty and this new test fails consistently (pytest tests/test_loader_glob_skip.py), making the commit non-mergeable unless the loader guard implementation lands in the same change.
Useful? React with 👍 / 👎.
Tests verifying that HfFileSystem().glob() is correctly skipped when is_model or is_peft is False, matching the guard added in PR unslothai#4852.
Tests verifying that HfFileSystem().glob() is correctly skipped when is_model or is_peft is False, matching the guard added in PR unslothai#4852.
Summary
HfFileSystem().glob()is correctly skipped whenis_modeloris_peftisFalseTest plan
pytest tests/test_loader_glob_skip.py -v-- all 7 tests pass