fix: add missing return in load_model_with_fallback #243
wayne-o wants to merge 1 commit into waybarrios:main
Conversation
When `mlx_lm.load()` succeeds without raising a `ValueError`,
`load_model_with_fallback()` falls through the try/except block
without returning the (model, tokenizer) tuple, causing the caller
to receive `None` and crash with:
TypeError: cannot unpack non-iterable NoneType object
This affects all models that load successfully on the first try
(i.e., most models that don't need the Nemotron/vision fallback paths).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
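The failure mode described above can be reproduced with a minimal, self-contained sketch. The names below (`fake_load`, the two wrapper variants) are hypothetical stand-ins, not the actual vllm_mlx source:

```python
def fake_load(model_name, tokenizer_config=None):
    """Stand-in for mlx_lm.load(): succeeds for a typical model."""
    return "model", "tokenizer"

def load_with_fallback_buggy(model_name):
    try:
        model, tokenizer = fake_load(model_name)   # assignment only...
    except ValueError:
        pass  # Nemotron/vision fallback paths would go here
    # ...no return on the success path, so the function returns None

def load_with_fallback_fixed(model_name):
    try:
        model, tokenizer = fake_load(model_name)
        return model, tokenizer                    # the one-line fix
    except ValueError:
        pass

# The caller unpacks the result, so the buggy version makes it raise
# "TypeError: cannot unpack non-iterable NoneType object".
assert load_with_fallback_buggy("any-model") is None
assert load_with_fallback_fixed("any-model") == ("model", "tokenizer")
```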
Thump604 left a comment:
Confirmed this bug on current main (4ede902). Line 54 is an assignment, not a return — the success path falls through and returns None. The caller in batched.py:257 crashes with TypeError.
Three independent PRs (#230, #235, #237) and two issue reports (#211, #212) all identified the same bug. This fix is correct.
---
I confirmed that this allows
---
Code review for PR #243: the fix is correct. The function runs

```python
try:
    model, tokenizer = load(model_name, tokenizer_config=tokenizer_config)
except ValueError as e:
    ...
```

Without the `return`, the success path falls through and the function implicitly returns `None`. The bug was introduced in an earlier commit. Checked git blame, previous PRs (#215, #230, #235, #237), and code comments in the file; no additional bugs are introduced by this change. Could you also add a regression test for this? PR #215 included one that reviewers found useful. Something that verifies `load_model_with_fallback()` returns the `(model, tokenizer)` tuple on the success path.
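A regression test in the spirit of the one in #215 could look like the following sketch. The names are hypothetical, and it re-creates the fixed function locally with a stubbed `load` so no model download is needed; a real test would instead monkeypatch `mlx_lm.load` as seen by `vllm_mlx/utils/tokenizer.py`:

```python
def load(model_name, tokenizer_config=None):
    """Stub for mlx_lm.load(); a real test would monkeypatch the import."""
    return "model", "tokenizer"

def load_model_with_fallback(model_name, tokenizer_config=None):
    # Local re-creation of the fixed function, for illustration only.
    try:
        model, tokenizer = load(model_name, tokenizer_config=tokenizer_config)
        return model, tokenizer  # the fix under test
    except ValueError:
        raise RuntimeError("fallback paths omitted in this sketch")

def test_success_path_returns_tuple():
    result = load_model_with_fallback("any-model")
    assert result is not None, "success path must not fall through to None"
    model, tokenizer = result  # raised TypeError before the fix
    assert (model, tokenizer) == ("model", "tokenizer")

test_success_path_returns_tuple()
```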
---
@wayne-o let me know if the tests are feasible to add on this PR.
---
I patched my local install and it works for me.
---
This PR closes the bug described in issues #211, #212, #249, and #252, all of which are open and describe the same root cause. @wayne-o, would you mind adding the regression test from #215 here as well?
---
Is this a dupe of #215?
---
Yes and no. #243 is a minimal one-line fix for the same root cause as #215 (krystophny), but #215 is strictly broader: the same one-line return in vllm_mlx/utils/tokenizer.py, plus a 50-line regression test in tests/test_tokenizer_utils.py, plus an Apple Silicon CI entry that runs the tokenizer regression. #215's PR body explicitly says "This overlaps with #243; this PR keeps the regression test with the fix." Both PRs point at the same bug. The practical decision:
Either order closes the bug; both touch the same single line. The underlying bug is also tracked in issues #211, #212, #249, and #252, all of which will close once either PR lands.
---
I could validate a local model with #243 if it helps. Logically it has the same outcome, but seeing is believing.
---
A second validation is welcome. You already proved #243 works end-to-end with Qwen3.5-35B-A3B-4bit continuous batching on v0.2.7, so a second model widens the empirical envelope for the one-line fix, especially on a non-continuous-batching model class where the fallback path is exercised differently. If #215 is also easy to pull locally, validating both branches would be the strongest signal for @waybarrios: same fix, but #215 ships with the tokenizer regression test that would prevent the bug from reappearing in a future refactor. Either PR closes the root cause in #211 / #212 / #249 / #252.
---
Unsurprisingly, #243 also works, and the test completes successfully when merged onto v0.2.7.
---
Hey Wayne, good eye on this one. That missing return was biting everyone. We ended up merging #268, which includes the same fix along with the Gemma 4 work, so this is covered now. Appreciate you taking the time to track it down.
Summary
- `load_model_with_fallback()` in `vllm_mlx/utils/tokenizer.py` is missing a `return` statement after the successful `mlx_lm.load()` call
- When `load()` succeeds without raising `ValueError`, the function falls through the try/except and implicitly returns `None`
- The caller (`MLXLanguageModel.load()`) then crashes with `TypeError: cannot unpack non-iterable NoneType object`

Fix

One line: `return model, tokenizer` after the successful `load()` call.

Test plan

- `vllm-mlx serve mlx-community/Mistral-Small-3.2-24B-Instruct-2506-4bit` starts successfully
- `vllm-mlx serve mlx-community/Qwen3.5-9B-4bit` starts successfully

🤖 Generated with Claude Code