
fix: add missing return in load_model_with_fallback#243

Closed
wayne-o wants to merge 1 commit into waybarrios:main from wayne-o:fix/load-model-missing-return

Conversation


@wayne-o wayne-o commented Apr 1, 2026

Summary

  • `load_model_with_fallback()` in `vllm_mlx/utils/tokenizer.py` is missing a return statement after the successful `mlx_lm.load()` call
  • When `load()` succeeds without raising `ValueError`, the function falls through the try/except and implicitly returns `None`
  • The caller (`MLXLanguageModel.load()`) then crashes with `TypeError: cannot unpack non-iterable NoneType object`
  • This affects all models that load successfully on the first try (i.e., most models that don't need the Nemotron/vision fallback paths)

Fix

One line: `return model, tokenizer` after the successful `load()` call.
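
In context, the change looks roughly like this (a sketch; the fallback branches are abbreviated):

    try:
        model, tokenizer = load(model_name, tokenizer_config=tokenizer_config)
    except ValueError:
        ...  # Nemotron/vision fallback paths
    return model, tokenizer  # the previously missing line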

Test plan

  • Verified `vllm-mlx serve mlx-community/Mistral-Small-3.2-24B-Instruct-2506-4bit` starts successfully
  • Verified `vllm-mlx serve mlx-community/Qwen3.5-9B-4bit` starts successfully
  • Both models previously crashed with the `NoneType` error on v0.2.7

🤖 Generated with Claude Code

When `mlx_lm.load()` succeeds without raising a `ValueError`,
`load_model_with_fallback()` falls through the try/except block
without returning the (model, tokenizer) tuple, causing the caller
to receive `None` and crash with:

    TypeError: cannot unpack non-iterable NoneType object

This affects all models that load successfully on the first try
(i.e., most models that don't need the Nemotron/vision fallback paths).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Collaborator

@Thump604 Thump604 left a comment


Confirmed this bug on current main (4ede902). Line 54 is an assignment, not a return; the success path falls through and returns `None`. The caller in `batched.py:257` crashes with `TypeError`.

Three independent PRs (#230, #235, #237) and two issue reports (#211, #212) all identified the same bug. This fix is correct.

@perry2of5
Contributor

I confirmed that this allows `vllm-mlx serve mlx-community/Qwen3.5-35B-A3B-4bit --port 7979 --continuous-batching` to succeed on top of v0.2.7.

@waybarrios added the `bug` (Something isn't working) label Apr 1, 2026
@waybarrios
Owner

Code review for PR #243

The fix is correct. The function `load_model_with_fallback()` in `vllm_mlx/utils/tokenizer.py` was doing this:

    try:
        model, tokenizer = load(model_name, tokenizer_config=tokenizer_config)
    except ValueError as e:
        ...

Without the `return model, tokenizer`, when `load()` succeeds the function returns `None`, and callers like `llm.py:89` and `batched.py:257` that do `self.model, self.tokenizer = load_model_with_fallback(...)` crash with `TypeError: cannot unpack non-iterable NoneType object`.

The bug was introduced in commit c70b80b when `return load(...)` was changed to `model, tokenizer = load(...)` to support `_try_inject_mtp_post_load()`, but the return was never added back.
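
Roughly, the before/after of that change (the exact call shape of `_try_inject_mtp_post_load()` is an assumption here):

    # Before c70b80b: the result was returned directly
    return load(model_name, tokenizer_config=tokenizer_config)

    # After c70b80b: the result is captured so MTP injection can run on the model,
    # but the function now ends without returning it
    model, tokenizer = load(model_name, tokenizer_config=tokenizer_config)
    _try_inject_mtp_post_load(model)  # assumed call shape
    # missing: return model, tokenizer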

Checked git blame, previous PRs (#215, #230, #235, #237), and code comments in the file. No additional bugs introduced by this change.

Could you also add a regression test for this? PR #215 included one that reviewers found useful. Something that verifies `load_model_with_fallback()` actually returns a `(model, tokenizer)` tuple on the happy path instead of `None`. That would prevent this from regressing in future refactors.
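
Something along these lines would be enough (a sketch; the exact signature of `load_model_with_fallback()` and the patch target are assumptions, and the fakes are illustrative):

    import vllm_mlx.utils.tokenizer as tok_utils

    def test_load_model_with_fallback_returns_tuple(monkeypatch):
        # Stand-ins for a real model/tokenizer; object identity is all we check.
        fake_model, fake_tokenizer = object(), object()
        # Assumes load() is looked up through the vllm_mlx.utils.tokenizer namespace.
        monkeypatch.setattr(
            tok_utils, "load", lambda name, **kwargs: (fake_model, fake_tokenizer)
        )
        result = tok_utils.load_model_with_fallback("any-model")
        # The happy path must return the (model, tokenizer) tuple, not None.
        assert result == (fake_model, fake_tokenizer)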

@waybarrios
Owner

@wayne-o let me know if the tests are feasible to add on this PR.


npomfret commented Apr 7, 2026

I patched my local install and it works for me

Collaborator

Thump604 commented Apr 7, 2026

This PR closes the bug described in issues #211, #212, #249, and #252, all of which are open and describe the same root cause: `load_model_with_fallback` in `vllm_mlx/utils/tokenizer.py` falls through after a successful `mlx_lm.load()` call and returns `None`, causing the caller to crash with `TypeError: cannot unpack non-iterable NoneType object`. Your one-line fix at line 55 resolves all four.

@wayne-o, would you mind adding `Closes #211, closes #212, closes #249, closes #252` to the PR description so all four issues auto-close on merge? (GitHub needs a closing keyword before each issue number.) Saves @waybarrios the manual cleanup pass.


jtatum commented Apr 9, 2026

Is this a dupe of #215?

Collaborator

Thump604 commented Apr 9, 2026

Yes and no. #243 is a minimal one-line fix for the same root cause as #215 (krystophny), but #215 is strictly broader: the same one-line return in `vllm_mlx/utils/tokenizer.py`, plus a 50-line regression test in `tests/test_tokenizer_utils.py`, plus an Apple Silicon CI entry that runs the tokenizer regression.

#215's PR body explicitly says "This overlaps with #243; this PR keeps the regression test with the fix." Both PRs point at the same bug, so the practical decision is simple:

Either order closes the bug. Both touch the same single line.

The underlying bug is also tracked in issues #211, #212, #249, and #252, all of which will close once either PR lands.

@perry2of5
Contributor

I could validate a local model with #243 if it helps. Logically it has the same outcome, but seeing is believing.

Collaborator

Thump604 commented Apr 9, 2026

Second validation is welcome. You already proved #243 works end-to-end with Qwen3.5-35B-A3B-4bit continuous batching on top of v0.2.7, so a second model widens the empirical coverage of the one-line fix, especially on a non-continuous-batching model class where the fallback path is exercised differently.

If #215 is also easy to pull locally, validating both branches would be the strongest signal for @waybarrios — same fix, but #215 ships with the tokenizer regression test that would prevent the bug from reappearing on a future refactor. Either PR closes the root cause in #211 / #212 / #249 / #252.

@perry2of5
Contributor

Unsurprisingly, #243 also works, and the test completes successfully when merged onto v0.2.7.

@waybarrios
Owner

Hey Wayne, good eye on this one. That missing return was biting everyone. We ended up merging #268 which includes the same fix along with the Gemma 4 work, so this is covered now. Appreciate you taking the time to track it down.

@waybarrios waybarrios closed this Apr 10, 2026