
Fix successful MLX tokenizer loads #2

Merged

krystophny merged 1 commit into main from feature/fix-tokenizer-load-return on Mar 24, 2026

Conversation


@krystophny krystophny commented Mar 24, 2026

Summary

  • fix load_model_with_fallback() so successful mlx_lm.load() calls return the (model, tokenizer) tuple instead of falling through as None
  • add regression coverage for the successful load path
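The bug described above is a classic fall-through: a function calls the loader on the happy path but never returns its result. The actual `load_model_with_fallback()` in vllm-mlx is not shown in this PR, so the sketch below is a hypothetical reconstruction of the pattern, with the one-line fix marked:

```python
# Hypothetical sketch of the bug this PR fixes; the real
# load_model_with_fallback() in vllm-mlx may differ in detail.

def load_model_with_fallback(model_name, loaders):
    """Try each loader in turn; return the first (model, tokenizer) pair."""
    for load in loaders:
        try:
            result = load(model_name)
            # Bug: the original code called load() but never returned its
            # result, so control fell off the end of the function and the
            # caller received None even when the load succeeded.
            return result  # fix: propagate the (model, tokenizer) tuple
        except Exception:
            continue  # this loader failed; fall back to the next one
    return None  # every loader failed
```

With the missing `return`, every successful `mlx_lm.load()` call was silently discarded, which is why the fix is independently deployable: it changes no interface, only restores the intended return value.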

Why this is independently deployable

  • pure loader bugfix
  • no API, CLI, batching, or protocol behavior change
  • useful regardless of whether the Responses work is merged

Related context

This bug sits beneath several other Apple Silicon model-serving paths.

Relevant surrounding work exists in waybarrios/vllm-mlx; this PR deliberately stays below all of it and only fixes the successful return path.

Validation

  • PYTHONPATH=/Users/ert/code/vllm-mlx /Users/ert/code/.venv/bin/python -m pytest tests/test_tokenizer_utils.py -q
  • python3 -m compileall vllm_mlx
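The regression coverage for the successful load path might look like the following pytest-style sketch. The function body and test name here are illustrative, not taken from the actual tests/test_tokenizer_utils.py, so a minimal stand-in loader function is included to keep the example self-contained:

```python
# Hypothetical regression test; the real test suite may differ.

def load_model_with_fallback(model_name, loaders):
    """Minimal stand-in: return the first loader's (model, tokenizer) pair."""
    for load in loaders:
        try:
            return load(model_name)  # the fix: return on success
        except Exception:
            continue
    return None

def test_successful_load_returns_tuple():
    sentinel = (object(), object())
    result = load_model_with_fallback("some-model", [lambda name: sentinel])
    # Regression guard: a successful load must not fall through to None.
    assert result is sentinel
```

The key assertion is `result is sentinel`: before the fix, a successful load would have produced `None`, so this test fails on the buggy code and passes on the fixed code.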

What could still improve

  • broader loader-path coverage for strict/strict-false fallbacks and hybrid model families
  • explicit end-to-end smoke tests for each benchmark model alias used by FortBench

@krystophny krystophny changed the title Return successful mlx-lm loads Fix successful MLX tokenizer loads Mar 24, 2026
@krystophny krystophny merged commit fc0608b into main Mar 24, 2026