qwen3-next-from-scratch

Hands-on resources for understanding and running Qwen 3 Next:

qwen3-next-from-scratch.ipynb — PyTorch, from-scratch re-implementation with commentary on Gated DeltaNet, partial RoPE, and multi-token prediction.
mlx_qwen3_next.py — Apple MLX helper to load 4-bit (q4) Qwen 3 Next checkpoints for fast inference on M-series chips.

The helper defaults to the 80B A3B instruct checkpoint; expect ~40 GB of q4 weights, so the M3 Ultra with 96 GB unified memory is a good fit.

Quickstart

Install Python deps (PyTorch stack, einops, ipywidgets, mlx-lm).
Work through the notebook to understand the architecture.
Run python mlx_qwen3_next.py --prompt "Hello" to sample from a quantized checkpoint (defaults to Qwen/Qwen3-Next-80B-A3B-Instruct).

Feel free to swap in official hyperparameters or larger checkpoints as you experiment.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
README.md		README.md
main.py		main.py
mlx_qwen3_next.py		mlx_qwen3_next.py
pyproject.toml		pyproject.toml
qwen3-next-from-scratch.ipynb		qwen3-next-from-scratch.ipynb