feat: add MTP speculative decoding support #343

Merged
jhen0409 merged 13 commits into main from codex/add-mtp-support
May 20, 2026

jhen0409 commented on May 19, 2026

Summary

  • Add native MTP speculative decoding support and TypeScript API/result fields.
  • Add a standalone example app screen for MTP that uses the ggml-org/Qwen3.6-35B-A3B-MTP-GGUF model (M3 Max MacBook Pro).
  • Add a Metal shader compatibility patch for the Q8_0 MoE kernel_mul_mv_id pipeline and keep it in bootstrap patches.
  • Document MTP usage and constraints in the README.
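
For readers unfamiliar with the technique, the verification step of greedy speculative decoding can be sketched roughly as follows. This is a conceptual illustration only, not the code added in this PR: `verifyDraft`, `VerifyResult`, and their fields are hypothetical names and do not reflect the library's TypeScript API or its result fields. It assumes the draft tokens come from the MTP (multi-token prediction) head and that the target model has already produced its own greedy choice at each drafted position plus one extra position.

```typescript
// Hypothetical illustration only: none of these names come from the PR or the library.
type Token = number

interface VerifyResult {
  accepted: Token[] // tokens committed to the output in this step
  nDraftAccepted: number // how many drafted tokens the target model agreed with
}

// draft:  tokens proposed by the draft (MTP) head.
// target: the target model's greedy choice at each drafted position, plus one
//         extra prediction for the position after the last drafted token,
//         so target.length must equal draft.length + 1.
function verifyDraft(draft: Token[], target: Token[]): VerifyResult {
  if (target.length !== draft.length + 1) {
    throw new Error('target must contain exactly one more token than draft')
  }

  // Count the longest prefix on which draft and target agree.
  let n = 0
  while (n < draft.length && draft[n] === target[n]) n++

  // Commit the agreed prefix, then the target's own token at the first
  // disagreement (or the extra "bonus" token if every drafted token matched).
  return { accepted: [...draft.slice(0, n), target[n]], nDraftAccepted: n }
}
```

In this sketch every step commits at least one token, so output under greedy sampling matches what the target model would have produced alone; the speedup depends on how often the drafted tokens are accepted.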

Validation

  • npm run typecheck
  • npm run lint
  • npm test -- --runInBand
  • git diff --check
  • ./tests/build_and_test.sh
  • ./tests/run_tests.sh
  • xcrun -sdk iphoneos metal -c cpp/ggml-metal/ggml-metal.metal
  • Local MTLDevice pipeline creation for kernel_mul_mv_id_q8_0_f32 with nsg=4

jhen0409 marked this pull request as ready for review on May 20, 2026, 04:24
jhen0409 merged commit db13bb0 into main on May 20, 2026
6 checks passed
jhen0409 deleted the codex/add-mtp-support branch on May 20, 2026, 05:50