FEAT: support MLX engine #1765

qinxuye · 2024-07-03T05:49:05Z

Background

MLX (https://github.com/ml-explore/mlx) is an array framework developed by Apple for running machine learning workloads on Apple silicon.

Benchmark on an M1 Max (qwen2-instruct, 7B):

MLX (4-bit) vs. llama.cpp (q4_k_m)

43.3 tokens/s vs. 31.12 tokens/s
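
For reference, the relative throughput implied by the two figures above can be worked out in a few lines of Python. This is only a sketch of the arithmetic: the helper name `speedup` is hypothetical and is not part of this PR or of Xinference, and the inputs are the single-run numbers quoted above.

```python
def speedup(candidate_tps: float, baseline_tps: float) -> float:
    """Return how many times faster the candidate is than the baseline.

    Hypothetical helper for illustration only; not part of this PR.
    """
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return candidate_tps / baseline_tps


# Figures quoted in the PR description (tokens per second).
mlx_tps = 43.3         # MLX, 4-bit
llama_cpp_tps = 31.12  # llama.cpp, q4_k_m

ratio = speedup(mlx_tps, llama_cpp_tps)
print(f"MLX is about {ratio:.2f}x the llama.cpp throughput "
      f"({(ratio - 1) * 100:.0f}% faster)")
```

On these numbers MLX comes out at roughly 1.39x the llama.cpp throughput. The two quantization schemes are not identical, so the ratio is an indication rather than a like-for-like comparison.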

XprobeBot added the feature label Jul 3, 2024

XprobeBot added this to the v0.12.4 milestone Jul 3, 2024

qinxuye force-pushed the feat/mlx branch from e3c8b1e to 964c99a July 3, 2024 06:08

qinxuye added 3 commits July 3, 2024 14:14

FEAT: support MLX

c4f02ee

add tests & update docs

fc5a65c

fix setup

b29250e

qinxuye force-pushed the feat/mlx branch from 964c99a to b29250e July 3, 2024 06:14

qinxuye added 3 commits July 3, 2024 14:21

fix 72b model uid

28eb08c

fix

fd62a44

fix

97c6332

qinxuye mentioned this pull request Jul 3, 2024

FEAT: add gemma-2-it #1774

Merged

qinxuye force-pushed the feat/mlx branch 4 times, most recently from 4324208 to 4e21a16 July 4, 2024 02:41

Add metal tests

69a8d42

qinxuye force-pushed the feat/mlx branch from 4e21a16 to 69a8d42 July 4, 2024 03:24

Add metal tests

a725932

amumu96 approved these changes Jul 5, 2024

qinxuye merged commit 3cb1367 into xorbitsai:main Jul 5, 2024
13 checks passed

qinxuye deleted the feat/mlx branch July 5, 2024 04:02