Commit 0bfb6c0
Support Mixtral on macOS (#1558)
A follow-up to my previous PR (#1529).
This PR makes Mixtral work on the Metal GPUs that ship with macOS.
Honestly, not much change is needed, except that Metal doesn't
support fp64 data types.
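To illustrate the fp64 limitation, here is a minimal sketch of narrowing 64-bit floats to 32-bit floats before handing data to a backend without fp64 support. It uses only the Python standard library. The helper name `to_fp32` is hypothetical, and this is not the change made in this commit, only an illustration of the general idea.

```python
from array import array


def to_fp32(values: array) -> array:
    """Hypothetical helper (not part of mlc_chat): narrow fp64 data to fp32."""
    if values.typecode == "d":  # "d" is a C double (fp64)
        # Rebuild the array with typecode "f" (C float, fp32).
        return array("f", values.tolist())
    return values  # any other typecode is returned unchanged


weights = array("d", [0.5, 1.25, 3.0])  # values exactly representable in fp32
narrowed = to_fp32(weights)
print(narrowed.typecode, narrowed.itemsize)
```

Narrowing loses precision for values that fp32 cannot represent exactly, so a real port would need to check that the affected computation tolerates it.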
A Python script to run Mixtral:
```python
from mlc_chat import ChatConfig, ChatModule, callback
from mlc_chat.support import logging

logging.enable_logging()

MODEL = "HF://junrushao/Mixtral-8x7B-Instruct-v0.1-q4f16_1-MLC"
NUM_GPU = 1

def main():
    cm = ChatModule(MODEL, chat_config=ChatConfig(
        sliding_window_size=1024,
        tensor_parallel_shards=NUM_GPU,
    ))
    # Stream generated text to stdout as it is produced.
    cm.generate("What is the meaning of life?", progress_callback=callback.StreamToStdout(callback_interval=2))

if __name__ == "__main__":
    main()
```
Quantization formats:
- 3-bit (19.662 GB): ["HF://junrushao/Mixtral-8x7B-Instruct-v0.1-q3f16_1-MLC"](https://huggingface.co/junrushao/Mixtral-8x7B-Instruct-v0.1-q3f16_1-MLC)
- 4-bit (24.466 GB): ["HF://junrushao/Mixtral-8x7B-Instruct-v0.1-q4f16_1-MLC"](https://huggingface.co/junrushao/Mixtral-8x7B-Instruct-v0.1-q4f16_1-MLC)

1 parent e32c6c9 commit 0bfb6c0
1 file changed: +1 -1 lines changed (line 189)