
UPSTREAM PR #18919: metal : support virtual devices #1103

Open

loci-dev wants to merge 4 commits into main from
loci/pr-18919-gg-metal-virtual-devices

Conversation

@loci-dev

Note

Source pull request: ggml-org/llama.cpp#18919

Adds support for virtual Metal devices, allowing multi-GPU environments to be simulated on a Mac via the new GGML_METAL_DEVICES environment variable.

GGML_METAL_DEVICES=4 ./bin/llama-completion -m [model.gguf]

...

0.02.020.033 I llama_memory_breakdown_print: | memory breakdown [MiB]    |  total     free    self   model   context   compute    unaccounted |
0.02.020.034 I llama_memory_breakdown_print: |   - MTL0 (Apple M2 Ultra) | 165150 = 158091 + (1916 =   780 +    1024 +     112) +        5143 |
0.02.020.034 I llama_memory_breakdown_print: |   - MTL1 (Apple M2 Ultra) | 165150 = 158091 + (1738 =   780 +     896 +      62) +        5320 |
0.02.020.036 I llama_memory_breakdown_print: |   - MTL2 (Apple M2 Ultra) | 165150 = 158091 + (1198 =   240 +     896 +      62) +        5861 |
0.02.020.037 I llama_memory_breakdown_print: |   - MTL3 (Apple M2 Ultra) | 165150 = 158091 + (2205 =  1137 +     768 +     300) +        4853 |
0.02.020.037 I llama_memory_breakdown_print: |   - Host                  |                     364 =   296 +       0 +      68                |

@loci-review

loci-review bot commented Jan 31, 2026

No meaningful performance changes were detected across 115327 analyzed functions in the following binaries: build.bin.libllama.so, build.bin.llama-cvector-generator, build.bin.llama-tts, build.bin.libmtmd.so, build.bin.llama-bench, build.bin.libggml-base.so, build.bin.libggml-cpu.so, build.bin.libggml.so, build.bin.llama-tokenize, build.bin.llama-qwen2vl-cli, build.bin.llama-quantize, build.bin.llama-gguf-split, build.bin.llama-llava-cli, build.bin.llama-minicpmv-cli, build.bin.llama-gemma3-cli.

🔎 Full breakdown: Loci Inspector.
💬 Questions? Tag @loci-dev.

@loci-dev loci-dev force-pushed the main branch 6 times, most recently from 124d3f7 to bc286c3 Compare January 31, 2026 15:10
@loci-review

loci-review bot commented Jan 31, 2026

No meaningful performance changes were detected across 115327 analyzed functions in the following binaries: build.bin.libllama.so, build.bin.llama-cvector-generator, build.bin.libmtmd.so, build.bin.llama-tts, build.bin.libggml-cpu.so, build.bin.libggml.so, build.bin.libggml-base.so, build.bin.llama-tokenize, build.bin.llama-quantize, build.bin.llama-qwen2vl-cli, build.bin.llama-bench, build.bin.llama-gemma3-cli, build.bin.llama-gguf-split, build.bin.llama-llava-cli, build.bin.llama-minicpmv-cli.

🔎 Full breakdown: Loci Inspector.
💬 Questions? Tag @loci-dev.

@loci-dev loci-dev force-pushed the main branch 13 times, most recently from 62123f6 to 8e59b18 Compare February 1, 2026 07:21
@loci-dev loci-dev force-pushed the main branch 30 times, most recently from cd152fa to ab12294 Compare February 3, 2026 11:18

2 participants