UPSTREAM PR #19796: Add model metadata loading from huggingface for use with tests requiring real model data #1201
loci-dev wants to merge 5 commits into
No meaningful performance changes were detected across 111671 analyzed functions in the following binaries: build.bin.llama-cvector-generator, build.bin.libmtmd.so, build.bin.libllama.so, build.bin.llama-tts, build.bin.llama-tokenize, build.bin.llama-bench, build.bin.llama-gguf-split, build.bin.llama-llava-cli, build.bin.llama-minicpmv-cli, build.bin.llama-quantize, build.bin.llama-qwen2vl-cli, build.bin.llama-gemma3-cli, build.bin.libggml-base.so, build.bin.libggml-cpu.so, build.bin.libggml.so. 🔎 Full breakdown: Loci Inspector.
…t file, also avoid mmproj
Note
Source pull request: ggml-org/llama.cpp#19796
This is based on Hugging Face's GGUF package:
https://github.com/huggingface/huggingface.js/tree/main/packages/gguf
The idea is to partially load GGUF models from Hugging Face, fetching just enough of each file to read its metadata.
The intention is to use this data in realistic unit tests for llama-quant.cpp, but it can be used by anyone needing real model data.
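To illustrate what "just enough to get the metadata" can mean, here is a minimal sketch that parses the fixed 24-byte GGUF header (magic, version, tensor count, metadata key/value count) from a byte prefix of a file. It is not the code in this PR: `GgufHeader`, `read_le`, and `parse_gguf_header` are hypothetical names, the layout assumed is GGUF v2/v3 (little-endian, 64-bit counts), and obtaining the prefix (for example via an HTTP range request) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Fixed-size part of a GGUF file header (assumed GGUF v2/v3 layout).
struct GgufHeader {
    uint32_t version;
    uint64_t tensor_count;
    uint64_t metadata_kv_count;
};

// 4 bytes magic + 4 bytes version + 8 bytes tensor count + 8 bytes KV count.
constexpr size_t kGgufHeaderSize = 24;

// Read an unsigned little-endian integer of type T starting at `offset`.
template <typename T>
T read_le(const std::vector<uint8_t> & buf, size_t offset) {
    assert(offset + sizeof(T) <= buf.size());
    T value = 0;
    for (size_t i = 0; i < sizeof(T); ++i) {
        value |= static_cast<T>(buf[offset + i]) << (8 * i);
    }
    return value;
}

// Parse the header from the first bytes of a file. Returns std::nullopt if
// the prefix is shorter than the header or does not start with "GGUF".
std::optional<GgufHeader> parse_gguf_header(const std::vector<uint8_t> & prefix) {
    if (prefix.size() < kGgufHeaderSize) {
        return std::nullopt;
    }
    static const uint8_t magic[4] = {'G', 'G', 'U', 'F'};
    for (size_t i = 0; i < 4; ++i) {
        if (prefix[i] != magic[i]) {
            return std::nullopt;
        }
    }
    GgufHeader header;
    header.version           = read_le<uint32_t>(prefix, 4);
    header.tensor_count      = read_le<uint64_t>(prefix, 8);
    header.metadata_kv_count = read_le<uint64_t>(prefix, 16);
    return header;
}
```

The metadata key/value pairs follow immediately after this header, so a partial loader only needs to keep requesting bytes until `metadata_kv_count` entries have been decoded; the tensor data itself never has to be downloaded.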
To build:
To run the included test:
The model is cached locally to speed up subsequent runs.
A unit test is provided as a usage example for future tests.
AI was used to help with the porting process and with writing the unit tests.