The repo's API allows benchmarking the accuracy of several optimization backends, including:
Hugging Face (4-bit via bitsandbytes)
GPTQ
llama.cpp (via bigdl-llm)
OpenVINO (via optimum)
I suggest creating examples for all four backends that demonstrate their capabilities. We could also run a comparison and publish the results in the README.
You can use optimized versions of llama-7B from here: https://huggingface.co/TheBloke
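As a rough starting point for the comparison, here is a minimal sketch of how per-backend results could be collected and rendered as a README table. It does not use the repo's actual API: `BackendResult`, `compare_backends`, and `to_markdown` are hypothetical names, and each evaluator is assumed to be a zero-argument callable that runs one backend's benchmark and returns an accuracy score.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class BackendResult:
    """Accuracy score for one backend (hypothetical container, not repo API)."""
    backend: str
    accuracy: float


def compare_backends(evaluators: Dict[str, Callable[[], float]]) -> List[BackendResult]:
    """Run each evaluator once and return results sorted by accuracy, best first.

    Each evaluator is assumed to wrap one backend's benchmark run and
    return a single accuracy value.
    """
    results = [BackendResult(name, evaluate()) for name, evaluate in evaluators.items()]
    return sorted(results, key=lambda r: r.accuracy, reverse=True)


def to_markdown(results: List[BackendResult]) -> str:
    """Render the results as a Markdown table suitable for a README."""
    lines = ["| Backend | Accuracy |", "| --- | --- |"]
    lines += [f"| {r.backend} | {r.accuracy:.3f} |" for r in results]
    return "\n".join(lines)
```

Each of the four examples would then only need to expose one callable, and the README table could be regenerated with `to_markdown(compare_backends(evaluators))`.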