Examples : Add new sweep-bench benchmark #225
Thank you for this - can be very useful.
This is based on saood06's PR ikawrakow/ik_llama.cpp#225
@saood06 thanks, I'm a convert. I pushed a branch on my personal mainline llama.cpp fork just to use for testing performance across forks. I don't plan to open a PR to mainline, but I've left it up in case anyone else is using it. I'm guessing ik has something similar, as we were comparing the new GLM-4 performance. Thanks!
Port of ggml-org/llama.cpp@9488fbf
This is a good tool to benchmark with as requested by #223.
As a very quick demo, I generated this just by running (
./llama-sweep-bench -c 2048 -ub 512 -m WizardLM-2-8x22B-IQ4_K_R4.gguf -ctk q8_KV -ctv q8_0 -fa --output-format jsonl and then sweep-bench-plot.py with the output).
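For anyone who wants to post-process the JSONL output without the plotting script, here is a minimal sketch of reading it into rows. The field names `n_kv`, `speed_pp`, and `speed_tg` are assumptions on my part (check a line of your own output and adjust), the helper `parse_sweep_jsonl` is hypothetical and not part of this PR, and the sample values are made up purely to show the shape of the data, not real measurements.

```python
import json


def parse_sweep_jsonl(lines, x_key="n_kv", y_keys=("speed_pp", "speed_tg")):
    """Collect (x, y1, y2, ...) rows from JSONL text.

    Lines that are not JSON objects (for example log lines mixed into
    the output) are skipped. Key names are assumptions, see above.
    """
    rows = []
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # not a JSON object, probably a log line
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # malformed line, ignore it
        if x_key in rec and all(k in rec for k in y_keys):
            rows.append((rec[x_key],) + tuple(rec[k] for k in y_keys))
    return sorted(rows)  # order by KV cache position


# Made-up sample lines, only to illustrate the expected shape.
sample = [
    "main: some log line that is not JSON",
    '{"n_kv": 512, "speed_pp": 100.0, "speed_tg": 10.0}',
    '{"n_kv": 0, "speed_pp": 120.0, "speed_tg": 12.0}',
]

for n_kv, pp, tg in parse_sweep_jsonl(sample):
    print(n_kv, pp, tg)
```

In practice you would pass an open file handle instead of `sample`, since iterating a file yields its lines, and feed the resulting rows to whatever plotting or comparison tool you prefer.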