Fine tune MUL_MAT, new threading (spin+wait/notify), speedup q_f32 BLAS by splitting COMPUTE stage #1632
Commits on Jun 18, 2023
- 213f133
- 1b041d7
- 48016f6: bulk refactored task profile to support complete fallback; enable tune by default for ease of dev
- 9106232: threading test: on GitHub, Windows can take more than 20 seconds to start 15 threads. Let's silently ignore it when we see two adjacent slow runs.
- bb590f1
- 7c05049
- 21e9379
- 5342dc0: tuning: support k_quants; disabled rope shapes (workaround); make cache thread safe; fixed shape comparison
- 6b83a3e: try to make CL run w/o tuning, but -ngl gets stuck with no output. Had to add task runner and profile id; many changes, see the f codes
- 06b0082: bulk refactoring of task profile and related code to run CL GPU offloading.
  * removed ggml_task_backend in favour of ggml_task_profile.runner and the newly added id and name.
  * extracted mul_mat BLAS code into ggml_compute_forward_mul_mat_blas, to align a bit more with CUDA/CL and make it easier to fix profile and run tune.
  * rewrote task profile and updated/added some CUDA/CL code; finally made CL GPU offloading work.
  * misc minor fixes/updates to tune; the data format was changed.
- 67bb367
- 2193ab6
- 0ec4dab
- 5abb8ae
- 5feefb3: threading: add suspend/resume APIs, so it's possible to run a thread pool at session level
- 286c5b3
- 9872863
- 6609c22
Commits on Jun 19, 2023
- 65fd65e
- 44b831d
- 4d32b40
- cc8a375
- aac7f7c
- 08972d2: threading: removed feature wait_on_done to figure out causes of deadlock in Windows AVX