v1.7.2

@ggerganov released this 19 Nov 16:55
· 44 commits to master since this release
6266a9f

Overview

  • Various improvements in the Metal backend
  • Fix extra memory usage for large samples
  • Remove limit for ggml_context (i.e. more beams and processors are supported)

| CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| M2 Ultra | METAL | tiny | 1 | 1 | 9.51 | 1.39 | 0.41 | 0.01 | 83ac284 |
| M2 Ultra | METAL | tiny-q5_0 | 1 | 1 | 9.57 | 1.41 | 0.42 | 0.01 | 83ac284 |
| M2 Ultra | METAL | tiny-q5_1 | 1 | 1 | 8.74 | 1.39 | 0.42 | 0.01 | 83ac284 |
| M2 Ultra | METAL | tiny-q8_0 | 1 | 1 | 8.36 | 1.33 | 0.41 | 0.01 | 83ac284 |
| M2 Ultra | METAL | base | 1 | 1 | 14.27 | 1.90 | 0.63 | 0.02 | 83ac284 |
| M2 Ultra | METAL | base-q5_0 | 1 | 1 | 15.50 | 1.90 | 0.65 | 0.02 | 83ac284 |
| M2 Ultra | METAL | base-q5_1 | 1 | 1 | 15.67 | 1.88 | 0.65 | 0.02 | 83ac284 |
| M2 Ultra | METAL | base-q8_0 | 1 | 1 | 14.69 | 1.81 | 0.63 | 0.02 | 83ac284 |
| M2 Ultra | METAL | small | 1 | 1 | 40.85 | 3.77 | 1.43 | 0.05 | 83ac284 |
| M2 Ultra | METAL | small-q5_0 | 1 | 1 | 45.99 | 3.90 | 1.52 | 0.05 | 83ac284 |
| M2 Ultra | METAL | small-q5_1 | 1 | 1 | 46.19 | 3.83 | 1.50 | 0.06 | 83ac284 |
| M2 Ultra | METAL | small-q8_0 | 1 | 1 | 42.90 | 3.65 | 1.46 | 0.05 | 83ac284 |
| M2 Ultra | METAL | medium | 1 | 1 | 109.01 | 7.59 | 3.24 | 0.11 | 83ac284 |
| M2 Ultra | METAL | medium-q5_0 | 1 | 1 | 126.78 | 7.55 | 3.45 | 0.13 | 83ac284 |
| M2 Ultra | METAL | medium-q5_1 | 1 | 1 | 127.71 | 7.39 | 3.43 | 0.13 | 83ac284 |
| M2 Ultra | METAL | medium-q8_0 | 1 | 1 | 115.97 | 7.21 | 3.35 | 0.12 | 83ac284 |
| M2 Ultra | METAL | medium-dis | 1 | 1 | 97.74 | 1.06 | 0.36 | 0.01 | 83ac284 |
| M2 Ultra | METAL | large-v2 | 1 | 1 | 196.99 | 11.29 | 5.06 | 0.20 | 83ac284 |
| M2 Ultra | METAL | large-v2-q5_0 | 1 | 1 | 233.88 | 10.83 | 5.56 | 0.24 | 83ac284 |
| M2 Ultra | METAL | large-v2-q5_1 | 1 | 1 | 234.03 | 10.73 | 5.46 | 0.24 | 83ac284 |
| M2 Ultra | METAL | large-v2-q8_0 | 1 | 1 | 210.83 | 10.29 | 5.23 | 0.22 | 83ac284 |
| M2 Ultra | METAL | large-v2-dis | 1 | 1 | 175.37 | 1.18 | 0.42 | 0.02 | 83ac284 |
| M2 Ultra | METAL | large-v3-turbo | 1 | 1 | 177.35 | 1.85 | 0.73 | 0.03 | 83ac284 |
| M2 Ultra | METAL | large-v3-turbo-q5_0 | 1 | 1 | 209.31 | 1.69 | 0.80 | 0.04 | 83ac284 |
| M2 Ultra | METAL | large-v3-turbo-q8_0 | 1 | 1 | 189.55 | 1.64 | 0.75 | 0.03 | 83ac284 |
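
To compare rows of the benchmark table against each other, a short script can be enough. The following is a minimal sketch, not part of the release: it copies a few of the `Enc.` and `Dec.` values from the table and expresses them relative to `large-v2`. The helper name `relative_to` and the choice of rows are illustrative assumptions.

```python
# Rows copied from the benchmark table above: (model, Enc., Dec.).
# Only a handful of rows are included; extend the list as needed.
ROWS = [
    ("large-v2", 196.99, 11.29),
    ("large-v2-q5_0", 233.88, 10.83),
    ("large-v2-q8_0", 210.83, 10.29),
    ("large-v2-dis", 175.37, 1.18),
    ("large-v3-turbo", 177.35, 1.85),
]


def relative_to(baseline, rows):
    """Return {model: (enc_ratio, dec_ratio)} relative to the baseline model.

    `relative_to` is a hypothetical helper written for this example.
    """
    by_name = {name: (enc, dec) for name, enc, dec in rows}
    base_enc, base_dec = by_name[baseline]
    return {
        name: (enc / base_enc, dec / base_dec)
        for name, (enc, dec) in by_name.items()
    }


if __name__ == "__main__":
    for name, (enc, dec) in relative_to("large-v2", ROWS).items():
        print(f"{name:16s} Enc. x{enc:.2f}  Dec. x{dec:.2f}")
```

In these rows, `large-v2-dis` and `large-v3-turbo` have an `Enc.` value close to that of `large-v2` but a much lower `Dec.` value, while the quantized variants move both columns by a smaller amount.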

What's Changed

New Contributors

Full Changelog: v1.7.1...v1.7.2