ggml : avoid duplicating function calls using MIN/MAX macros (ggerganov#5325)

* Avoid duplicating function calls when using MIN/MAX macros.

Since these macros copy "a" and "b" into their expansion, they ask the compiler to evaluate one of them twice. The compiler has no problem removing the duplication in something like MAX(0, x + 2), but in some cases the argument is a function call, and that call simply happens twice.
By evaluating the expression once into a local variable before passing it to the macro, we get smaller and faster code without duplicate calls. See ggml_rope_yarn_corr_dims in Compiler Explorer:

https://godbolt.org/z/Ee4KMrvKh

Code behaves exactly the same.

* Update ggml.c

---------

Co-authored-by: Georgi Gerganov <[email protected]>
2 people authored and hodlen committed Apr 1, 2024
1 parent 1316ca1 commit e3caee1
Showing 1 changed file with 6 additions and 3 deletions.
9 changes: 6 additions & 3 deletions ggml.c
@@ -2470,7 +2470,8 @@ size_t ggml_get_max_tensor_size(const struct ggml_context * ctx) {
     size_t max_size = 0;

     for (struct ggml_tensor * tensor = ggml_get_first_tensor(ctx); tensor != NULL; tensor = ggml_get_next_tensor(ctx, tensor)) {
-        max_size = MAX(max_size, ggml_nbytes(tensor));
+        size_t bytes = ggml_nbytes(tensor);
+        max_size = MAX(max_size, bytes);
     }

     return max_size;
@@ -11887,8 +11888,10 @@ GGML_CALL void ggml_rope_yarn_corr_dims(
     int n_dims, int n_orig_ctx, float freq_base, float beta_fast, float beta_slow, float dims[2]
 ) {
     // start and end correction dims
-    dims[0] = MAX(0, floorf(ggml_rope_yarn_corr_dim(n_dims, n_orig_ctx, beta_fast, freq_base)));
-    dims[1] = MIN(n_dims - 1, ceilf(ggml_rope_yarn_corr_dim(n_dims, n_orig_ctx, beta_slow, freq_base)));
+    float start = floorf(ggml_rope_yarn_corr_dim(n_dims, n_orig_ctx, beta_fast, freq_base));
+    float end = ceilf(ggml_rope_yarn_corr_dim(n_dims, n_orig_ctx, beta_slow, freq_base));
+    dims[0] = MAX(0, start);
+    dims[1] = MIN(n_dims - 1, end);
 }

 static void ggml_compute_forward_rope_f32(
