Description
Coming from the original PyTorch implementations, people find it increasingly cumbersome to type ggml_ and ctx over and over again. One line of Python can turn into 20 lines of C++. This creates too much friction, and we get lost in the boilerplate instead of seeing the big picture.
I would like to create a header that takes advantage of C++ operator overloading. Eventually, it will include PyTorch and NumPy aliases so that code can be copied and pasted from Python into ggml C++ with only minor fixup.
The new struct will simply wrap the tensor together with its context:

struct ggml_tensor_wrapper { ggml_tensor * data; ggml_context * ctx; };

The goal is to leave the resulting binary unchanged. This will be done by keeping everything header-only and inline, ultimately still calling the ggml_ series of functions.
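As a rough illustration, a minimal sketch of such a header might look like the following. The ggml_ calls are the real C API and the wrapper layout matches the struct above, but everything else, including the choice of operator^ for ggml_mul_mat (taken from the examples below), is illustrative rather than a final design:

#pragma once
#include "ggml.h"

struct ggml_tensor_wrapper {
    ggml_tensor  * data;
    ggml_context * ctx;

    // unary ops forward to the C API, carrying the context along
    ggml_tensor_wrapper norm(float eps) const { return { ggml_norm(ctx, data, eps), ctx }; }
    ggml_tensor_wrapper gelu()          const { return { ggml_gelu(ctx, data),      ctx }; }
};

inline ggml_tensor_wrapper operator+(ggml_tensor_wrapper a, ggml_tensor_wrapper b) {
    return { ggml_add(a.ctx, a.data, b.data), a.ctx };
}

inline ggml_tensor_wrapper operator*(ggml_tensor_wrapper a, ggml_tensor_wrapper b) {
    return { ggml_mul(a.ctx, a.data, b.data), a.ctx };
}

// operator^ repurposed for matrix multiplication
inline ggml_tensor_wrapper operator^(ggml_tensor_wrapper a, ggml_tensor_wrapper b) {
    return { ggml_mul_mat(a.ctx, a.data, b.data), a.ctx };
}

Since every function is inline and merely forwards to the corresponding ggml_ call, an optimizing compiler should emit the same code as the hand-written version. One caveat: operator^ binds more loosely than + and * in C++, which is why the matrix products in the examples below are parenthesized.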
Example 1
Before
// feed-forward network
{
    // norm
    {
        cur = ggml_norm(ctx0, inpFF, hparams.eps);

        // cur = mlp_ln_w*cur + mlp_ln_b
        cur = ggml_add(ctx0,
                ggml_mul(ctx0, cur, layer.mlp_ln_w),
                layer.mlp_ln_b);
    }

    // fully connected
    cur = ggml_mul_mat(ctx0,
            layer.mlp_0_w,
            cur);

    cur = ggml_add(ctx0, cur, layer.mlp_0_b);

    // GELU activation
    cur = ggml_gelu(ctx0, cur);

    // projection
    cur = ggml_mul_mat(ctx0,
            layer.mlp_1_w,
            cur);

    cur = ggml_add(ctx0, cur, layer.mlp_1_b);
}
After
// feed-forward network
{
    // norm
    {
        cur = inpFF.norm(hparams.eps);
        cur = cur * layer.mlp_ln_w + layer.mlp_ln_b;
    }

    cur = (layer.mlp_0_w ^ cur) + layer.mlp_0_b; // fully connected
    cur = cur.gelu();
    cur = (layer.mlp_1_w ^ cur) + layer.mlp_1_b; // projection
}
Example 2
Before
struct ggml_tensor * aheads_KQs = ggml_reshape_2d(ctx0, KQ_soft_max, KQ_soft_max->ne[0] * KQ_soft_max->ne[1], KQ_soft_max->ne[2]);
aheads_KQs = ggml_transpose(ctx0, aheads_KQs);
aheads_KQs = ggml_cont(ctx0, aheads_KQs);
aheads_KQs = ggml_mul_mat(ctx0, wstate.aheads_masks.m[il], aheads_KQs);
aheads_KQs = ggml_transpose(ctx0, aheads_KQs);
aheads_KQs = ggml_cont(ctx0, aheads_KQs);
aheads_KQs = ggml_reshape_3d(ctx0, aheads_KQs, KQ_soft_max->ne[0], KQ_soft_max->ne[1], wstate.aheads_masks.m[il]->ne[1]);
if (aheads_cross_QKs == NULL) {
    aheads_cross_QKs = aheads_KQs;
} else {
    aheads_cross_QKs = ggml_concat(ctx0, aheads_cross_QKs, aheads_KQs, 2);
}
After
// typedef ggml_tensor_wrapper gg
// .flatten is from PyTorch
// .T is from NumPy. For convenience, tensor.T() = tensor.transpose().cont()
gg aheads_KQs{KQ_soft_max.flatten(0, 1).T()};
aheads_KQs = (wstate.aheads_masks.m[il] ^ aheads_KQs).T();
aheads_KQs = aheads_KQs.reshape(KQ_soft_max->ne[0], KQ_soft_max->ne[1], wstate.aheads_masks.m[il]->ne[1]);
if (aheads_cross_QKs == NULL) {
    aheads_cross_QKs = aheads_KQs;
} else {
    aheads_cross_QKs = aheads_cross_QKs.concat(aheads_KQs, 2);
}
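The aliases used above could be sketched as additional methods inside struct ggml_tensor_wrapper. The ggml_ calls are the ones from the Before block; the method names mimic PyTorch/NumPy, and the signatures are assumptions (flatten in particular only handles the (0, 1) case used here):

    // hypothetical alias methods, added inside struct ggml_tensor_wrapper
    ggml_tensor_wrapper transpose() const { return { ggml_transpose(ctx, data), ctx }; }
    ggml_tensor_wrapper cont()      const { return { ggml_cont(ctx, data),      ctx }; }

    // NumPy-style .T, made contiguous for convenience
    ggml_tensor_wrapper T() const { return transpose().cont(); }

    // PyTorch-style flatten(start, end); this sketch assumes (0, 1):
    // merge the two fastest dims of a 3D tensor into one
    ggml_tensor_wrapper flatten(int start, int end) const {
        (void) start; (void) end;
        return { ggml_reshape_2d(ctx, data, data->ne[0]*data->ne[1], data->ne[2]), ctx };
    }

    ggml_tensor_wrapper reshape(int64_t ne0, int64_t ne1, int64_t ne2) const {
        return { ggml_reshape_3d(ctx, data, ne0, ne1, ne2), ctx };
    }

    ggml_tensor_wrapper concat(ggml_tensor_wrapper other, int dim) const {
        return { ggml_concat(ctx, data, other.data, dim), ctx };
    }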
Example 3
I am drowning in boilerplate at https://github.com/mmwillet/TTS.cpp/blob/0b420102d53c16f36ea75e626a3a3d40d7b26a4d/src/kokoro_model.cpp#L1141