@am17an
Contributor

@am17an am17an commented Jul 7, 2025

Maintaining parity between backends is desirable, and developers have to dig in the code to understand which operations/backends need implementation. But adding these ops is good for newcomers as it's straightforward to test.

This PR adds basic documentation for ops using a script. It's limited in scope as it doesn't point out when an operation is only partially implemented (e.g. some backend might not support some quantization types), which is something we can do in the future.

For now, adding this plus automatic generation of the md file after CI would serve as a first point of reference, and I think it would be useful. Creating as draft because at the moment it doesn't go through the CI.

@am17an
Contributor Author

am17an commented Jul 7, 2025

@ggerganov @slaren any thoughts? If this is useful then I will work on this

@ggerganov
Member

I think it's nice.

> it's limited in scope as it doesn't point out when an operation is only partially implemented

test-backend-ops could generate a more precise summary, though it would be per-backend and would have to rely on developers to run it and push the updated tables for the backends that they use. If we do it like this, we can have a script to combine all the per-backend summaries into a final summary as in this PR.

@am17an
Contributor Author

am17an commented Jul 7, 2025

Since we only care about which operations are supported, could we modify test-backend-ops to not run the operation but just return early, as a kind of probe for which operations are supported? That way we can rely on it to create the final summary without any manual intervention.

@slaren
Member

slaren commented Jul 7, 2025

Yes, we can add a mode to test-backend-ops that only tests which test cases are supported.

@am17an
Contributor Author

am17an commented Jul 7, 2025

> Yes, we can add a mode to test-backend-ops that only tests which test cases are supported.

It already does this, right? What we want is for each backend to report back the tests it supports, and then use that to create a table. I'm not super familiar with it, but I guess it would need to do a backend_init for this, which would fail if the backend is not available.

@slaren
Member

slaren commented Jul 7, 2025

test-backend-ops currently has modes test, grad and perf. What I mean is that you could add a new mode support that only checks which test cases are supported, but does not run them.

@am17an
Contributor Author

am17an commented Jul 7, 2025

I'm wondering how to do this without a lot of changes. I think the best way is that the CI tests all backends and writes to a common database somewhere, and using a post-action we can create this final summary. I initially thought we could do it locally in one process, but I don't think it's possible to get the supported operations for all backends that way, as the library is not organized like that (I may be wrong): we have to be able to load the backend before we can call the supports_op function.

In case this proves to be a lot of work, the python script is still useful and not particularly misleading

@slaren
Member

slaren commented Jul 7, 2025

The backend needs to be loaded and have a device to use supports_op. The ops that a backend supports may depend on the device. The python script could be ok for a quick overview, but to get into the details you really need test-backend-ops.

@ggerganov
Member

> I think the best way is that the CI tests all backends and writes to a common database somewhere, and using a post-action we can create this final summary.

This will be too difficult to organize. Here is an alternative approach that should be relatively easy to implement:

  • Extend test-backend-ops with support mode as @slaren explained. It should print the results in some machine-ingestible format and should output the following information:
    • backend
    • device name
    • op
    • support: full, partial, none
  • In /docs create a README with instructions on how to run test-backend-ops. For example:
    ./bin/test-backend-ops -b CUDA support > ../docs/ops/cuda/rtx-3090.txt
    ./bin/test-backend-ops -b Metal support > ../docs/ops/metal/m2-ultra.txt
    ...
  • Write a script that reads all files in docs/ops/*/*.txt, generates a markdown table from the information it finds, and writes it to docs/ops/README.md

Having this implemented, it's just a matter of asking the contributors to run the test-backend-ops from time to time on their hardware and commit the results upstream.
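
A minimal sketch of the table-generation step described above. The record layout (`backend,device,op,support` as CSV), the function names, and the ✅/❌ symbols are assumptions made for illustration; the thread only specifies which fields should be reported. The sketch also assumes a single device per backend and treats a missing entry as unsupported.

```python
import csv
import io

# Hypothetical symbols: only the yellow "partial" marker appears in this thread.
SYMBOLS = {"full": "✅", "partial": "🟡", "none": "❌"}


def parse_support(text):
    """Parse hypothetical 'backend,device,op,support' CSV rows into tuples."""
    return [tuple(row) for row in csv.reader(io.StringIO(text)) if row]


def render_table(rows):
    """Render a markdown table: one row per op, one column per backend."""
    backends = sorted({backend for backend, _, _, _ in rows})
    ops = sorted({op for _, _, op, _ in rows})
    # Assumes one device per backend; a later row overwrites an earlier one.
    level = {(op, backend): support for backend, _, op, support in rows}
    lines = [
        "| Operation | " + " | ".join(backends) + " |",
        "|" + "---|" * (len(backends) + 1),
    ]
    for op in ops:
        # An op that a backend never reported is shown as unsupported.
        cells = [SYMBOLS[level.get((op, b), "none")] for b in backends]
        lines.append("| " + " | ".join([op] + cells) + " |")
    return "\n".join(lines)


# Illustrative input only; not real test-backend-ops output.
sample = "CPU,cpu0,ADD,full\nMetal,m2-ultra,ADD,partial\nCPU,cpu0,ABS,full\n"
table = render_table(parse_support(sample))
```

A real script would read the text from each file under docs/ops/ instead of an inline string and write `table` to the README.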

@am17an
Contributor Author

am17an commented Jul 7, 2025

Ok, that makes sense. I think it should be good for a start. Whenever someone adds a new op or implementation from their device, this script should pick it up. That's not foolproof, though: people can forget to update this file when adding a new op, so the docs can be temporarily out of sync, which is okay.

Also, the table will be quite dense at this granularity (device, ops, backend). I'm trying to find a balance between it being useful for newcomers and also a source of truth for ops. Let me know if you have any suggestions.

@ggerganov
Member

The final table can be the exact same format as your table here. It does not have to include the device information. It would simply use it to correctly determine "partial support".

For example if an op is fully supported on Metal with M3 chip, but partially supported on Metal with M1 chip, then the script will pick this up and the final table will indicate that the op is partially supported on Metal. No need to display why and on which device it is partially supported.
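
The merging rule in this example could be sketched as follows. The function name and the treatment of mixed full/none results (reported as partial) and of an empty input (reported as none) are assumptions; the thread only covers the full-plus-partial case.

```python
def merge_support(levels):
    """Combine per-device support levels for one op on one backend."""
    seen = set(levels)
    if seen == {"full"}:
        return "full"      # every device fully supports the op
    if seen <= {"none"}:
        return "none"      # no device supports it (or nothing was reported)
    return "partial"       # anything mixed is shown as partial


# Metal: full on M3, partial on M1 -> the final table shows partial.
metal_level = merge_support(["full", "partial"])
```

Only the merged level reaches the final table, so the device names never need to be displayed.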

@am17an am17an marked this pull request as ready for review July 9, 2025 07:53
@am17an
Contributor Author

am17an commented Jul 9, 2025

@ggerganov @slaren this is now ready for review. I think it would be useful if others could push results for their backends to this PR before merging; they just need to run test-backend-ops support -b <backend> > docs/ops/<BACKEND>/<device>.txt

@ggerganov
Member

Here is the result on M4 Max:

Operation blas cpu metal
ABS
ACC
ADD 🟡
ADD1
ARANGE
ARGMAX
ARGSORT
CLAMP 🟡
CONCAT
CONT 🟡
CONV_2D_DW
CONV_TRANSPOSE_1D
CONV_TRANSPOSE_2D
COS 🟡
COUNT_EQUAL
CPY 🟡 🟡
CROSS_ENTROPY_LOSS
CROSS_ENTROPY_LOSS_BACK
DIAG_MASK_INF 🟡
DIV 🟡
DUP 🟡
ELU 🟡
EXP
FLASH_ATTN_EXT 🟡
GATED_LINEAR_ATTN
GEGLU 🟡
GELU 🟡
GELU_ERF 🟡
GELU_QUICK 🟡
GET_ROWS 🟡
GET_ROWS_BACK 🟡
GROUP_NORM
HARDSIGMOID
HARDSWISH
IM2COL 🟡
L2_NORM
LEAKY_RELU
LOG
MEAN
MUL 🟡
MUL_MAT 🟡 🟡 🟡
MUL_MAT_ID 🟡
NEG 🟡
NORM 🟡
OPT_STEP_ADAMW
OUT_PROD 🟡 🟡
PAD
PAD_REFLECT_1D
POOL_2D
REGLU 🟡
RELU 🟡
REPEAT
REPEAT_BACK
RMS_NORM 🟡
RMS_NORM_BACK
RMS_NORM_MUL
ROLL
ROPE
ROPE_BACK
RWKV_WKV6
RWKV_WKV7
SCALE
SET
SET_ROWS 🟡 🟡
SGN
SIGMOID 🟡
SILU 🟡
SILU_BACK
SIN 🟡
SOFT_MAX
SOFT_MAX_BACK 🟡
SQR 🟡
SQRT 🟡
SSM_CONV
SSM_SCAN
STEP
SUB 🟡
SUM
SUM_ROWS
SWIGLU 🟡
TANH 🟡
TIMESTEP_EMBEDDING
UPSCALE 🟡

@ggerganov ggerganov requested a review from slaren July 9, 2025 09:26
@am17an
Contributor Author

am17an commented Jul 9, 2025

Two things to resolve here still

  1. Need to enforce consistent naming of backends; variants such as CPU/cpu/Cpu will mess up this table
  2. Once someone adds something to docs//*.txt, we need to enforce that they also generate the ops.md table

@slaren
Member

slaren commented Jul 9, 2025

> 1. Need to enforce consistent naming of backends; variants such as CPU/cpu/Cpu will mess up this table

You can include the backend and device names in the output of test-backend-ops support (use ggml_backend_reg_name() and ggml_backend_dev_name()). Then modify the script to take the backend name from the table, and just scan all the files in a directory.

The table could be generated automatically from a github action when the files change.

@am17an
Contributor Author

am17an commented Jul 9, 2025

> 1. Need to enforce consistent naming of backends; variants such as CPU/cpu/Cpu will mess up this table
>
> You can include the backend and device names in the output of test-backend-ops support (use ggml_backend_reg_name() and ggml_backend_dev_name()). Then modify the script to take the backend name from the table, and just scan all the files in a directory.
>
> The table could be generated automatically from a github action when the files change.

I made the first change, someone else will need to add the github action


printf("supported,%s,%s,%s,%s,%s\n",
    ggml_backend_reg_name(ggml_backend_dev_backend_reg(ggml_backend_get_device(backend))),
    ggml_backend_dev_name(ggml_backend_get_device(backend)),
Member

@slaren slaren Jul 9, 2025

This should have been the device description, not the name, which is not useful in this context.

Suggested change
- ggml_backend_dev_name(ggml_backend_get_device(backend)),
+ ggml_backend_dev_description(ggml_backend_get_device(backend)),

Note that this is an arbitrary string and will need to be escaped. For example, Vulkan with llvmpipe may return llvmpipe (LLVM 19.1.1, 256 bits).

    ggml_backend_dev_name(ggml_backend_get_device(backend)),
    op_desc.c_str(),
    supported ? "yes" : "no",
    test_vars.c_str()
Member

The test_vars also need to be escaped since they contain commas; the current CSVs are not valid.
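
To illustrate the escaping problem from the consuming script's side, here is a small sketch using Python's standard csv module. The device description is the one quoted in the review comment above; the backend name, op, and test_vars string are made-up values for illustration, not real test-backend-ops output.

```python
import csv
import io

fields = [
    "supported",
    "Vulkan",                            # illustrative backend name
    "llvmpipe (LLVM 19.1.1, 256 bits)",  # device description quoted in the thread
    "ADD",                               # illustrative op
    "yes",
    "type=f32,ne=[10,5,4,3]",            # hypothetical test_vars string
]

# Unescaped output, as a plain printf with commas would produce it.
naive_line = ",".join(fields)
naive_parsed = naive_line.split(",")     # splits inside the description and test_vars

# Properly quoted output: fields containing commas are wrapped in double quotes.
buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerow(fields)
quoted_line = buf.getvalue()
parsed = next(csv.reader(io.StringIO(quoted_line)))  # round-trips to the 6 fields
```

With quoting, a reader recovers exactly six columns; without it, the embedded commas produce extra columns and the row cannot be mapped back to its fields.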

@slaren
Member

slaren commented Jul 9, 2025

There was a change merged recently to test-backend-ops in llama.cpp that adds a printer interface to allow different output formats. It would be good to use that here (for test-backend-ops support), and default to a human-readable console output.

@slaren
Member

slaren commented Jul 9, 2025

The data files should also contain the version/commit used.

@slaren
Member

slaren commented Jul 9, 2025

> I made the first change, someone else will need to add the github action

The action should be in the llama.cpp repository anyway, nearly all the backend development happens there.

@am17an
Contributor Author

am17an commented Jul 9, 2025

> I made the first change, someone else will need to add the github action
>
> The action should be in the llama.cpp repository anyway, nearly all the backend development happens there.

Should I open this PR in llama.cpp?

@slaren
Member

slaren commented Jul 9, 2025

> Should I open this PR in llama.cpp?

I think that would be better, it is more likely to be noticed by backend developers there.

@am17an
Contributor Author

am17an commented Jul 9, 2025

Opened a PR there, closing this now

@am17an am17an closed this Jul 9, 2025