-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Add script to generate docs #1293
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@ggerganov @slaren any thoughts? If this is useful then I will work on this |
|
I think it's nice.
The |
|
Since we only care about which operation is supported, can it be than we can modify |
|
Yes, we can add a mode to |
It already does this right? What we want to do is for all backends, report back the tests which it supports, and then use that to create a table. I'm not super familiar with it, but I guess it would need to do a |
|
|
|
I'm wondering how to do this without a lot of changes. I think the best way is that the CI tests all backends and writes to a common database somewhere, and using a post-action we can create this final summary. I initially thought we can do it locally in one process, but I don't think it's possible to get all the supported operations for all backends as the library is not organized that way ( I may be wrong), we have to be able to load the backend before we can call the In case this proves to be a lot of work, the python script is still useful and not particularly misleading |
|
The backend needs to be loaded and have a device to use |
This will be too difficult to organize. Here is an alternative approach that should be relatively easy to implement:
./bin/test-backend-ops -b CUDA support > ../docs/ops/cuda/rtx-3090.txt
./bin/test-backend-ops -b Metal support > ../docs/ops/metal/m2-ultra.txt
...
Having this implemented, it's just a matter of asking the contributors to run the |
|
Ok that makes sense, I think for a start it should be good. Whenever someone adds a new op or implementation from their device, this script should pick it up. Only that's not foolproof, people can forget to update this file when adding a new op - so docs can be temporarily out of sync, which is okay. Also the table will be quite dense as this granularity (device, ops, backend). I'm trying to find a balance between it being useful for newcomers and also a source of truth for ops. Let me know if you have any suggestions. |
|
The final table can be the exact same format as your table here. It does not have to include the device information. It would simply use it to correctly determine "partial support". For example if an op is fully supported on Metal with M3 chip, but partially supported on Metal with M1 chip, then the script will pick this up and the final table will indicate that the op is partially supported on Metal. No need to display why and on which device it is partially supported. |
|
@ggerganov @slaren this is now ready for review, I think it might be useful if some others can push their backends onto this PR before merging, just need to run |
|
Here is the result on M4 Max:
|
|
Two things to resolve here still
|
You can include the backend and devices name in the output of The table could be generated automatically from a github action when the files change. |
I made the first change, someone else will need to add the github action |
|
|
||
| printf("supported,%s,%s,%s,%s,%s\n", | ||
| ggml_backend_reg_name(ggml_backend_dev_backend_reg(ggml_backend_get_device(backend))), | ||
| ggml_backend_dev_name(ggml_backend_get_device(backend)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should have been the device description, not the name, which is not useful in this context.
| ggml_backend_dev_name(ggml_backend_get_device(backend)), | |
| ggml_backend_dev_description(ggml_backend_get_device(backend)), |
Note that this an arbitrary string and will need to be escaped. For example Vulkan with llvmpipe may return llvmpipe (LLVM 19.1.1, 256 bits).
| ggml_backend_dev_name(ggml_backend_get_device(backend)), | ||
| op_desc.c_str(), | ||
| supported ? "yes" : "no", | ||
| test_vars.c_str() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test_vars also need to be escaped since they contain commas, the current CSVs are not valid.
|
There was a change merged recently to |
|
The data files should also contain the version/commit used. |
The action should be in the llama.cpp repository anyway, nearly all the backend development happens there. |
Should I open this PR in llama.cpp? |
I think that would be better, it is more likely to be noticed by backend developers there. |
|
Opened a PR there, closing this now |
Maintaining parity between backends is desirable, and developers have to dig in the code to understand which operations/backends need implementation. But adding these ops is good for newcomers as it's straightforward to test.
This PR adds basic documentation for ops using a script, it's limited in scope as it doesn't point when an operation is only partially implemented(e.g. some backend might not support some quantization types etc), which is something we can do in the future.
For now, adding this + automatically generation of the md file after CI would serve as a first point of reference and I think would be useful. Creating as draft because at the moment it doesn't go through the CI