How to serve ONNX-based models on the web via REST API #23165
aria3ppp started this conversation in Ideas / Feature Requests
Is there a low-effort way to serve models on the web? I mean, llama.cpp has the llama-server tool that deploys an OpenAI-compatible REST API. Is there any option like that for onnxruntime?

Replies: 1 comment

-

That would be another project that wraps onnxruntime in a web service. I know Triton has a backend for ONNX (https://github.com/triton-inference-server/onnxruntime_backend), but I have never tried it.
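Since wrapping onnxruntime in a web service yourself is the low-effort route, here is a minimal sketch using FastAPI. The `model.onnx` path, the `/predict` route, and the single float32 input are all assumptions to adapt to your model:

```python
# Minimal sketch: an onnxruntime InferenceSession behind a REST endpoint.
# "model.onnx" is a placeholder path; a single float32 input is assumed.
import numpy as np
import onnxruntime as ort
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
session = ort.InferenceSession("model.onnx")
input_name = session.get_inputs()[0].name

class InferenceRequest(BaseModel):
    data: list  # nested lists; shape must match the model's input

@app.post("/predict")
def predict(req: InferenceRequest):
    # Convert the JSON payload to a tensor and run the model.
    tensor = np.asarray(req.data, dtype=np.float32)
    outputs = session.run(None, {input_name: tensor})
    # onnxruntime returns numpy arrays; convert back to JSON-friendly lists.
    return {"outputs": [out.tolist() for out in outputs]}
```

Run it with `uvicorn server:app --port 8000` (assuming the file is named `server.py`) and POST JSON such as `{"data": [[1.0, 2.0, 3.0]]}` to `/predict`. Note this gives raw tensor-in/tensor-out, not an OpenAI-compatible API like llama-server provides.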
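For a heavier but more complete option, the Triton route mentioned above works without writing any server code: you point the server at a model repository directory, and the onnxruntime backend serves the model over Triton's standard HTTP/gRPC endpoints. A sketch of the expected layout, where `my_model` and the version directory `1` are illustrative names:

```
model_repository/
└── my_model/
    ├── config.pbtxt   # optional for ONNX models; Triton can derive the config
    └── 1/
        └── model.onnx
```

Started with `tritonserver --model-repository=/path/to/model_repository` (typically via the `nvcr.io/nvidia/tritonserver` container), the model is then reachable at `POST /v2/models/my_model/infer` using the KServe v2 inference protocol.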