How to serve ONNX-based models on the web via REST API #23165
aria3ppp started this conversation in Ideas / Feature Requests
Is there a low-effort way to serve models on the web? I mean, llama.cpp has the llama-server tool that deploys an OpenAI-compatible REST API. Is there any option like that for onnxruntime?

Replies: 1 comment

-

That would be another project that wraps onnxruntime in a web service. I know Triton has a backend for ONNX (https://github.com/triton-inference-server/onnxruntime_backend), but I have never tried it.
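Since wrapping onnxruntime in a web service yourself is the low-effort route, here is a minimal sketch using FastAPI. The `model.onnx` path, the `/predict` route, and the single float32 input are all assumptions to adapt to your model:

```python
# Minimal sketch: an onnxruntime InferenceSession behind a REST endpoint.
# "model.onnx" is a placeholder path; a single float32 input is assumed.
import numpy as np
import onnxruntime as ort
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
session = ort.InferenceSession("model.onnx")
input_name = session.get_inputs()[0].name

class InferenceRequest(BaseModel):
    data: list  # nested lists; shape must match the model's input

@app.post("/predict")
def predict(req: InferenceRequest):
    # Convert the JSON payload to a tensor and run the model.
    tensor = np.asarray(req.data, dtype=np.float32)
    outputs = session.run(None, {input_name: tensor})
    # onnxruntime returns numpy arrays; convert back to JSON-friendly lists.
    return {"outputs": [out.tolist() for out in outputs]}
```

Run it with `uvicorn server:app --port 8000` (assuming the file is named `server.py`) and POST JSON such as `{"data": [[1.0, 2.0, 3.0]]}` to `/predict`. Note this gives raw tensor-in/tensor-out, not an OpenAI-compatible API like llama-server provides.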
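For a heavier but more complete option, the Triton route mentioned above works without writing any server code: you point the server at a model repository directory, and the onnxruntime backend serves the model over Triton's standard HTTP/gRPC endpoints. A sketch of the expected layout, where `my_model` and the version directory `1` are illustrative names:

```
model_repository/
└── my_model/
    ├── config.pbtxt   # optional for ONNX models; Triton can derive the config
    └── 1/
        └── model.onnx
```

Started with `tritonserver --model-repository=/path/to/model_repository` (typically via the `nvcr.io/nvidia/tritonserver` container), the model is then reachable at `POST /v2/models/my_model/infer` using the KServe v2 inference protocol.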