Replies: 3 comments
- Yeah, int8 quantization would be of interest to me as well!
- Any update on this?
- I just tried it with the int4 Phi-3 ONNX model and it worked fine, if that's any help.
- Hello,
Are 8-bit quantization methods planned to be supported by ONNXRuntime-genai in the future?
Thanks,
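For context on what the question is asking for: int8 quantization maps floating-point weights to 8-bit integers via a scale and zero point. A minimal illustrative sketch of the standard asymmetric (affine) scheme follows; the helper names are hypothetical and this is not ONNXRuntime-genai's actual implementation or API.

```python
# Sketch of asymmetric (affine) int8 quantization:
# q = clamp(round(x / scale) + zero_point, -128, 127)

def quantize_int8(values):
    """Quantize a list of floats to int8 with a per-tensor scale/zero-point."""
    lo, hi = min(values), max(values)
    # 255 representable steps across [lo, hi]; guard against a zero range.
    scale = (hi - lo) / 255.0 or 1.0
    zero_point = round(-lo / scale) - 128  # maps lo to roughly -128
    q = [max(-128, min(127, round(x / scale) + zero_point)) for x in values]
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    """Recover approximate floats from the int8 representation."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale, zp = quantize_int8(weights)
recovered = dequantize_int8(q, scale, zp)
# Each recovered value is within one quantization step of the original.
assert all(abs(a - b) <= scale for a, b in zip(weights, recovered))
```

The appeal for LLM inference is that weights stored this way use a quarter of the memory of float32 while staying within one quantization step of the original values.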