nncase 2.8.3: very poor quantization quality with (u)int8 and/or preprocessing, and int16 cannot use the KPU #1231

Closed

PermissionDenied7335 opened this issue Aug 21, 2024 · 4 comments

PermissionDenied7335 commented Aug 21, 2024 (edited)

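I modified the structure of a YOLOv8 model, grafting MobileNetV3 into yolov8n as the backbone and leaving the detection head unchanged. Both before and after int8 quantization with OpenVINO, the model reaches an mAP50 above 0.97 on the validation set.
After quantization with nncase, however, as soon as int8 quantization or the uint8 [0-255] to float32 [0-1] preprocessing is enabled, the output confidence drops so low that detections cannot be distinguished from background. Only when quant_type and w_quant_type are both set to int16 do I get results close to the original model; in that case the cosine similarity between the kmodel and ONNX outputs reaches 0.9999983.
In addition, if w_quant_type quantizes the weights to int16, inference time balloons to 4 to 5 minutes per frame, roughly the speed of the fp32 CPU model, which suggests the KPU is not being used correctly in this case. In my tests, the int8-quantized model has unusable confidence scores but runs at a few hundred milliseconds per frame. That is still short of usable, but at least the device does not nearly freeze once inference starts, which leaves room to keep pruning the model to fit the KPU.
The attachment contains the ONNX model, the compile script, and the kmodel with full int16 quantization (in the dump folder).
compile.zip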
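
For readers who want to reproduce the kind of comparison quoted above, here is a minimal sketch of a cosine-similarity check between two output tensors. The function name `cosine_similarity` and the sample arrays are illustrative assumptions; the actual comparison script is not shown in this thread.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity of two tensors after flattening (hypothetical helper)."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError("outputs must contain the same number of elements")
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        # Degenerate case: at least one all-zero output.
        return 0.0
    return float(np.dot(a, b) / denom)


# Stand-ins for the ONNX output and the kmodel output of the same input image.
onnx_out = np.array([0.10, 0.80, 0.05], dtype=np.float32)
kmodel_out = np.array([0.11, 0.79, 0.05], dtype=np.float32)
print(cosine_similarity(onnx_out, kmodel_out))
```

A value very close to 1.0 means the two outputs point in nearly the same direction; it does not by itself guarantee that thresholded confidences match.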

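PermissionDenied7335 changed the title ~~nncase 2.8.3: mediocre quantization quality with (u)int8 and/or preprocessing, and int16 cannot use the KPU~~ nncase 2.8.3: very poor quantization quality with (u)int8 and/or preprocessing, and int16 cannot use the KPU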

Member

curioyang commented Aug 26, 2024 (edited)

@PermissionDenied7335 Check whether the input numpy array is contiguous in memory. If it is not, apply img_data = np.ascontiguousarray(img_data) to all of the data.

Accuracy looks fine in my tests so far.
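
A minimal sketch of the contiguity check suggested above. A common way to end up with a non-contiguous array is an HWC to CHW transpose, which returns a strided view rather than a copy; whether that was the cause in this issue is not stated in the thread. The helper name `to_contiguous` and the sample shapes are assumptions for illustration.

```python
import numpy as np


def to_contiguous(img_data):
    """Return a C-contiguous array, copying only when needed (hypothetical helper)."""
    if not img_data.flags["C_CONTIGUOUS"]:
        img_data = np.ascontiguousarray(img_data)
    return img_data


# Small stand-in for an image in H, W, C layout.
hwc = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)

# transpose() only changes strides, so the CHW view is not contiguous.
chw = hwc.transpose(2, 0, 1)
print(chw.flags["C_CONTIGUOUS"])

# After the fix the values are unchanged but the memory layout is contiguous.
fixed = to_contiguous(chw)
print(fixed.flags["C_CONTIGUOUS"])
```

The same call would be applied to every calibration sample and every inference input before the data is handed to the compiler or simulator.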

Author

PermissionDenied7335 commented Aug 26, 2024

OK, int8 accuracy is also normal after making the arrays contiguous.
One more thing I'd like to confirm: can int16 not be accelerated by the KPU?

Member

curioyang commented Aug 26, 2024

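> OK, int8 accuracy is also normal after making the arrays contiguous. One more thing I'd like to confirm: can int16 not be accelerated by the KPU?

Not quite. The only restriction is that data and weights cannot both be int16 at the same time; having just one of them as int16 is fine.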
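
The rule in this reply can be written down as a small pre-compile check. This is only a sketch of the statement above, not part of nncase: the function `kpu_quant_combo_ok` is hypothetical, and only the option names quant_type and w_quant_type come from the thread.

```python
from itertools import product


def kpu_quant_combo_ok(quant_type: str, w_quant_type: str) -> bool:
    """True unless data and weights are both int16 (hypothetical helper).

    Encodes only the maintainer's statement in this thread; it does not
    query nncase or the hardware.
    """
    return not (quant_type == "int16" and w_quant_type == "int16")


# List the combinations mentioned in the thread and whether they pass the rule.
for q, w in product(["uint8", "int8", "int16"], repeat=2):
    print(f"quant_type={q:6s} w_quant_type={w:6s} ok={kpu_quant_combo_ok(q, w)}")
```

Running such a check before compilation would flag the all-int16 configuration from the original report, which is the one that fell back to CPU-like speed.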

Author

PermissionDenied7335 commented Aug 26, 2024

Thanks. In that case I'll consider this issue resolved.

PermissionDenied7335 closed this as completed
