
Integer tflite model slow than float32 model #796

Open
yahuuu opened this issue Aug 19, 2021 · 11 comments

yahuuu commented Aug 19, 2021

1. System information

  • Quantized the integer tflite model on Windows 10 (without an Nvidia GPU).
  • TensorFlow installation method: pip (tensorflow-gpu 2.2.2 and tensorflow 2.2.2).
  • Trying to run the quantized model on a mobile phone (Mi 9, with built-in GPU and DSP).
  • Mi 9: Android 9
  • NNAPI delegate version: 2.5.0
  • GPU delegate version: 2.5.0
  • android.tools.build:gradle: 4.1.3
  • compile SDK version: 29 (Android 10)
  • build tools version: 29.0.2
  • NDK version: 21.1.6352462

2. Code

    # part of my script
    keras_model = tf.keras.models.load_model(save_model_dir)
    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    if INTEGER_QUANTIZATON:
        # Representative dataset for calibration; the generator normalizes
        # each input image so that every pixel value is between 0 and 1.
        converter.representative_dataset = tf.lite.RepresentativeDataset(representative_data_gen1)
        # Ensure that if any ops can't be quantized, the converter throws an error
        converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
        # Set the input and output tensors to int8 (APIs added in r2.3)
        converter.inference_input_type = tf.int8
        converter.inference_output_type = tf.int8
    tflite_model_quant = converter.convert()
    with open(quantized_save_path, "wb") as f:
        f.write(tflite_model_quant)

Option A: Reference colab notebooks

  1. Reference url:
    https://www.tensorflow.org/lite/performance/post_training_integer_quant
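For reference, representative_data_gen1 is used above but not shown. A minimal sketch of such a generator (the input shape, sample count, and random data are placeholders; real calibration should use real preprocessed images normalized to [0, 1]) could look like this:

    import numpy as np

    def representative_data_gen1():
        # Yield ~100 calibration samples; each yield is a list with one
        # array per model input, matching the model's input shape/dtype.
        for _ in range(100):
            sample = np.random.rand(1, 128, 128, 1).astype(np.float32)  # placeholder shape
            yield [sample]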

3. Failure after conversion

I try to quantify the model to increase the speed of inference. The float32 tflite model work well on mobile gpu(mi 9) and DSP(mi 9), BUT afer integer quantized to int8 model, the int8 tflite inference slow down on gpu and cannot work on DSP.
floate32 tflite model
size: 86 KB
after quantized to integer model :
size: 38.3KB

5. Android Studio logs

2021-08-19 12:53:22.484 7948-7948/? A/DEBUG: signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x7c29573cd0
2021-08-19 12:53:22.484 7948-7948/? A/DEBUG: x0 0000007ff5514480 x1 0000007b8f610804 x2 0000000000000008 x3 0000007ff5514488
2021-08-19 12:53:22.484 7948-7948/? A/DEBUG: x4 0000007b8f61080c x5 0000000000000000 x6 0000000200000004 x7 0000000200000004
2021-08-19 12:53:22.484 7948-7948/? A/DEBUG: x8 0000007c29573cd0 x9 0000000000000000 x10 0000007ff5514488 x11 0000000000000002
2021-08-19 12:53:22.484 7948-7948/? A/DEBUG: x12 0000000000038400 x13 00000000000383f8 x14 0000007b898e1010 x15 0000007b898e0f70
2021-08-19 12:53:22.484 7948-7948/? A/DEBUG: x16 0000007b8fc171e8 x17 0000007c2afbb2c0 x18 0000000000000008 x19 0000007ff5514500
2021-08-19 12:53:22.484 7948-7948/? A/DEBUG: x20 0000007ff5514480 x21 0000000000000008 x22 0000007c2edb95e0 x23 0000007c2edb95e0
2021-08-19 12:53:22.484 7948-7948/? A/DEBUG: x24 0000007b8f6580e0 x25 0000000000000002 x26 0000007b8f665238 x27 0000007b8f6651c8
2021-08-19 12:53:22.484 7948-7948/? A/DEBUG: x28 000000000004b000 x29 0000007ff55144e0
2021-08-19 12:53:22.484 7948-7948/? A/DEBUG: sp 0000007ff5514470 lr 0000007b8facafe0 pc 0000007b8facb088
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: backtrace:
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #00 pc 00000000000f9088 /data/app/com.example.srdemo-iDI0OLee27Jcr3JbKsYvqg==/base.apk (offset 0x25a000)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #1 pc 00000000000f9134 /data/app/com.example.srdemo-iDI0OLee27Jcr3JbKsYvqg==/base.apk (offset 0x25a000)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #2 pc 00000000001c2be8 /data/app/com.example.srdemo-iDI0OLee27Jcr3JbKsYvqg==/base.apk (offset 0x25a000)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #3 pc 00000000001c6404 /data/app/com.example.srdemo-iDI0OLee27Jcr3JbKsYvqg==/base.apk (offset 0x25a000)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #4 pc 0000000000002f10 /data/app/com.example.srdemo-iDI0OLee27Jcr3JbKsYvqg==/base.apk (offset 0x254000) (srdemo::SuperResolution::DoSuperResolution(signed char*)+292)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #5 pc 00000000000021a8 /data/app/com.example.srdemo-iDI0OLee27Jcr3JbKsYvqg==/base.apk (offset 0x254000) (Java_com_example_srdemo_render_JniLib_SuperResolutionFromJNI+136)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #6 pc 0000000000046a50 /data/app/com.example.srdemo-iDI0OLee27Jcr3JbKsYvqg==/oat/arm64/base.odex (offset 0x45000) (com.example.srdemo.render.JniLib.SuperResolutionFromJNI+160)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #7 pc 00000000000c17b4 /data/app/com.example.srdemo-iDI0OLee27Jcr3JbKsYvqg==/oat/arm64/base.odex (offset 0x45000) (com.example.srdemo.render.JniLib.GetYResult+116)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #8 pc 000000000000b584 /dev/ashmem/dalvik-jit-code-cache (deleted) (com.example.srdemo.render.SRGLSurfaceView.onPreviewFrame+404)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #9 pc 00000000000110a0 /dev/ashmem/dalvik-jit-code-cache (deleted) (com.example.srdemo.camera.CameraProxy.onPreviewFrame+160)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #10 pc 0000000000557388 /system/lib64/libart.so (art_quick_invoke_stub+584)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #11 pc 00000000000cfcc8 /system/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+200)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #12 pc 0000000000280338 /system/lib64/libart.so (art::interpreter::ArtInterpreterToCompiledCodeBridge(art::Thread*, art::ArtMethod*, art::ShadowFrame*, unsigned short, art::JValue*)+344)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #13 pc 000000000027a34c /system/lib64/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+968)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #14 pc 0000000000527444 /system/lib64/libart.so (MterpInvokeInterface+1392)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #15 pc 0000000000549b94 /system/lib64/libart.so (ExecuteMterpImpl+14740)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #16 pc 00000000005a6a3a /system/framework/boot-framework.vdex (android.hardware.Camera$EventHandler.handleMessage+542)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #17 pc 0000000000254050 /system/lib64/libart.so (_ZN3art11interpreterL7ExecuteEPNS_6ThreadERKNS_20CodeItemDataAccessorERNS_11ShadowFrameENS_6JValueEb.llvm.3375396565+488)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #18 pc 0000000000259b44 /system/lib64/libart.so (art::interpreter::ArtInterpreterToInterpreterBridge(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*, art::JValue*)+216)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #19 pc 000000000027a330 /system/lib64/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+940)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #20 pc 00000000005264c8 /system/lib64/libart.so (MterpInvokeVirtual+588)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #21 pc 0000000000549994 /system/lib64/libart.so (ExecuteMterpImpl+14228)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #22 pc 0000000000bb4c42 /system/framework/boot-framework.vdex (android.os.Handler.dispatchMessage+42)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #23 pc 0000000000254050 /system/lib64/libart.so (_ZN3art11interpreterL7ExecuteEPNS_6ThreadERKNS_20CodeItemDataAccessorERNS_11ShadowFrameENS_6JValueEb.llvm.3375396565+488)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #24 pc 0000000000259b44 /system/lib64/libart.so (art::interpreter::ArtInterpreterToInterpreterBridge(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*, art::JValue*)+216)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #25 pc 000000000027a330 /system/lib64/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+940)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #26 pc 00000000005264c8 /system/lib64/libart.so (MterpInvokeVirtual+588)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #27 pc 0000000000549994 /system/lib64/libart.so (ExecuteMterpImpl+14228)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #28 pc 0000000000bc781a /system/framework/boot-framework.vdex (android.os.Looper.loop+416)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #29 pc 0000000000254050 /system/lib64/libart.so (_ZN3art11interpreterL7ExecuteEPNS_6ThreadERKNS_20CodeItemDataAccessorERNS_11ShadowFrameENS_6JValueEb.llvm.3375396565+488)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #30 pc 0000000000259b44 /system/lib64/libart.so (art::interpreter::ArtInterpreterToInterpreterBridge(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*, art::JValue*)+216)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #31 pc 000000000027a330 /system/lib64/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+940)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #32 pc 00000000005279cc /system/lib64/libart.so (MterpInvokeStatic+204)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #33 pc 0000000000549b14 /system/lib64/libart.so (ExecuteMterpImpl+14612)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #34 pc 0000000000421b46 /system/framework/boot-framework.vdex (android.app.ActivityThread.main+214)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #35 pc 0000000000254050 /system/lib64/libart.so (_ZN3art11interpreterL7ExecuteEPNS_6ThreadERKNS_20CodeItemDataAccessorERNS_11ShadowFrameENS_6JValueEb.llvm.3375396565+488)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #36 pc 0000000000516d7c /system/lib64/libart.so (artQuickToInterpreterBridge+1020)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #37 pc 00000000005604fc /system/lib64/libart.so (art_quick_to_interpreter_bridge+92)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #38 pc 000000000055764c /system/lib64/libart.so (art_quick_invoke_static_stub+604)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #39 pc 00000000000cfce8 /system/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+232)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #40 pc 000000000045dd7c /system/lib64/libart.so (art::(anonymous namespace)::InvokeWithArgArray(art::ScopedObjectAccessAlreadyRunnable const&, art::ArtMethod*, art::(anonymous namespace)::ArgArray*, art::JValue*, char const*)+104)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #41 pc 000000000045f7d0 /system/lib64/libart.so (art::InvokeMethod(art::ScopedObjectAccessAlreadyRunnable const&, _jobject*, _jobject*, _jobject*, unsigned long)+1440)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #42 pc 00000000003ef398 /system/lib64/libart.so (art::Method_invoke(_JNIEnv*, _jobject*, _jobject*, _jobjectArray*)+52)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #43 pc 000000000078eed4 /system/framework/arm64/boot-core-oj.oat (offset 0x2dc000) (java.lang.Class.getDeclaredMethodInternal [DEDUPED]+180)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #44 pc 0000000000557388 /system/lib64/libart.so (art_quick_invoke_stub+584)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #45 pc 00000000000cfcc8 /system/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+200)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #46 pc 0000000000280338 /system/lib64/libart.so (art::interpreter::ArtInterpreterToCompiledCodeBridge(art::Thread*, art::ArtMethod*, art::ShadowFrame*, unsigned short, art::JValue*)+344)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #47 pc 000000000027a34c /system/lib64/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+968)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #48 pc 00000000005264c8 /system/lib64/libart.so (MterpInvokeVirtual+588)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #49 pc 0000000000549994 /system/lib64/libart.so (ExecuteMterpImpl+14228)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #50 pc 0000000001284a64 /system/framework/boot-framework.vdex (com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run+22)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #51 pc 0000000000254050 /system/lib64/libart.so (_ZN3art11interpreterL7ExecuteEPNS_6ThreadERKNS_20CodeItemDataAccessorERNS_11ShadowFrameENS_6JValueEb.llvm.3375396565+488)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #52 pc 0000000000516d7c /system/lib64/libart.so (artQuickToInterpreterBridge+1020)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #53 pc 00000000005604fc /system/lib64/libart.so (art_quick_to_interpreter_bridge+92)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #54 pc 000000000245d5dc /system/framework/arm64/boot-framework.oat (offset 0xa28000) (com.android.internal.os.ZygoteInit.main+2028)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #55 pc 000000000055764c /system/lib64/libart.so (art_quick_invoke_static_stub+604)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #56 pc 00000000000cfce8 /system/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+232)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #57 pc 000000000045dd7c /system/lib64/libart.so (art::(anonymous namespace)::InvokeWithArgArray(art::ScopedObjectAccessAlreadyRunnable const&, art::ArtMethod*, art::(anonymous namespace)::ArgArray*, art::JValue*, char const*)+104)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #58 pc 000000000045d9dc /system/lib64/libart.so (art::InvokeWithVarArgs(art::ScopedObjectAccessAlreadyRunnable const&, _jobject*, _jmethodID*, std::__va_list)+424)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #59 pc 0000000000362cd8 /system/lib64/libart.so (art::JNI::CallStaticVoidMethodV(_JNIEnv*, _jclass*, _jmethodID*, std::__va_list)+652)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #60 pc 00000000000b2884 /system/lib64/libandroid_runtime.so (_JNIEnv::CallStaticVoidMethod(_jclass*, _jmethodID*, ...)+116)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #61 pc 00000000000b53f4 /system/lib64/libandroid_runtime.so (android::AndroidRuntime::start(char const*, android::Vectorandroid::String8 const&, bool)+924)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #62 pc 0000000000002528 /system/bin/app_process64 (main+2012)
2021-08-19 12:53:22.667 7948-7948/? A/DEBUG: #63 pc 00000000000c888c /system/lib64/libc.so (__libc_init+88)
2021-08-19 12:53:22.756 7948-7948/? E/crash_dump64: cannot open libmiuindbg.so: No such file or directory
2021-08-19 12:53:22.760 985-985/? E//system/bin/tombstoned: Tombstone written to: /data/tombstones/tombstone_07
2021-08-19 12:53:22.828 1410-1941/? E/InputDispatcher: channel 'ad25ed com.example.srdemo/com.example.srdemo.MainActivity (server)' ~ Channel is unrecoverably broken and will be disposed!
2021-08-19 12:53:22.845 930-25654/? E/Camera2Client: notifyError: Error condition 0 reported by HAL, requestId -1
2021-08-19 12:53:22.869 783-870/? E/BufferQueueProducer: [SurfaceView - com.example.srdemo/com.example.srdemo.MainActivity#1] queueBuffer: BufferQueue has been abandoned
2021-08-19 12:53:22.870 930-1229/? E/Surface: queueBuffer: error queuing buffer to SurfaceTexture, -19
2021-08-19 12:53:22.879 783-870/? E/BufferQueueProducer: [SurfaceView - com.example.srdemo/com.example.srdemo.MainActivity#1] cancelBuffer: BufferQueue has been abandoned
2021-08-19 12:53:22.911 783-1273/? E/BufferQueueProducer: [SurfaceView - com.example.srdemo/com.example.srdemo.MainActivity#1] cancelBuffer: BufferQueue has been abandoned
2021-08-19 12:53:22.912 783-1273/? E/BufferQueueProducer: [SurfaceView - com.example.srdemo/com.example.srdemo.MainActivity#1] cancelBuffer: BufferQueue has been abandoned
2021-08-19 12:53:22.912 783-1273/? E/BufferQueueProducer: [SurfaceView - com.example.srdemo/com.example.srdemo.MainActivity#1] cancelBuffer: BufferQueue has been abandoned
2021-08-19 12:53:22.912 783-1273/? E/BufferQueueProducer: [SurfaceView - com.example.srdemo/com.example.srdemo.MainActivity#1] cancelBuffer: BufferQueue has been abandoned
2021-08-19 12:53:23.189 6722-6722/? E/Launcher: changeViewByFsGestureState, view=FitSystemWindowView, alpha=1.0, scale=1.0
2021-08-19 12:53:23.189 6722-6722/? E/Launcher: changeViewByFsGestureState, view=ShortcutMenuLayer, alpha=1.0, scale=1.0
2021-08-19 12:53:32.851 722-7760/? E//vendor/bin/hw/[email protected]: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/listener_android.c:244:listener protocol failure ffffffff
2021-08-19 12:53:32.852 722-7760/? E//vendor/bin/hw/[email protected]: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/listener_android.c:251::error: -2147482611: 0 == (nErr = __QAIC_HEADER(adsp_listener_next2)( ctx, nErr, 0, 0, &ctx, &handle, &sc, inBufs, inBufsLen, &inBufsLenReq))
2021-08-19 12:53:32.852 722-7760/? E//vendor/bin/hw/[email protected]: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/listener_android.c:333:Error 0x8000040d: listener2 thread exited
2021-08-19 12:53:32.856 722-7761/? E//vendor/bin/hw/[email protected]: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/log_config.c:319:Received exit.
2021-08-19 12:53:32.876 722-24072/? E/[email protected]: FAILED TO CLOSE LIBRARY
2021-08-19 12:53:32.921 722-7965/? E//vendor/bin/hw/[email protected]: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/apps_std_imp.c:729:Error 45: fopen failed for testsig-0xa454e1ab.so. (No such file or directory)
2021-08-19 12:53:32.921 915-964/? E//vendor/bin/cdsprpcd: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/apps_std_imp.c:729:Error 45: fopen failed for testsig-0xa454e1ab.so. (No such file or directory)
2021-08-19 12:53:32.922 722-7965/? E//vendor/bin/hw/[email protected]: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/apps_std_imp.c:729:Error 45: fopen failed for testsig.so. (No such file or directory)
2021-08-19 12:53:32.922 915-964/? E//vendor/bin/cdsprpcd: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/apps_std_imp.c:729:Error 45: fopen failed for testsig.so. (No such file or directory)

(screenshots attached: tmp1, tmp2)

yahuuu changed the title from "ModelOptimizationToolkit" to "Integer tflite model slow than float32 model" on Aug 19, 2021
@mohantym

Hi @yahuuu!
Could you please try the latest stable version, TF 2.6, and let us know if this is still an issue. Thanks!

@abattery abattery transferred this issue from tensorflow/tensorflow Aug 19, 2021
yahuuu (Author) commented Aug 19, 2021

> Hi @yahuuu!
> Could you please try the latest stable version, TF 2.6, and let us know if this is still an issue. Thanks!

@mohantym Glad to receive a reply so soon.
Should I use the TF 2.6 stable version to quantize the model, or use TF 2.6 to compile the NNAPI delegate?

yahuuu (Author) commented Aug 19, 2021

The filter (weight) type is an int8 array. I changed converter.inference_input_type = tf.int8 to converter.inference_input_type = tf.uint8 in the script, but I still get int8 arrays. How can I get a model with uint8 filters? @MeghnaNatraj
I have read the issue about unsupported full-integer quantization, but can I get a uint8 tflite model with an older version of TensorFlow?

@SkylerZheng

Hi @yahuuu, I also see the int8 tflite model running slower than the float tflite model, and mine is actually much slower. I checked my model with Netron: except for the input and output, which are uint8, the rest of the weights and activations are int8. Do you have a similar problem?
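As a complement to Netron, a small sketch along these lines (standard tf.lite.Interpreter API; the model path is a placeholder) can confirm which tensors actually ended up as int8 versus float32/uint8:

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model_quant.tflite")  # placeholder path
interpreter.allocate_tensors()

# The boundary dtypes show whether inputs/outputs are float32, int8 or uint8.
for d in interpreter.get_input_details():
    print("input :", d["name"], d["dtype"], d["quantization"])
for d in interpreter.get_output_details():
    print("output:", d["name"], d["dtype"], d["quantization"])

# Dtypes of all tensors (weights, activations) show how much of the graph is int8.
for t in interpreter.get_tensor_details():
    print(t["name"], t["dtype"])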

@MeghnaNatraj (Member)

@yahuuu Yes, you can, by using TF 1.15. The code might look like this (source):

import numpy as np
import tensorflow as tf

# Generate tf.keras model.
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(2, input_shape=(3,)))
model.add(tf.keras.layers.RepeatVector(3))
model.add(tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(3)))
model.compile(loss=tf.keras.losses.MSE,
              optimizer=tf.keras.optimizers.RMSprop(lr=0.0001),
              metrics=[tf.keras.metrics.categorical_accuracy],
              sample_weight_mode='temporal')

x = np.random.random((1, 3))
y = np.random.random((1, 3, 3))
model.train_on_batch(x, y)
model.predict(x)

# Save tf.keras model in H5 format.
keras_file = 'keras_model.h5'
tf.keras.models.save_model(model, keras_file)

# Convert the model.
converter = tf.compat.v1.lite.TFLiteConverter.from_keras_model_file(keras_file)
converter.optimizations = {tf.lite.Optimize.DEFAULT}
converter.representative_dataset = representative_data_gen  # user-defined calibration data generator
converter.target_spec.supported_ops = {tf.lite.OpsSet.TFLITE_BUILTINS_INT8}
converter.inference_type = tf.uint8 # optional. you can also use tf.int8
# converter.inference_input_type=tf.uint8 # optional. you can also use tf.int8
# converter.inference_output_type=tf.uint8 # optional. you can also use tf.int8
tflite_model = converter.convert()

# Save the model.
with open('model.tflite', 'wb') as f:
  f.write(tflite_model)

yahuuu (Author) commented Aug 23, 2021

> Hi @yahuuu, I also see the int8 tflite model running slower than the float tflite model, and mine is actually much slower. I checked my model with Netron: except for the input and output, which are uint8, the rest of the weights and activations are int8. Do you have a similar problem?

In my model the input and output are fp32, and only the conv2d weights are int8. The model has relatively few convolutions, and it is about 7% slower after quantization.

@MeghnaNatraj (Member)

@yahuuu

Quantization is most effective for larger models with more convolutions, and it does not guarantee that a model will be faster. This model might be one of the few cases where it is slower: it has few convolutions, and its input/output are still float32, which adds the overhead of quantizing the inputs from float32 to int8 and dequantizing the outputs from int8 to float32 during inference.

Make the input and output int8 as well and try again (not float32, and not uint8). Inspect the model with Netron and ensure that all the ops are int8-quantized. Then run inference on many images with the same interpreter and take the average of the inference time, for example as in the sketch below.
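A minimal sketch of that kind of measurement on desktop, assuming a fully int8-quantized model (the file name and run count are placeholders, and random data stands in for real preprocessed images), might be:

import time

import numpy as np
import tensorflow as tf

# Hypothetical file name; point this at the fully int8-quantized model.
interpreter = tf.lite.Interpreter(model_path="model_int8.tflite")
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

runs = 100
data = np.random.randint(-128, 128, size=inp["shape"], dtype=np.int8)

# Warm-up run so one-time setup cost is not included in the average.
interpreter.set_tensor(inp["index"], data)
interpreter.invoke()

start = time.perf_counter()
for _ in range(runs):
    interpreter.set_tensor(inp["index"], data)
    interpreter.invoke()
    _ = interpreter.get_tensor(out["index"])
elapsed = time.perf_counter() - start
print(f"average latency: {1000 * elapsed / runs:.2f} ms")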

@MeghnaNatraj MeghnaNatraj self-assigned this Aug 26, 2021
yahuuu (Author) commented Oct 12, 2021

@MeghnaNatraj
I tried to change the input data type to int8 when quantizing the model, but I hit this error:
"""
Cannot set tensor: Got value of type UINT8 but expected type FLOAT32 for input 0, name: input
"""
I see that there are many uint8 models in this link, and they contain no quantize or dequantize operators!
I am very curious how the official models were quantized; I guess the input data type was uint8 or int8 during training.
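That error usually means the buffer passed to set_tensor does not match the model's declared input dtype. A hedged sketch of feeding an int8-input model from float32 data (the model path and input values are placeholders) could look like this:

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model_int8.tflite")  # placeholder path
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]

# Float32 data already preprocessed to the range the model was trained on.
float_input = np.random.rand(*inp["shape"]).astype(np.float32)

if inp["dtype"] == np.int8:
    # Quantize the float data with the input tensor's scale/zero-point.
    scale, zero_point = inp["quantization"]
    quantized = np.clip(np.round(float_input / scale + zero_point), -128, 127).astype(np.int8)
    interpreter.set_tensor(inp["index"], quantized)
else:
    # Float-input model: feed the float32 data directly.
    interpreter.set_tensor(inp["index"], float_input)

interpreter.invoke()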

yahuuu (Author) commented Oct 12, 2021

@mohantym Sorry, I don't think the TF version is the cause.

@MeghnaNatraj (Member)

Those models might be quite outdated (from TF1). Please use models from TF Hub, e.g.: https://tfhub.dev/s?deployment-format=lite&module-type=image-classification&q=quantized

@mohantym

Hi @yahuuu! Have you had a chance to check the above suggestion yet?

@mohantym mohantym removed their assignment Nov 11, 2021