[IE CLDNN] Improve network outputs detection in quantized FP16+INT8 IR to avoid converting them to FP16 precision #3407
The fallback to FP16 for non-quantized layers in quantized FP16+INT8 IRs, introduced in #941, shouldn't happen for network outputs. However, the mechanism used to detect them only checked whether a given layer has no next layers; it did not take into account that a network output layer can still be used in other parts of the graph (e.g. the 1st output from a TopK primitive may become a network output, while the 2nd output from the same primitive is still consumed in the graph). As a result, such FP32 outputs get converted to FP16 inside clDNN, and since they were forced to FP32 precision during model read, we end up with a memory data type misalignment error.
This patch improves the network output detection mechanism to handle such cases and avoid converting those outputs to FP16 precision (see the sketch below).
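
For illustration, here is a minimal sketch of the two detection strategies. `Layer`, `Network`, `isNetworkOutputOld`, and `isNetworkOutputImproved` are hypothetical stand-ins, not the actual Inference Engine / clDNN API:

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical node type: a layer may have several output ports, and
// nextLayers collects the consumers of *any* of those ports.
struct Layer {
    std::string name;
    std::vector<const Layer*> nextLayers;
};

// Hypothetical network type: outputNames holds the outputs declared on
// the network during model read.
struct Network {
    std::set<std::string> outputNames;
};

// Old check: a layer counts as a network output only when nothing consumes
// it. A TopK whose 1st output is a declared network output but whose 2nd
// output still feeds the graph fails this test, so its output would be
// wrongly converted to FP16.
bool isNetworkOutputOld(const Layer& layer) {
    return layer.nextLayers.empty();
}

// Improved check: consult the outputs actually declared on the network, so
// a layer is still recognized as a network output even when some of its
// other output ports are consumed elsewhere in the graph.
bool isNetworkOutputImproved(const Network& net, const Layer& layer) {
    return net.outputNames.count(layer.name) > 0;
}
```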
Alternatively, we could keep the current IE network output detection mechanism and instead ensure that the precision chosen for clDNN output primitives matches the precision forced during model read, along the lines of #3405.
JIRA: CVS-43902