
[IE CLDNN] Improve network outputs detection in quantized FP16+INT8 IR to avoid converting them to FP16 precision #3407

Merged: 5 commits into openvinotoolkit:master on Dec 1, 2020

Conversation

jhajducz (Contributor) commented Nov 29, 2020

The fallback to FP16 for non-quantized layers in quantized FP16+INT8 IR, introduced in #941, shouldn't happen for network outputs. However, the mechanism used to detect them only checked whether a given layer has no next layers; it did not take into account that even network output layers can still be used in other parts of the graph (e.g. the 1st output from a TopK primitive may become a network output, while the 2nd output of the same primitive is still consumed elsewhere in the graph). As a result, such FP32 outputs were converted to FP16 precision inside clDNN, and since they had been forced to FP32 precision during model read, we ended up with a memory data type misalignment error.

This patch improves the network output detection mechanism to handle such cases, so that those outputs are not converted to FP16 precision; a simplified sketch of the idea follows.
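For illustration, here is a minimal, self-contained C++ sketch of the difference between the two detection approaches. The types and helpers below (`Layer`, `Data`, `isNetworkOutputOld`, `isNetworkOutputNew`, `networkOutputs`) are simplified stand-ins for this example only, not the actual InferenceEngine or clDNN classes touched by the patch.

```cpp
#include <memory>
#include <set>
#include <string>
#include <vector>

// Simplified stand-in types for illustration; NOT the real IE/clDNN classes.
struct Data;
struct Layer {
    std::string name;
    std::vector<std::shared_ptr<Data>> outData;  // one entry per output port
};
struct Data {
    std::string name;
    std::vector<std::shared_ptr<Layer>> consumers;  // layers reading this port
};

// Old (buggy) check: a layer counts as a network output only if *none* of its
// output ports has consumers. A TopK whose 1st output is a network output but
// whose 2nd output feeds the rest of the graph fails this test, so its output
// gets the FP16 fallback and later clashes with the FP32 precision forced at
// model read time.
bool isNetworkOutputOld(const Layer& layer) {
    for (const auto& data : layer.outData)
        if (!data->consumers.empty())
            return false;
    return true;
}

// Improved check: consult the set of output names the network itself reports
// (in IE terms, what the network's outputs info provides), per output port,
// instead of inferring "output-ness" from the absence of consumers.
bool isNetworkOutputNew(const Layer& layer,
                        const std::set<std::string>& networkOutputs) {
    for (const auto& data : layer.outData)
        if (networkOutputs.count(data->name))
            return true;
    return false;
}
```

The key point is that the decision is made against the network's declared outputs, per output port, rather than from graph topology alone.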

Alternatively, we could keep the current IE network output detection mechanism and instead make sure that the precision chosen for clDNN output primitives matches the one forced during model read, along the lines of #3405.

JIRA: CVS-43902

@jhajducz jhajducz added bug Something isn't working category: GPU OpenVINO GPU plugin labels Nov 29, 2020
@jhajducz jhajducz added this to the 2021.2 milestone Nov 29, 2020
@jhajducz jhajducz requested review from a team as code owners November 29, 2020 14:35
@jhajducz jhajducz changed the title Improve network outputs detection in quantized FP16+INT8 IR to avoid converting them to FP16 precision [IE CLDNN] Improve network outputs detection in quantized FP16+INT8 IR to avoid converting them to FP16 precision Nov 29, 2020
@jhajducz jhajducz modified the milestones: 2021.2, 2021.3 Nov 30, 2020
@vladimir-paramuzov vladimir-paramuzov merged commit 4a91f91 into openvinotoolkit:master Dec 1, 2020
evolosen pushed a commit to evolosen/openvino that referenced this pull request Dec 3, 2020
mryzhov pushed a commit to mryzhov/openvino that referenced this pull request Dec 11, 2020
mryzhov pushed a commit to mryzhov/openvino that referenced this pull request Dec 16, 2020
mryzhov pushed a commit to mryzhov/openvino that referenced this pull request Jan 14, 2021
jiwaszki pushed a commit to akuporos/openvino that referenced this pull request Jan 15, 2021
Labels: bug (Something isn't working), category: GPU (OpenVINO GPU plugin)