support ernie trt-int8 for inference #32232

Merged: 2 commits, Apr 16, 2021

@ceci3 (Contributor) commented Apr 13, 2021

PR types

New features

PR changes

Others

Describe

Support TRT int8 inference for ERNIE.
Accuracy measured with fake quantization via load_inference_model is 0.7786.
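For readers unfamiliar with the term, here is a minimal sketch of what fake quantization usually means: a value is quantized to int8 and immediately dequantized, so the rounding error shows up while the computation stays in floating point. It assumes symmetric abs-max scaling with a per-tensor threshold. The function names are hypothetical and this is not Paddle's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: symmetric quantization that maps
// [-threshold, threshold] onto the int8 range [-127, 127].
inline std::int8_t QuantizeInt8(float x, float threshold) {
  assert(threshold > 0.0f);
  float scaled = x / threshold * 127.0f;
  // Values outside the threshold saturate instead of wrapping around.
  float clamped = std::max(-127.0f, std::min(127.0f, scaled));
  return static_cast<std::int8_t>(std::lround(clamped));
}

// Hypothetical helper: maps an int8 value back to floating point.
inline float DequantizeInt8(std::int8_t q, float threshold) {
  return static_cast<float>(q) * threshold / 127.0f;
}

// Fake quantization: quantize, then dequantize right away.
inline float FakeQuantize(float x, float threshold) {
  return DequantizeInt8(QuantizeInt8(x, threshold), threshold);
}
```

With a threshold of 1.0, `FakeQuantize(0.5f, 1.0f)` returns a value close to, but not exactly, 0.5. That small error is what the fake-quantized accuracy figure accounts for.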

| layer | prec | prec |
|---|---|---|
| skip ln | FP16 | FP16 |
| qkv2ctx | FP16 | FP16 |
| fc | FP16 | int8 |
| latency (bs=40, T4) | 35.5341 ms | 29.1797 ms |
| acc | 0.7786 | 0.7770 |
| qps | 1898 seq/s | 2310 seq/s |
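To make the trade-off in the table easier to read, the sketch below derives the relative gains from the reported numbers. The struct and function names are hypothetical. The only inputs are the latency, acc, and qps values listed above.

```cpp
#include <cassert>

// Hypothetical container for one column of the benchmark table.
struct RunStats {
  double latency_ms;
  double accuracy;
  double qps;
};

// How many times faster the candidate is in terms of latency.
inline double LatencySpeedup(const RunStats& base, const RunStats& cand) {
  assert(cand.latency_ms > 0.0);
  return base.latency_ms / cand.latency_ms;
}

// How many times higher the candidate's throughput is.
inline double ThroughputGain(const RunStats& base, const RunStats& cand) {
  assert(base.qps > 0.0);
  return cand.qps / base.qps;
}

// Absolute accuracy lost by switching to the candidate.
inline double AccuracyDrop(const RunStats& base, const RunStats& cand) {
  return base.accuracy - cand.accuracy;
}

// Values copied from the table: all-FP16 column vs. fc-in-int8 column.
inline RunStats Fp16Run() { return {35.5341, 0.7786, 1898.0}; }
inline RunStats Int8FcRun() { return {29.1797, 0.7770, 2310.0}; }
```

Running these on the two columns gives roughly a 1.22x latency speedup and a 1.22x throughput gain, for an accuracy drop of about 0.0016.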

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

}
}

bool enable_int8 = mul0_op_desc->HasAttr("enable_int8");
Contributor:

Isn't this a duplicate of the code above?

Contributor Author:

Removed, thanks~

n_output, weight.get(), bias.get());
nvinfer1::ILayer* fc_layer = nullptr;
if (enable_int8) {
CHECK(op_desc.HasAttr("out_threshold"));
Contributor:

Use PADDLE_ENFORCE here and provide an error message.

Contributor Author:

Fixed, thanks~
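The point of the review comment above is that a bare `CHECK` aborts without telling the user what went wrong, whereas an enforce-style check reports which attribute is missing. In Paddle this would presumably take the form `PADDLE_ENFORCE_EQ(op_desc.HasAttr("out_threshold"), true, platform::errors::InvalidArgument(...))`, though the final code is not shown in this conversation. The standalone sketch below illustrates the same idea with a hypothetical `FakeOpDesc` stand-in and a hypothetical `RequireFloatAttr` helper. Neither is part of Paddle.

```cpp
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an op description; not Paddle's OpDesc.
struct FakeOpDesc {
  std::unordered_map<std::string, float> attrs;

  bool HasAttr(const std::string& name) const {
    return attrs.count(name) > 0;
  }
  float GetAttr(const std::string& name) const { return attrs.at(name); }
};

// Returns the attribute value, or throws with a message naming the
// missing attribute instead of aborting silently like a bare CHECK.
inline float RequireFloatAttr(const FakeOpDesc& op, const std::string& name) {
  if (!op.HasAttr(name)) {
    throw std::invalid_argument("int8 mode requires attribute '" + name +
                                "' on the fc op, but it was not found.");
  }
  return op.GetAttr(name);
}
```

A converter could call `RequireFloatAttr(op, "out_threshold")` inside the `enable_int8` branch, so a model exported without calibration scales fails with a readable message.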

@cryoco (Contributor) left a comment

LGTM

@ceci3 ceci3 merged commit 6da043e into PaddlePaddle:develop Apr 16, 2021
@ceci3 ceci3 deleted the ernie_trt_int8 branch April 21, 2021 08:01