
Fp16 support for softmax #14072

Closed
eric-haibin-lin opened this issue Feb 6, 2019 · 2 comments

Comments

@eric-haibin-lin
Member

eric-haibin-lin commented Feb 6, 2019

Currently, given fp16 inputs, nd.softmax/sym.softmax perform the reduction in fp16, which loses precision. The reduction should be done in fp32 instead.

https://github.com/apache/incubator-mxnet/blob/32c9ca74839ae4d275bcf9a027ea0a711373be81/src/operator/nn/softmax-inl.h#L164-L202
PyTorch reference:
https://github.com/zdevito/ATen/blob/a6cc4156fe4abc9e31f62f2bba1a2f68c58b77b7/aten/src/ATen/native/cuda/SoftMax.cu#L43-L55
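
For reference, the gist of the requested change: keep fp16 as the storage type for inputs and outputs, but carry both reductions (the running max and the sum of exponentials) in fp32. A minimal standalone sketch along those lines, assuming a DType that converts to and from float; the function name and structure are illustrative, not MXNet's actual kernel:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Sketch only: DType is the fp16 storage type (e.g. mshadow's half_t or
// CUDA's __half); the accumulators are always fp32 so the reductions do
// not lose precision.
template <typename DType>
void softmax_row_fp32_accum(const DType* in, DType* out, std::size_t n) {
  // 1) Max-reduce in fp32 (for numerical stability of exp()).
  float max_val = static_cast<float>(in[0]);
  for (std::size_t i = 1; i < n; ++i) {
    max_val = std::max(max_val, static_cast<float>(in[i]));
  }

  // 2) Sum-reduce in fp32: summing many small exponentials directly in
  //    fp16 is where the precision loss described above occurs.
  float sum = 0.0f;
  for (std::size_t i = 0; i < n; ++i) {
    sum += std::exp(static_cast<float>(in[i]) - max_val);
  }

  // 3) Normalize, casting back to the fp16 storage type only at the end.
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = static_cast<DType>(
        std::exp(static_cast<float>(in[i]) - max_val) / sum);
  }
}
```

This is essentially the separate-accumulation-type approach used in the linked PyTorch kernel.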

@mxnet-label-bot
Contributor

Hey, this is the MXNet Label Bot.
Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it.
Here are my recommended labels: Feature

@eric-haibin-lin
Member Author

Fixed in #14098.
