This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

mx.nd.topk does not work with ndarray of type float16 #11156

Closed
hetong007 opened this issue Jun 5, 2018 · 3 comments

@hetong007 (Contributor)

There may be other operators that don't work well on certain data types, but I only encountered the float16–topk() pair; I didn't run a full scan of all possibly broken pairs.

The following few lines of code reproduce it, including the error message:

Python 2.7.12 (default, Dec  4 2017, 14:50:18)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet as mx
>>> a = mx.nd.array([1,2,3])
>>> a.topk()

[2.]
<NDArray 1 @cpu(0)>
>>> a.astype('float32').topk()

[2.]
<NDArray 1 @cpu(0)>
>>> a.astype('float16').max()

[3.]
<NDArray 1 @cpu(0)>
>>> a.astype('float16').topk()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/incubator-mxnet/python/mxnet/ndarray/ndarray.py", line 189, in __repr__
    return '\n%s\n<%s %s @%s>' % (str(self.asnumpy()),
  File "/home/ubuntu/incubator-mxnet/python/mxnet/ndarray/ndarray.py", line 1894, in asnumpy
    ctypes.c_size_t(data.size)))
  File "/home/ubuntu/incubator-mxnet/python/mxnet/base.py", line 210, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [18:53:33] include/mxnet/././tensor_blob.h:203: Check failed: mshadow::DataType<DType>::kFlag == type_flag_ TBlob.get_with_shape: data type do not match specified type.Expected: 2 v.s. given 0

Stack trace returned 10 entries:
[bt] (0) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(dmlc::StackTrace[abi:cxx11]()+0x5b) [0x7f96f88955fb]
[bt] (1) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x28) [0x7f96f8896168]
[bt] (2) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(float* mxnet::TBlob::dptr<float>() const+0xfa) [0x7f96f890d2ba]
[bt] (3) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(void mxnet::op::TopKImpl<mshadow::cpu>(mxnet::RunContext, mxnet::Resource, mxnet::TBlob const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, mxnet::op::TopKParam const&)+0x97b) [0x7f96f9d39e0b]
[bt] (4) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(void mxnet::op::TopK<mshadow::cpu>(nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)+0x108) [0x7f96f9d3f5c8]
[bt] (5) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::imperative::PushFCompute(std::function<void (nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)> const&, nnvm::Op const*, nnvm::NodeAttrs const&, mxnet::Context const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::Resource, std::allocator<mxnet::Resource> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<unsigned int, std::allocator<unsigned int> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&)::{lambda(mxnet::RunContext)#1}::operator()(mxnet::RunContext) const+0x294) [0x7f96faf83454]
[bt] (6) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(+0x372bdeb) [0x7f96fb3d0deb]
[bt] (7) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::engine::ThreadedEngine::ExecuteOprBlock(mxnet::RunContext, mxnet::engine::OprBlock*)+0x8e5) [0x7f96fb3cbc75]
[bt] (8) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(std::_Function_handler<void (std::shared_ptr<dmlc::ManualEvent>), mxnet::engine::ThreadedEnginePerDevice::PushToExecute(mxnet::engine::OprBlock*, bool)::{lambda()#1}::operator()() const::{lambda(std::shared_ptr<dmlc::ManualEvent>)#1}>::_M_invoke(std::_Any_data const&, std::shared_ptr<dmlc::ManualEvent>&&)+0xe2) [0x7f96fb3e24b2]
[bt] (9) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(std::thread::_Impl<std::_Bind_simple<std::function<void (std::shared_ptr<dmlc::ManualEvent>)> (std::shared_ptr<dmlc::ManualEvent>)> >::_M_run()+0x4a) [0x7f96fb3dc6ea]
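For reference, `topk()` with default arguments returns the index of the single largest element along the last axis, which is why `a.topk()` on `[1,2,3]` prints `[2.]`. A minimal NumPy sketch of that behavior (a hypothetical helper, not MXNet's implementation — NumPy's `argsort` handles float16 fine, which is what makes the MXNet failure surprising):

```python
import numpy as np

def topk_indices(a, k=1):
    # Mimic mx.nd.topk's default behavior: indices of the k largest
    # elements along the last axis, largest first, returned in the
    # input's dtype (matching the float indices seen in the transcript).
    return np.argsort(a, axis=-1)[..., ::-1][..., :k].astype(a.dtype)

a = np.array([1, 2, 3], dtype=np.float16)
print(topk_indices(a))        # [2.]
print(topk_indices(a, k=2))   # [2. 1.]
```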

Environment info (Required)

----------Python Info----------
('Version      :', '2.7.12')
('Compiler     :', 'GCC 5.4.0 20160609')
('Build        :', ('default', 'Dec  4 2017 14:50:18'))
('Arch         :', ('64bit', 'ELF'))
------------Pip Info-----------
('Version      :', '10.0.0')
('Directory    :', '/usr/local/lib/python2.7/dist-packages/pip')
----------MXNet Info-----------
('Version      :', '1.3.0')
('Directory    :', '/home/ubuntu/incubator-mxnet/python/mxnet')
Hashtag not found. Not installed from pre-built package.
----------System Info----------
('Platform     :', 'Linux-4.4.0-1060-aws-x86_64-with-Ubuntu-16.04-xenial')
('system       :', 'Linux')
('node         :', 'ip-172-31-3-103')
('release      :', '4.4.0-1060-aws')
('version      :', '#69-Ubuntu SMP Sun May 20 13:42:07 UTC 2018')
----------Hardware Info----------
('machine      :', 'x86_64')
('processor    :', 'x86_64')
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                64
On-line CPU(s) list:   0-63
Thread(s) per core:    2
Core(s) per socket:    16
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
Stepping:              1
CPU MHz:               2699.984
CPU max MHz:           3000.0000
CPU min MHz:           1200.0000
BogoMIPS:              4600.12
Hypervisor vendor:     Xen
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              46080K
NUMA node0 CPU(s):     0-15,32-47
NUMA node1 CPU(s):     16-31,48-63
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq monitor est ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single kaiser fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx xsaveopt ida
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0020 sec, LOAD: 0.8709 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0024 sec, LOAD: 0.3203 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0077 sec, LOAD: 0.0831 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0832 sec, LOAD: 0.7508 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.1036 sec, LOAD: 0.0853 sec.
Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.2463 sec, LOAD: 0.1423 sec.
@hetong007 hetong007 added the Bug label Jun 5, 2018
@eric-haibin-lin (Member)

@haojin2 @rahul003

@haojin2 (Contributor)

haojin2 commented Jul 25, 2018

Taking a look at it now

@samskalicky (Contributor)

Rerunning this now results in the following message:

>>> import mxnet as mx
>>> a = mx.nd.array([1,2,3])
>>> a.astype('float16').max()

[3.]
<NDArray 1 @cpu(0)>
>>> a.astype('float16').topk()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/topk_fp16/python/mxnet/ndarray/ndarray.py", line 189, in __repr__
    return '\n%s\n<%s %s @%s>' % (str(self.asnumpy()),
  File "/home/ubuntu/topk_fp16/python/mxnet/ndarray/ndarray.py", line 1972, in asnumpy
    ctypes.c_size_t(data.size)))
  File "/home/ubuntu/topk_fp16/python/mxnet/base.py", line 252, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [22:58:33] /home/ubuntu/topk_fp16/src/operator/tensor/./ordering_op-inl.h:535: This operation does not support float16

This is due to change #12250, which improves the error message to state explicitly that float16 is not supported.

We should change the tags on this issue to [Operator, Feature Request] and remove [Bug], now that it's been handled as not currently supported.
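Until float16 support lands, the usual workaround (visible in the original transcript, where `a.astype('float32').topk()` succeeds) is to upcast before the op. A hedged sketch of that pattern with a hypothetical wrapper, shown in NumPy since the same upcast-then-compute idea applies to `mx.nd`:

```python
import numpy as np

def topk_with_fp16_fallback(a, k=1):
    # Ops lacking a float16 kernel can be run by upcasting to float32
    # first; indices are dtype-independent, so no downcast is needed.
    a32 = a.astype(np.float32) if a.dtype == np.float16 else a
    return np.argsort(a32, axis=-1)[..., ::-1][..., :k]

a = np.array([3, 1, 2], dtype=np.float16)
print(topk_with_fp16_fallback(a, k=2))  # [0 2]
```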
