This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

mx.nd.topk does not work with ndarray of type float16 #11156

Closed
hetong007 opened this issue Jun 5, 2018 · 3 comments

@hetong007 (Contributor)

There may be other operators that don't work well on certain data types, but I only encountered the float16–topk() pair; I didn't run a full scan of all possibly broken pairs.

The following few lines of code reproduce it, including the error message:

Python 2.7.12 (default, Dec  4 2017, 14:50:18)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet as mx
>>> a = mx.nd.array([1,2,3])
>>> a.topk()

[2.]
<NDArray 1 @cpu(0)>
>>> a.astype('float32').topk()

[2.]
<NDArray 1 @cpu(0)>
>>> a.astype('float16').max()

[3.]
<NDArray 1 @cpu(0)>
>>> a.astype('float16').topk()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/incubator-mxnet/python/mxnet/ndarray/ndarray.py", line 189, in __repr__
    return '\n%s\n<%s %s @%s>' % (str(self.asnumpy()),
  File "/home/ubuntu/incubator-mxnet/python/mxnet/ndarray/ndarray.py", line 1894, in asnumpy
    ctypes.c_size_t(data.size)))
  File "/home/ubuntu/incubator-mxnet/python/mxnet/base.py", line 210, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [18:53:33] include/mxnet/././tensor_blob.h:203: Check failed: mshadow::DataType<DType>::kFlag == type_flag_ TBlob.get_with_shape: data type do not match specified type.Expected: 2 v.s. given 0

Stack trace returned 10 entries:
[bt] (0) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(dmlc::StackTrace[abi:cxx11]()+0x5b) [0x7f96f88955fb]
[bt] (1) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x28) [0x7f96f8896168]
[bt] (2) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(float* mxnet::TBlob::dptr<float>() const+0xfa) [0x7f96f890d2ba]
[bt] (3) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(void mxnet::op::TopKImpl<mshadow::cpu>(mxnet::RunContext, mxnet::Resource, mxnet::TBlob const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, mxnet::op::TopKParam const&)+0x97b) [0x7f96f9d39e0b]
[bt] (4) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(void mxnet::op::TopK<mshadow::cpu>(nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)+0x108) [0x7f96f9d3f5c8]
[bt] (5) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::imperative::PushFCompute(std::function<void (nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)> const&, nnvm::Op const*, nnvm::NodeAttrs const&, mxnet::Context const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::Resource, std::allocator<mxnet::Resource> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<unsigned int, std::allocator<unsigned int> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&)::{lambda(mxnet::RunContext)#1}::operator()(mxnet::RunContext) const+0x294) [0x7f96faf83454]
[bt] (6) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(+0x372bdeb) [0x7f96fb3d0deb]
[bt] (7) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::engine::ThreadedEngine::ExecuteOprBlock(mxnet::RunContext, mxnet::engine::OprBlock*)+0x8e5) [0x7f96fb3cbc75]
[bt] (8) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(std::_Function_handler<void (std::shared_ptr<dmlc::ManualEvent>), mxnet::engine::ThreadedEnginePerDevice::PushToExecute(mxnet::engine::OprBlock*, bool)::{lambda()#1}::operator()() const::{lambda(std::shared_ptr<dmlc::ManualEvent>)#1}>::_M_invoke(std::_Any_data const&, std::shared_ptr<dmlc::ManualEvent>&&)+0xe2) [0x7f96fb3e24b2]
[bt] (9) /home/ubuntu/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(std::thread::_Impl<std::_Bind_simple<std::function<void (std::shared_ptr<dmlc::ManualEvent>)> (std::shared_ptr<dmlc::ManualEvent>)> >::_M_run()+0x4a) [0x7f96fb3dc6ea]
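For reference, `topk()` with default arguments returns the index of the single largest element along the last axis, which is why `a.topk()` on `[1,2,3]` prints `[2.]`. A minimal NumPy sketch of that behavior (a hypothetical helper, not MXNet's implementation — NumPy's `argsort` handles float16 fine, which is what makes the MXNet failure surprising):

```python
import numpy as np

def topk_indices(a, k=1):
    # Mimic mx.nd.topk's default behavior: indices of the k largest
    # elements along the last axis, largest first, returned in the
    # input's dtype (matching the float indices seen in the transcript).
    return np.argsort(a, axis=-1)[..., ::-1][..., :k].astype(a.dtype)

a = np.array([1, 2, 3], dtype=np.float16)
print(topk_indices(a))        # [2.]
print(topk_indices(a, k=2))   # [2. 1.]
```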

Environment info (Required)

----------Python Info----------
('Version      :', '2.7.12')
('Compiler     :', 'GCC 5.4.0 20160609')
('Build        :', ('default', 'Dec  4 2017 14:50:18'))
('Arch         :', ('64bit', 'ELF'))
------------Pip Info-----------
('Version      :', '10.0.0')
('Directory    :', '/usr/local/lib/python2.7/dist-packages/pip')
----------MXNet Info-----------
('Version      :', '1.3.0')
('Directory    :', '/home/ubuntu/incubator-mxnet/python/mxnet')
Hashtag not found. Not installed from pre-built package.
----------System Info----------
('Platform     :', 'Linux-4.4.0-1060-aws-x86_64-with-Ubuntu-16.04-xenial')
('system       :', 'Linux')
('node         :', 'ip-172-31-3-103')
('release      :', '4.4.0-1060-aws')
('version      :', '#69-Ubuntu SMP Sun May 20 13:42:07 UTC 2018')
----------Hardware Info----------
('machine      :', 'x86_64')
('processor    :', 'x86_64')
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                64
On-line CPU(s) list:   0-63
Thread(s) per core:    2
Core(s) per socket:    16
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
Stepping:              1
CPU MHz:               2699.984
CPU max MHz:           3000.0000
CPU min MHz:           1200.0000
BogoMIPS:              4600.12
Hypervisor vendor:     Xen
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              46080K
NUMA node0 CPU(s):     0-15,32-47
NUMA node1 CPU(s):     16-31,48-63
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq monitor est ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single kaiser fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx xsaveopt ida
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0020 sec, LOAD: 0.8709 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0024 sec, LOAD: 0.3203 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0077 sec, LOAD: 0.0831 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0832 sec, LOAD: 0.7508 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.1036 sec, LOAD: 0.0853 sec.
Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.2463 sec, LOAD: 0.1423 sec.
@hetong007 hetong007 added the Bug label Jun 5, 2018
@eric-haibin-lin (Member)

@haojin2 @rahul003

@haojin2 (Contributor)

haojin2 commented Jul 25, 2018

Taking a look at it now

@samskalicky (Contributor)

Rerunning this now results in the following message:

>>> import mxnet as mx
>>> a = mx.nd.array([1,2,3])
>>> a.astype('float16').max()

[3.]
<NDArray 1 @cpu(0)>
>>> a.astype('float16').topk()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/topk_fp16/python/mxnet/ndarray/ndarray.py", line 189, in __repr__
    return '\n%s\n<%s %s @%s>' % (str(self.asnumpy()),
  File "/home/ubuntu/topk_fp16/python/mxnet/ndarray/ndarray.py", line 1972, in asnumpy
    ctypes.c_size_t(data.size)))
  File "/home/ubuntu/topk_fp16/python/mxnet/base.py", line 252, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [22:58:33] /home/ubuntu/topk_fp16/src/operator/tensor/./ordering_op-inl.h:535: This operation does not support float16

This is due to change #12250, which improves the error message to state explicitly that float16 is not supported.

We should change the tags on this issue to [Operator, Feature Request] and remove [Bug], now that it's been handled as not currently supported.
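Until float16 support lands, the usual workaround (visible in the original transcript, where `a.astype('float32').topk()` succeeds) is to upcast before the op. A hedged sketch of that pattern with a hypothetical wrapper, shown in NumPy since the same upcast-then-compute idea applies to `mx.nd`:

```python
import numpy as np

def topk_with_fp16_fallback(a, k=1):
    # Ops lacking a float16 kernel can be run by upcasting to float32
    # first; indices are dtype-independent, so no downcast is needed.
    a32 = a.astype(np.float32) if a.dtype == np.float16 else a
    return np.argsort(a32, axis=-1)[..., ::-1][..., :k]

a = np.array([3, 1, 2], dtype=np.float16)
print(topk_with_fp16_fallback(a, k=2))  # [0 2]
```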
