This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

It seems nd.array or NDArrayIter not support too large array #9207

Closed
fcbruce opened this issue Dec 27, 2017 · 6 comments

Comments

@fcbruce

fcbruce commented Dec 27, 2017

Description

I have a large NumPy array that cannot be converted into an nd.array.

Environment info (Required)

CentOS and MacOS

Package used (Python/R/Scala/Julia):
Python

Build info (Required if built from source)

Installed via pip, mxnet==1.0.0

Error Message:

[15:31:56] /Users/travis/build/dmlc/mxnet-distro/mxnet-build/dmlc-core/include/dmlc/logging.h:308: [15:31:56] include/mxnet/./tensor_blob.h:275: Check failed: this->shape_.Size() == shape.Size() (6553600000 vs. 2258632704) TBlob.get_with_shape: new and old shape do not match total elements

Stack trace returned 7 entries:
[bt] (0) 0   libmxnet.so                         0x0000000d378eaad8 _ZN4dmlc15LogMessageFatalD2Ev + 40
[bt] (1) 1   libmxnet.so                         0x0000000d3790cae9 _ZNK5mxnet5TBlob14get_with_shapeIN7mshadow3cpuELi1EfEENS2_6TensorIT_XT0_ET1_EERKNS2_5ShapeIXT0_EEEPNS2_6StreamIS5_EE + 777
[bt] (2) 2   libmxnet.so                         0x0000000d380fa0be _ZN5mxnet7ndarray4CopyIN7mshadow3cpuES3_EEvRKNS_5TBlobEPS4_NS_7ContextES8_NS_10RunContextE + 14382
[bt] (3) 3   libmxnet.so                         0x0000000d380d9673 _ZNK5mxnet7NDArray15SyncCopyFromCPUEPKvm + 1139
[bt] (4) 4   libmxnet.so                         0x0000000d37fcc1fd MXNDArraySyncCopyFromCPU + 13
[bt] (5) 5   _ctypes.cpython-36m-darwin.so       0x0000000101eb742f ffi_call_unix64 + 79
[bt] (6) 6   ???                                 0x00007fff5e83a820 0x0 + 140734779074592

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/mxnet/ndarray.py", line 1295, in array
    arr[:] = source_array
  File "/usr/local/lib/python3.6/site-packages/mxnet/ndarray.py", line 386, in __setitem__
    self._sync_copyfrom(value)
  File "/usr/local/lib/python3.6/site-packages/mxnet/ndarray.py", line 560, in _sync_copyfrom
    ctypes.c_size_t(source_array.size)))
  File "/usr/local/lib/python3.6/site-packages/mxnet/base.py", line 129, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [15:31:56] include/mxnet/./tensor_blob.h:275: Check failed: this->shape_.Size() == shape.Size() (6553600000 vs. 2258632704) TBlob.get_with_shape: new and old shape do not match total elements

Stack trace returned 7 entries:
[bt] (0) 0   libmxnet.so                         0x0000000d378eaad8 _ZN4dmlc15LogMessageFatalD2Ev + 40
[bt] (1) 1   libmxnet.so                         0x0000000d3790cae9 _ZNK5mxnet5TBlob14get_with_shapeIN7mshadow3cpuELi1EfEENS2_6TensorIT_XT0_ET1_EERKNS2_5ShapeIXT0_EEEPNS2_6StreamIS5_EE + 777
[bt] (2) 2   libmxnet.so                         0x0000000d380fa0be _ZN5mxnet7ndarray4CopyIN7mshadow3cpuES3_EEvRKNS_5TBlobEPS4_NS_7ContextES8_NS_10RunContextE + 14382
[bt] (3) 3   libmxnet.so                         0x0000000d380d9673 _ZNK5mxnet7NDArray15SyncCopyFromCPUEPKvm + 1139
[bt] (4) 4   libmxnet.so                         0x0000000d37fcc1fd MXNDArraySyncCopyFromCPU + 13
[bt] (5) 5   _ctypes.cpython-36m-darwin.so       0x0000000101eb742f ffi_call_unix64 + 79
[bt] (6) 6   ???                                 0x00007fff5e83a820 0x0 + 140734779074592

Minimum reproducible example

import numpy as np
import mxnet as mx
X = np.zeros((20000, 32768), dtype=np.float32)
mx.nd.array(X)

Steps to reproduce

Just run the code above.

@fcbruce
Author

fcbruce commented Dec 27, 2017

Sorry, the version on CentOS is 1.0.0 and on MacOS is 0.11.0

@mwbyeon
Contributor

mwbyeon commented Dec 27, 2017

@fcbruce

It works.

$ pip install mxnet
$ python
Python 3.6.3 (default, Oct 19 2017, 23:50:38)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet as mx
>>> mx.__version__
'1.0.0'
>>> import numpy as np
>>> X = np.zeros((20000, 32768), dtype=np.float32)
>>> mx.nd.array(X)

[[ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 ...,
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]]
<NDArray 20000x32768 @cpu(0)>
>>>

What is your memory capacity?
Your array requires about 2.4 GiB of memory (20000 * 32768 * 4 bytes).
(I tested on macOS with 16 GB of memory.)
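
The estimate above can be checked with a few lines of arithmetic. This is only a sketch: the names `shape`, `ITEMSIZE`, and `GIB` are illustrative, and the 4-byte item size assumes float32 as in the example.

```python
# Shape from the example above; float32 uses 4 bytes per element.
shape = (20000, 32768)
ITEMSIZE = 4
GIB = 1024 ** 3

num_elements = shape[0] * shape[1]
total_bytes = num_elements * ITEMSIZE

print(total_bytes)                   # 2621440000
print(round(total_bytes / GIB, 2))   # 2.44
```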

If the array exceeds the memory capacity, the error message above occurs:

>>> mx.nd.array(np.zeros((200000, 32768), dtype=np.float32))
[17:23:34] /Users/travis/build/dmlc/mxnet-distro/mxnet-build/dmlc-core/include/dmlc/logging.h:308: [17:23:34] include/mxnet/./tensor_blob.h:276: Check failed: this->shape_.Size() == shape.Size() (6553600000 vs. 2258632704) TBlob.get_with_shape: new and old shape do not match total elements

Stack trace returned 7 entries:
[bt] (0) 0   libmxnet.so                         0x0000000107116b98 _ZN4dmlc15LogMessageFatalD2Ev + 40
[bt] (1) 1   libmxnet.so                         0x000000010713b7c9 _ZNK5mxnet5TBlob14get_with_shapeIN7mshadow3cpuELi1EfEENS2_6TensorIT_XT0_ET1_EERKNS2_5ShapeIXT0_EEEPNS2_6StreamIS5_EE + 777
[bt] (2) 2   libmxnet.so                         0x000000010819427e _ZN5mxnet7ndarray4CopyIN7mshadow3cpuES3_EEvRKNS_5TBlobEPS4_NS_7ContextES8_NS_10RunContextE + 14382
[bt] (3) 3   libmxnet.so                         0x000000010816e1e5 _ZNK5mxnet7NDArray15SyncCopyFromCPUEPKvm + 1109
[bt] (4) 4   libmxnet.so                         0x0000000107ffb2cd MXNDArraySyncCopyFromCPU + 13
[bt] (5) 5   _ctypes.cpython-36m-darwin.so       0x000000010628d02f ffi_call_unix64 + 79
[bt] (6) 6   ???                                 0x00007ffeea4422a0 0x0 + 140732828754592

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dylan/.pyenv/versions/mxnet/lib/python3.6/site-packages/mxnet/ndarray/utils.py", line 146, in array
    return _array(source_array, ctx=ctx, dtype=dtype)
  File "/Users/dylan/.pyenv/versions/mxnet/lib/python3.6/site-packages/mxnet/ndarray/ndarray.py", line 2245, in array
    arr[:] = source_array
  File "/Users/dylan/.pyenv/versions/mxnet/lib/python3.6/site-packages/mxnet/ndarray/ndarray.py", line 437, in __setitem__
    self._set_nd_basic_indexing(key, value)
  File "/Users/dylan/.pyenv/versions/mxnet/lib/python3.6/site-packages/mxnet/ndarray/ndarray.py", line 698, in _set_nd_basic_indexing
    self._sync_copyfrom(value)
  File "/Users/dylan/.pyenv/versions/mxnet/lib/python3.6/site-packages/mxnet/ndarray/ndarray.py", line 863, in _sync_copyfrom
    ctypes.c_size_t(source_array.size)))
  File "/Users/dylan/.pyenv/versions/mxnet/lib/python3.6/site-packages/mxnet/base.py", line 146, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [17:23:34] include/mxnet/./tensor_blob.h:276: Check failed: this->shape_.Size() == shape.Size() (6553600000 vs. 2258632704) TBlob.get_with_shape: new and old shape do not match total elements

Stack trace returned 7 entries:
[bt] (0) 0   libmxnet.so                         0x0000000107116b98 _ZN4dmlc15LogMessageFatalD2Ev + 40
[bt] (1) 1   libmxnet.so                         0x000000010713b7c9 _ZNK5mxnet5TBlob14get_with_shapeIN7mshadow3cpuELi1EfEENS2_6TensorIT_XT0_ET1_EERKNS2_5ShapeIXT0_EEEPNS2_6StreamIS5_EE + 777
[bt] (2) 2   libmxnet.so                         0x000000010819427e _ZN5mxnet7ndarray4CopyIN7mshadow3cpuES3_EEvRKNS_5TBlobEPS4_NS_7ContextES8_NS_10RunContextE + 14382
[bt] (3) 3   libmxnet.so                         0x000000010816e1e5 _ZNK5mxnet7NDArray15SyncCopyFromCPUEPKvm + 1109
[bt] (4) 4   libmxnet.so                         0x0000000107ffb2cd MXNDArraySyncCopyFromCPU + 13
[bt] (5) 5   _ctypes.cpython-36m-darwin.so       0x000000010628d02f ffi_call_unix64 + 79
[bt] (6) 6   ???                                 0x00007ffeea4422a0 0x0 + 140732828754592

>>>

@fcbruce
Author

fcbruce commented Dec 27, 2017

@mwbyeon Sorry, my mistake, I dropped a zero:

import numpy as np
import mxnet as mx
X = np.zeros((200000, 32768), dtype=np.float32)
mx.nd.array(X)

My macOS machine has 8 GB of memory and the CentOS machine has 64 GB.

@wkcn
Member

wkcn commented Dec 28, 2017

I get the error:

mxnet.base.MXNetError: [16:14:07] g:\deeplearn\mxnet\include\mxnet\./tensor_blob.h:275: Check failed: this->shape_.Size() == shape.Size() (6553600000 vs. 2258632704) TBlob.get_with_shape: new and old shape do not match total elements

It seems there is an overflow in the struct Shape.
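
The two numbers in the error message are consistent with that reading. A minimal check, assuming the element count is held in an unsigned 32-bit integer somewhere along the copy path (an inference from the numbers, not something confirmed from the MXNet source here):

```python
# Element count of the (200000, 32768) array from the reproduction above.
rows, cols = 200000, 32768
num_elements = rows * cols

# Assumption: the size is stored in an unsigned 32-bit integer,
# so it wraps around modulo 2**32.
UINT32_MODULUS = 2 ** 32
wrapped = num_elements % UINT32_MODULUS

print(num_elements)  # 6553600000
print(wrapped)       # 2258632704
```

Both values match the pair reported by the failed check (6553600000 vs. 2258632704), which would explain why the error also appears on a machine with 64 GB of memory.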

@anirudh2290
Copy link
Member

int64 type for tensor dimension sizes is not supported yet but there is a plan to support it. Please see: #10158
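
Until that support is available, one option may be to check the element count before calling mx.nd.array. The helper below is a hypothetical sketch, not part of MXNet; the limit of 2**32 - 1 is an assumption inferred from the wrapped value in the error message, and `math.prod` requires Python 3.8 or later.

```python
import math

# Assumption: sizes wrap at 2**32, as the error message numbers suggest.
UINT32_MAX = 2 ** 32 - 1


def exceeds_32bit_size(shape, limit=UINT32_MAX):
    """Return True if the total element count of `shape` is above `limit`.

    Hypothetical helper for illustration; it is not an MXNet API.
    """
    return math.prod(shape) > limit


print(exceeds_32bit_size((20000, 32768)))   # False
print(exceeds_32bit_size((200000, 32768)))  # True
```

An array that fails this check could be split along its first axis into pieces that each pass it before conversion.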

@apeforest
Copy link
Contributor

Verified the fix with PR #11742. @sandeep-krishnamurthy Please close this issue. Thanks!
