fix operators to support large arrays. #13036

We're working on a model that requires very large NDArrays. For example, we want to create an NDArray whose element count exceeds what a 32-bit index can address (see the sketch below). The current implementation doesn't fail with an error, but it doesn't generate the matrix correctly (it only fills the rows at the beginning). `mx.nd.zeros` also fails. It's unclear which operators support large arrays and which don't.
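An illustrative sketch of the failure (the shape is a stand-in, not the exact array from our model; any NDArray with more than 2**31 - 1 elements behaves the same way):

```python
import mxnet as mx

# 50000 * 50000 = 2.5e9 elements, past the 2**31 - 1 limit of a 32-bit int.
# Illustrative shape only; this allocates roughly 10 GB of float32.
a = mx.nd.ones((50000, 50000))
print(a[-1])  # on an affected build, trailing rows stay 0 instead of 1
```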
Comments
@mxnet-label-bot [Bug, Operator]
The type of the iterator is `int`. Please see the code:
Will take a look
@wkcn is right. We need to change `int` to `index_t`. I am busy with other tasks now and can only get to this in one week. Let me know if it requires an immediate fix.
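For context, a quick check of the boundary a 32-bit `int` iterator overflows past (plain Python arithmetic, using the shape from the sketch above):

```python
INT32_MAX = 2**31 - 1             # 2,147,483,647: largest signed 32-bit value
print(50000 * 50000)              # 2,500,000,000 elements
print(50000 * 50000 > INT32_MAX)  # True, so an `int` loop counter wraps around
```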
JIRA task created: https://issues.apache.org/jira/browse/MXNET-1185
@apeforest
I'm fixing some of the operators, but we need a systematic fix: the problem is everywhere. I'll provide a temporary fix for some of the operators.
@wkcn On CPU, it shouldn't be a problem. I heard concerns about GPUs. Potentially, we can use `int64_t` for CPU and `int` for GPU.
@zheng-da
@wkcn My concern is that this modification makes the code complex. As for using different int types for CPU and GPU, that is relatively easy: we can use a template argument to achieve it.
@pengzhao-intel What is the performance difference between int32 and int64 on Intel CPUs?
@apeforest I have fixed some of the operators, including all the random generators, `zeros`, `ones`, `full`, `arange`, and `gather_nd`.
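A sketch of the kind of calls these fixes target (shapes are illustrative and assume a large-tensor build):

```python
import mxnet as mx

# Each dense array below is ~8.6 GB of float32, so run them one at a
# time on a machine with enough RAM.
n = 2**31 + 1                         # just past the 32-bit int boundary
z = mx.nd.zeros((n,))                 # zeros
o = mx.nd.ones((n,))                  # ones
f = mx.nd.full((n,), 3.14)            # full
r = mx.nd.random.uniform(shape=(n,))  # random generators
i = mx.nd.arange(0, n, step=2**20)    # arange over a large range (small output)
```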
@zheng-da Maybe
@zheng-da Do you plan to create a PR with your change? I will be glad to review. Also, I have created an epic (https://issues.apache.org/jira/browse/MXNET-1184) to address this support in a systematic way. Please feel free to add additional tasks to it as needed. Thanks.
@zheng-da In general, int64 performance is only about half that of int32.
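A rough way to sanity-check that claim from Python (numpy as a proxy for vectorized integer throughput; this does not measure MXNet's internal index arithmetic):

```python
import time
import numpy as np

# Compare vectorized integer arithmetic throughput for int32 vs int64.
for dtype in (np.int32, np.int64):
    a = np.ones(10**8, dtype=dtype)
    start = time.time()
    for _ in range(10):
        a = a + 1
    print(dtype.__name__, '%.3f s' % (time.time() - start))
```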
Hi, I modified some of the operators, and I wrote a script to replace the type of the iterator with `index_t` (a sketch of the idea is below). However, there were some bugs in the script. :-(
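The original script isn't shown above; a minimal hypothetical sketch of the idea (the script name and regex are mine, not the author's; naive regex rewriting is exactly where bugs creep in):

```python
import re
import sys

# Hypothetical sketch: rewrite `for (int ...)` loop iterators to `index_t`
# in C++ source files passed on the command line. A real fix needs review;
# blind replacement will also change loops that should stay 32-bit.
PATTERN = re.compile(r'\bfor\s*\(\s*int\b')

def rewrite(path):
    with open(path) as f:
        src = f.read()
    with open(path, 'w') as f:
        f.write(PATTERN.sub('for (index_t', src))

if __name__ == '__main__':
    for path in sys.argv[1:]:
        rewrite(path)
```

Invoked as something like `python fix_iterators.py src/operator/*.cc` (paths hypothetical).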
@apeforest I just fixed the operators I use in my model. Could you help add tests and fix the other operators?
In my test, it seems that the performance of int64 is close to that of int32. CPU: Intel i7-7500U.
Integer operations are cheap. Even if int64 is a little more expensive, it's hard to believe that it can affect the overall performance by much.
Try the GEMM with int32 and int64.
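A sketch of such a comparison from the Python side (run once against an int32 build and once against an int64 build; sizes are illustrative):

```python
import time
import mxnet as mx

# Time a large float32 GEMM; compare wall time between builds compiled
# with 32-bit and 64-bit tensor indexing.
n = 4096
a = mx.nd.random.uniform(shape=(n, n))
b = mx.nd.random.uniform(shape=(n, n))
mx.nd.dot(a, b).wait_to_read()  # warm-up

start = time.time()
for _ in range(10):
    c = mx.nd.dot(a, b)
c.wait_to_read()  # MXNet executes asynchronously; block before reading the clock
print('avg GEMM time: %.3f s' % ((time.time() - start) / 10))
```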
I believe the following is also a repro of this issue (note that the top-left element should be 2, since it lies on the diagonal):

```python
import mxnet as mx
mx.nd.eye(10240 * 5) * 2
```

Output:

```
[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
```
This issue has been fixed. In the 1.5.0 release, users need to build MXNet from source with the compilation flag `USE_INT64_TENSOR_SIZE=1`. We are working to make this flag on by default and available in the pip package in the next minor release. Closing this issue for now.
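After building with that flag, a quick sanity check from Python (a sketch; the shape just needs to exceed 2**31 - 1 elements):

```python
import mxnet as mx

# On a build with USE_INT64_TENSOR_SIZE=1 this allocates and counts correctly;
# on a 32-bit-index build it hits exactly the failure mode reported above.
x = mx.nd.zeros((2**31 + 1,))
print(x.size)  # expect 2147483649
```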