mxnet R updated version out of memory error #11682
Comments
How did you install mxnet 1.3.0? Can you provide some code that makes mxnet run out of memory, to help us reproduce the error?
Yes, I built the GPU version from source. As described here, https://github.com/apache/incubator-mxnet/tree/master/R-package, "bump up version number to 1.3.0 to make nightly build to build with r…", so the mxnet R package is 1.3.0.
The error is:
Error in mx.nd.internal.as.array(nd) :
Stack trace returned 10 entries:
When you run the code on the new mxnet version, please change Line 124 in CGAN_train.R to
Change Line 133 to metric_D_value <- metric_D$update(as.array(mx.nd.array(rep(1, batch_size))), as.array(exec_D$ref.outputs[["D_sym_output"]]), metric_D_value)
@sandeep-krishnamurthy please close this issue.
I recently updated the mxnet package to 1.3.0 and keep hitting an out-of-memory error. The same code ran on mxnet 1.0.1 and earlier without ever producing the following error.
Error in mx.nd.internal.as.array(nd) :
[00:58:56] src/storage/./pooled_storage_manager.h:118: cudaMalloc failed: out of memory
Stack trace returned 10 entries:
[bt] (0) /home/username/R/x86_64-redhat-linux-gnu-library/3.5/mxnet/libs/libmxnet.so(dmlc::StackTrace()+0x4a) [0x7f6e319552ba]
[bt] (1) /home/username/R/x86_64-redhat-linux-gnu-library/3.5/mxnet/libs/libmxnet.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x21) [0x7f6e319558c1]
[bt] (2) /home/username/R/x86_64-redhat-linux-gnu-library/3.5/mxnet/libs/libmxnet.so(mxnet::storage::GPUPooledStorageManager::Alloc(mxnet::Storage::Handle*)+0x1bb) [0x7f6e3447ed2b]
[bt] (3) /home/username/R/x86_64-redhat-linux-gnu-library/3.5/mxnet/libs/libmxnet.so(mxnet::StorageImpl::Alloc(mxnet::Storage::Handle*)+0x55) [0x7f6e344818a5]
[bt] (4) /home/username/R/x86_64-redhat-linux-gnu-library/3.5/mxnet/libs/libmxnet.so(mxnet::NDArray::CheckAndAlloc() const+0x19b) [0x7f6e31a4a7db]
[bt] (5) /home/username/R/x86_64-redhat-linux-gnu-library/3.5/mxnet/libs/libmxnet.so(+0x331257d) [0x7f6e33fc45
When I first encountered this problem, I could restart R with q("no") and the code would then train without the error. Now, simply restarting R no longer solves the issue.
I also noticed that the official webpage lists mxnet 1.2.0, but sessionInfo() in my R session shows that mxnet 1.3.0 is loaded.
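If restarting R no longer clears the error, one thing worth checking is whether GPU memory is already in use before training starts, for example by a leftover process. The following is only a minimal sketch: the helper name gpu_memory_report is hypothetical, and it assumes the nvidia-smi command-line tool is on the PATH. The gc() call rests on the assumption that mxnet's R NDArray handles are released when R garbage-collects them; that is not confirmed in this thread.

```r
# Hypothetical helper: report used/total GPU memory through nvidia-smi.
# Assumes the nvidia-smi tool is installed and on the PATH; returns an
# empty character vector when it is not found.
gpu_memory_report <- function() {
  if (!nzchar(Sys.which("nvidia-smi"))) {
    return(character(0))
  }
  system2("nvidia-smi",
          args = c("--query-gpu=memory.used,memory.total", "--format=csv"),
          stdout = TRUE)
}

# Run this in a fresh R session, before loading mxnet or any data.
print(gpu_memory_report())

# Assumption: unreferenced NDArray objects are freed on garbage collection,
# so an explicit gc() between training runs may return some memory.
invisible(gc())
```

If the report already shows most of the memory in use in a fresh session, the allocation failure is probably not caused by the training script alone.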
R version 3.5.0 (2018-04-23)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets
[6] methods base
other attached packages:
[1] caret_6.0-80 ggplot2_2.2.1 lattice_0.20-35
[4] mxnet_1.3.0 readr_1.1.1 dplyr_0.7.5
[7] imager_0.41.1 magrittr_1.5
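To confirm which mxnet build an R session will pick up, the installed version and its library path can be queried directly. This is a small sketch using only base R and utils functions; the helper name installed_version is hypothetical. A package built from the master branch reports the version number set in the R-package sources, which can be ahead of the release documented on the website.

```r
# Hypothetical helper: return the installed version of a package as a
# string, or NA if the package is not installed. Does not load the package.
installed_version <- function(pkg) {
  if (!nzchar(system.file(package = pkg))) {
    return(NA_character_)
  }
  as.character(utils::packageVersion(pkg))
}

# Version of the mxnet R package found on the library path (NA if absent).
print(installed_version("mxnet"))

# Directory the package would be loaded from; empty if it is not installed.
print(find.package("mxnet", quiet = TRUE))
```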
Please provide some guidance on solving this issue. Thanks!