Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Bug in bipartite_matching #13074

Closed
wshuail opened this issue Nov 1, 2018 · 3 comments
Closed

Bug in bipartite_matching #13074

wshuail opened this issue Nov 1, 2018 · 3 comments

Comments

@wshuail
Copy link

wshuail commented Nov 1, 2018

Hi when I use SSD for my own dataset, I found a bug in mx.sym.contrib.bipartite_matching for mxnet 1.3.x.

it's easy to reproduce with the code as below. But maybe you have to try many times.

import mxnet as mx
from mxnet import nd

for _ in range(10000):
x = nd.random.uniform(0, 1, (10, 100))
output = nd.contrib.bipartite_matching(data=x, threshold=1e-12, is_ascend=False)

The error informations change sometimes, but generally it's about the memory.
sometimes, it's memory corruption. Sometimes like the message below.

*** Error in `python': double free or corruption (out): 0x00007f246400ce40 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7d053)[0x7f25523dc053]
/home/xxx/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet2op9SortByKeyIifEEvN7mshadow6TensorINS2_3cpuELi1ET_EENS3_IS4_Li1ET0_EEbPNS3_IS4_Li1EcEEii+0x3
62)[0x7f24aaf5f2d2]
/home/xxx/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(ZN5mxnet2op24BipartiteMatchingForwardIN7mshadow3cpuEEEvRKN4nnvm9NodeAttrsERKNS_9OpContextERKSt6vector
INS_5TBlobESaISC_EERKSB_INS_9OpReqTypeESaISH_EESG
+0x129f)[0x7f24aaf63cff]
/home/xxx/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZZN5mxnet10imperative12PushFComputeERKSt8functionIFvRKN4nnvm9NodeAttrsERKNS_9OpContextERKSt6vectorINS
5TBlobESaISA_EERKS9_INS_9OpReqTypeESaISF_EESE_EEPKNS2_2OpES5_RKNS_7ContextERKS9_IPNS_6engine3VarESaISW_EES10_RKS9_INS_8ResourceESaIS11_EERKS9_IPNS_7NDArrayESaIS17_EES1B_RKS
9_IjSaIjEESJ_ENKUlNS_10RunContextEE_clES1G
+0x2e8)[0x7f24ab33ac18]
/home/xxx/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(+0x3b47dc9)[0x7f24ab754dc9]
/home/xxx/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet6engine14ThreadedEngine15ExecuteOprBlockENS_10RunContextEPNS0_8OprBlockE+0x599)[0x7f24ab7503d
9]
/home/xxx/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(ZNSt17_Function_handlerIFvSt10shared_ptrIN4dmlc11ManualEventEEEZZN5mxnet6engine23ThreadedEnginePerDev
ice13PushToExecuteEPNS6_8OprBlockEbENKUlvE_clEvEUlS3_E_E9_M_invokeERKSt9_Any_dataS3
+0xd2)[0x7f24ab760f32]
/home/xxx/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZNSt6thread5_ImplISt12_Bind_simpleIFSt8functionIFvSt10shared_ptrIN4dmlc11ManualEventEEEES6_EEE6_M_run
Ev+0x44)[0x7f24ab74fd14]
/lib64/libstdc++.so.6(+0xb5220)[0x7f253a198220]
/lib64/libpthread.so.0(+0x7dc5)[0x7f2552e31dc5]
/lib64/libc.so.6(clone+0x6d)[0x7f2552455ced]

This doesn't happen in mxnet 1.1.0, but gluoncv needs mxnet 1.3.0.

@frankfliu
Copy link
Contributor

@mxnet-label-bot [Bug, Operator]

@vuvko
Copy link

vuvko commented Nov 6, 2018

Can confirm this problem on CPU. On GPU everything is fine.

import mxnet as mx
from mxnet import nd

for _ in range(10):
    x = nd.random.uniform(0, 1, (10, 10))
    output = nd.contrib.bipartite_matching(data=x, threshold=1e-12, is_ascend=False)

results in free(): invalid pointer. While

import mxnet as mx
from mxnet import nd

for _ in range(10):
    x = nd.random.uniform(0, 1, (10, 10), ctx=mx.gpu(0))
    output = nd.contrib.bipartite_matching(data=x, threshold=1e-12, is_ascend=False)

seems to works just fine

@wshuail
Copy link
Author

wshuail commented Jan 15, 2019

##13727
#dmlc/gluon-cv#529

This was fixed already. Thx.

@wshuail wshuail closed this as completed Jan 15, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants