Mxnet gets stuck when run example/image-classification/train_mnist.py #1468
Comments
And here is the gdb backtrace output at the stuck point:
(gdb) bt
rtld_fini=, stack_end=0x7fffffffe3f8) at libc-start.c:289
#32 0x000000000049a429 in _start ()
This issue is closed due to lack of activity in the last 90 days. Feel free to reopen if this is still an active issue. Thanks!
The process gets stuck when I run train_mnist.py on Ubuntu 15.04. The only change I made was to config.mk, to use OpenBLAS as the backend, before building mxnet itself.
The last few lines of the output of
strace python train_mnist.py
are:
clone(child_stack=0x7f71d3ffeff0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f71d3fff9d0, tls=0x7f71d3fff700, child_tidptr=0x7f71d3fff9d0) = 2337
futex(0x24533cc, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x24533c8, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
brk(0x292f000) = 0x292f000
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=414, ...}) = 0
write(2, "2016-02-14 18:32:45,836 Node[0] "..., 612016-02-14 18:32:45,836 Node[0] Start training with [cpu(0)]
) = 61
brk(0x2992000) = 0x2992000
brk(0x29f4000) = 0x29f4000
brk(0x2a5e000) = 0x2a5e000
brk(0x2acd000) = 0x2acd000
futex(0x23e0aa4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x23e0aa0, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x23e0a70, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x23e0ad0, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x23e0ad4, FUTEX_WAIT_PRIVATE, 3, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x23e0a70, FUTEX_WAKE_PRIVATE, 1) = 0
brk(0x2aee000) = 0x2aee000
futex(0x7f71dc001260, FUTEX_WAKE_PRIVATE, 1) = 1
brk(0x2b50000) = 0x2b50000
futex(0x7f71dc000e6c, FUTEX_WAIT_PRIVATE, 1, NULL) = 0
futex(0x7f71dc000e40, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x23e0aa4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x23e0aa0, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x7f71dc000e6c, FUTEX_WAIT_PRIVATE, 3, NULL) = 0
futex(0x7f71dc000e40, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x23e0aa4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x23e0aa0, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x7f71dc000e6c, FUTEX_WAIT_PRIVATE, 5, NULL) = 0
futex(0x7f71dc000e40, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f71dc000e6c, FUTEX_WAIT_PRIVATE, 7, NULL) = 0
futex(0x7f71dc000e40, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f71dc000e6c, FUTEX_WAIT_PRIVATE, 9, NULL) = 0
futex(0x7f71dc000e40, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x23e0aa4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x23e0aa0, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x7f71dc000e6c, FUTEX_WAIT_PRIVATE, 11, NULL) = 0
futex(0x7f71dc000e40, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f71dc000e6c, FUTEX_WAIT_PRIVATE, 13, NULL) = 0
futex(0x7f71dc000e40, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f71dc000e6c, FUTEX_WAIT_PRIVATE, 15, NULL) = 0
futex(0x7f71dc000e40, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x23e0aa4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x23e0aa0, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x7f71dc000e6c, FUTEX_WAIT_PRIVATE, 17, NULL
It seems to be stuck waiting on a mutex: the trace ends in a futex FUTEX_WAIT_PRIVATE call that never returns. Does anyone know what is going on?
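For anyone reading a trace like the one above: the final futex line has no closing parenthesis and no return value, which is how strace shows a call that was entered but has not returned. Below is a minimal Python sketch for picking that in-flight call out of saved strace output. The function name `last_call_in_flight` and the regular expressions are illustrative assumptions, not part of strace or mxnet, and the sketch only handles output shaped like the trace in this post (including the case where the program's own stderr text is interleaved with a `write` line).

```python
import re

# Start of a syscall line, e.g. "futex(0x7f71dc000e6c, ..."
SYSCALL_START = re.compile(r"^(\w+)\(")
# A finished call carries ") = <return value>" somewhere on the line.
SYSCALL_DONE = re.compile(r"\)\s*=\s*\S")
# A bare ") = N" line closes a call whose output was split by interleaved stderr text.
CONTINUATION_DONE = re.compile(r"^\)\s*=\s*\S")


def last_call_in_flight(lines):
    """Return (syscall_name, line) for the call still pending when the trace ends, else None."""
    pending = None
    for raw in lines:
        line = raw.strip()
        match = SYSCALL_START.match(line)
        if match:
            # A new syscall line: pending only if it shows no return value.
            pending = None if SYSCALL_DONE.search(line) else (match.group(1), line)
        elif CONTINUATION_DONE.match(line):
            # The return value of the previous, split call arrived on its own line.
            pending = None
    return pending


# A few lines copied from the trace in this post.
sample = [
    "brk(0x2b50000) = 0x2b50000",
    'write(2, "2016-02-14 18:32:45,836 Node[0] "..., 612016-02-14 18:32:45,836 Node[0] Start training with [cpu(0)]',
    ") = 61",
    "futex(0x23e0ad4, FUTEX_WAIT_PRIVATE, 3, NULL) = -1 EAGAIN (Resource temporarily unavailable)",
    "futex(0x7f71dc000e40, FUTEX_WAKE_PRIVATE, 1) = 0",
    "futex(0x7f71dc000e6c, FUTEX_WAIT_PRIVATE, 17, NULL",
]

result = last_call_in_flight(sample)
print(result)
```

On the sample this reports the last `futex(..., FUTEX_WAIT_PRIVATE, 17, NULL` line as the pending call. Note that this only covers the main thread: the trace shows `clone()` creating threads, so rerunning with `strace -f` (which also follows threads and child processes) would likely be needed to see what the worker threads are doing while the main thread waits.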