This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Segmentation Fault #9507

Closed
jimm1973 opened this issue Jan 20, 2018 · 21 comments

Comments

@jimm1973

jimm1973 commented Jan 20, 2018

Hello everyone, I'm following this tutorial (https://mxnet.incubator.apache.org/tutorials/embedded/wine_detector.html) for my studies, but whenever I run python camera_test.py in the terminal, the output below shows up. Can someone tell me what I should do to make progress? Thanks.

pi@raspberrypi:~ $ python camera_test.py
[13:37:44] src/nnvm/legacy_json_util.cc:190: Loading symbol saved by previous version v0.8.0. Attempting to upgrade...
Segmentation fault

@larroy
Contributor

larroy commented Jan 20, 2018

Where's the segmentation fault happening?
Can you do:

ulimit -c unlimited

Then rerun the script and use the bt command in gdb on the resulting core file to get a backtrace. MXNet should also print a backtrace itself if you compile with USE_SIGNAL_HANDLER.
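
The same limit can also be raised from inside the Python script, which may help when the script is not started from an interactive shell. This is only a minimal sketch, assuming a Unix system and the standard-library resource module; the helper name enable_core_dumps is made up for illustration and is not part of MXNet:

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit to the hard limit.

    Roughly what `ulimit -c unlimited` does in the shell, except that an
    unprivileged process cannot go above the existing hard limit.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = enable_core_dumps()
    if soft == resource.RLIM_INFINITY:
        print("core dumps: unlimited")
    else:
        print("core dump limit: %d bytes" % soft)
```

Calling enable_core_dumps() at the top of the script, before the model is loaded, should have the same effect for that process as running ulimit in the shell first.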

@jimm1973
Author

jimm1973 commented Jan 21, 2018

I'll do it when I get back to my RPi3. It happens when I rename "Inception-BN_symbol.json" to "Inception_BN_symbol.json".

@jimm1973
Author

Hi @larroy, I tried ulimit -c unlimited. It worked, but the terminal now responds "Segmentation fault (core dumped)".

@larroy
Contributor

larroy commented Jan 22, 2018

Yes, do you have a core file there? Please use gdb to get a backtrace. It's a standard procedure, and you can find instructions with a quick Google search. Basically, run gdb /executable core, then use the bt command.
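
The gdb step can be scripted so the backtrace is easy to paste into an issue. The sketch below is one possible way to do it, not an official MXNet tool. It assumes gdb is installed, that the core file is named "core" in the current directory (the actual name depends on the system's core pattern), and that the crashed executable is the Python interpreter, since that is the program that loads libmxnet.so. The helper names are hypothetical.

```python
import shutil
import subprocess
import sys


def gdb_backtrace_command(executable, core_file):
    """Build a non-interactive gdb command that prints a backtrace and exits."""
    return ["gdb", "-batch", "-ex", "bt", executable, core_file]


def collect_backtrace(core_file="core", executable=sys.executable):
    """Run gdb on the core file; return its output, or None if gdb is missing.

    `executable` defaults to the interpreter running this helper, which is an
    assumption: it should be the same interpreter that produced the core file.
    """
    if shutil.which("gdb") is None:
        return None
    result = subprocess.run(
        gdb_backtrace_command(executable, core_file),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout
```

Passing the interpreter rather than the shared library as the executable should also avoid gdb's warning that the core file may not match the specified executable.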

@roywei
Member

roywei commented Feb 27, 2018

@jimm1973 Closing this issue for now. Please reopen it if the issue persists and a core file is available.
@sandeep-krishnamurthy Please add label: Pending Requester Info, and close this issue
Thanks!

@tvandergeer

I'm running into the same issue using MXNet v1.2.0 on a Raspberry Pi 3+. I made a core dump and got a backtrace:

pi@raspberrypi:~/incubator-mxnet/python $ gdb /home/pi/incubator-mxnet/python/mxnet/../../lib/libmxnet.so core
GNU gdb (Raspbian 7.12-6) 7.12.0.20161007-git
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/pi/incubator-mxnet/python/mxnet/../../lib/libmxnet.so...done.

warning: core file may not match specified executable file.
[New LWP 5251]
[New LWP 5262]
[New LWP 5255]
[New LWP 5254]
[New LWP 5265]
[New LWP 5263]
[New LWP 5256]
[New LWP 5264]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `python'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x75674978 in nnvm::CreateVariableNode(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /home/pi/incubator-mxnet/python/mxnet/../../lib/libmxnet.so
[Current thread is 1 (Thread 0x76f628c0 (LWP 5251))]
(gdb) bt
#0  0x75674978 in nnvm::CreateVariableNode(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /home/pi/incubator-mxnet/python/mxnet/../../lib/libmxnet.so
#1  0x75674a30 in nnvm::Symbol::CreateVariable(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /home/pi/incubator-mxnet/python/mxnet/../../lib/libmxnet.so
#2  0x750a2d88 in mxnet::UpgradeJSON_Parse(nnvm::Graph) () from /home/pi/incubator-mxnet/python/mxnet/../../lib/libmxnet.so
#3  0x750a8a5c in std::_Function_handler<nnvm::Graph (nnvm::Graph), nnvm::Graph (*)(nnvm::Graph)>::_M_invoke(std::_Any_data const&, nnvm::Graph&&) () from /home/pi/incubator-mxnet/python/mxnet/../../lib/libmxnet.so
#4  0x7509d29c in mxnet::LoadLegacyJSONPass(nnvm::Graph) () from /home/pi/incubator-mxnet/python/mxnet/../../lib/libmxnet.so
#5  0x750a8a5c in std::_Function_handler<nnvm::Graph (nnvm::Graph), nnvm::Graph (*)(nnvm::Graph)>::_M_invoke(std::_Any_data const&, nnvm::Graph&&) () from /home/pi/incubator-mxnet/python/mxnet/../../lib/libmxnet.so
#6  0x75672cf0 in nnvm::ApplyPasses(nnvm::Graph, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&) () from /home/pi/incubator-mxnet/python/mxnet/../../lib/libmxnet.so
#7  0x750d01f0 in nnvm::ApplyPass(nnvm::Graph, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /home/pi/incubator-mxnet/python/mxnet/../../lib/libmxnet.so
#8  0x755433a8 in MXSymbolCreateFromFile () from /home/pi/incubator-mxnet/python/mxnet/../../lib/libmxnet.so
#9  0x76935dd0 in ffi_call_VFP () from /usr/lib/arm-linux-gnueabihf/libffi.so.6
#10 0x769367ec in ffi_call () from /usr/lib/arm-linux-gnueabihf/libffi.so.6
#11 0x76954b68 in _ctypes_callproc () from /usr/lib/python2.7/lib-dynload/_ctypes.arm-linux-gnueabihf.so
Backtrace stopped: Cannot access memory at address 0x52
(gdb) 

See this post on the MXNet forum for more info. Let me know if you need more information.

@tvandergeer

I've tried on v1.3.0 (current HEAD) and v1.2.0 branch (which is actually v1.2.1) and had the same results. It breaks in the same CreateVariableNode function.

@tvandergeer

It looks like a duplicate of #9094.

@larroy
Contributor

larroy commented Jun 25, 2018

@tvandergeer how did you build? Our docker builds?

@tvandergeer

@larroy I followed this howto (select "Devices" and "Raspberry Pi"). I tried with both export USE_OPENCV = 0 and export USE_OPENCV = 1

@larroy
Contributor

larroy commented Jun 25, 2018

We also encountered this issue. Thanks for the backtrace! If you use the naive engine, I think it won't crash: https://mxnet.incubator.apache.org/faq/env_var.html

Could you confirm?

@tvandergeer

I did the following:

make clean
export MXNET_ENGINE_TYPE=NaiveEngine
make

It still segfaults with the same error.

@larroy
Contributor

larroy commented Jun 28, 2018

I will try to have a look at this next week when I have access to a Raspberry Pi. Feel free to ping this issue again if you don't see activity.

@lebeg
Contributor

lebeg commented Jun 28, 2018

Try setting the environment variable (export MXNET_ENGINE_TYPE=NaiveEngine) at runtime, not at build time.
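
To make sure the variable is actually present in the environment of the process that loads MXNet, the script can be launched through a small wrapper. This is a sketch under assumptions: the helper names are invented for illustration, and camera_test.py in the usage note stands for whatever script is being run. It only guarantees that the variable is set at runtime; it does not show that the naive engine avoids the crash.

```python
import os
import subprocess
import sys


def naive_engine_env(base=None):
    """Return a copy of the environment with MXNET_ENGINE_TYPE=NaiveEngine."""
    env = dict(os.environ if base is None else base)
    env["MXNET_ENGINE_TYPE"] = "NaiveEngine"
    return env


def run_with_naive_engine(argv):
    """Run a command with the variable set in the child's runtime environment."""
    return subprocess.run(argv, env=naive_engine_env())
```

For example, run_with_naive_engine([sys.executable, "camera_test.py"]) starts the script with the variable set, regardless of what was exported in the shell that ran make.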

@tvandergeer

The environment variable was still active when I ran the code.

@mdenna-nviso

Hi all,

I'm having exactly the same issue. I tried on both a Raspberry Pi 3 and a Raspberry Pi 3+.
I compiled the latest MXNet using arm-linux-gnueabihf-gcc and ran "export MXNET_ENGINE_TYPE=NaiveEngine" at runtime, as suggested above in this thread.
I get the same backtrace reported by tvandergeer.

Any suggestions?
Thanks

@piyushghai
Contributor

@lebeg @larroy Bouncing this ...

@MyraBaba

MyraBaba commented Jan 7, 2019

Is there any progress on this segfault?

@larroy
Contributor

larroy commented Jan 15, 2019

Can you post a self-contained example for reproduction? I have used qemu as described in:

https://github.com/apache/incubator-mxnet/blob/master/ci/README.md

And for me it seems to work, apart from some difficulties with cv2:

>>> from inception_predict2 import *


>>> predict_from_url("https://i.imgur.com/HzafyBA.jpg") 
pre-processed image in 0.20366191864
forward pass in 63.2164611816
probability=0.718524, class=n02403003 ox
probability=0.176381, class=n02389026 sorrel
probability=0.095558, class=n03868242 oxcart
probability=0.002765, class=n02408429 water buffalo, water ox, Asiatic buffalo, Bubalus bubalis
probability=0.001262, class=n03935335 piggy bank, penny bank
[(0.71852392, 'n02403003 ox'), (0.17638102, 'n02389026 sorrel'), (0.09555836, 'n03868242 oxcart'), (0.0027645244, 'n02408429 water buffalo, water ox, Asiatic buffalo, Bubalus bubalis'), (0.0012616422, 'n03935335 piggy bank, penny bank')]
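
For anyone who still sees the crash, a self-contained reproduction could be as small as loading the symbol file in a child process and reporting how that process ended. The sketch below assumes Python 3 on a POSIX system. The path in REPRO_SNIPPET is a placeholder, not a file name from this thread, and the helper names are hypothetical. As far as I know, mx.sym.load goes through MXSymbolCreateFromFile, which is the entry point that appears in the backtrace reported earlier in this thread.

```python
import signal
import subprocess
import sys

# Hypothetical reproduction: only load the symbol file, nothing else.
# Replace the placeholder path with the model's symbol JSON file.
REPRO_SNIPPET = """
import mxnet as mx
mx.sym.load("path/to/model-symbol.json")
print("symbol loaded")
"""


def run_isolated(code):
    """Run a Python snippet in a child process and return its exit status."""
    return subprocess.run([sys.executable, "-c", code]).returncode


def describe(returncode):
    """Translate a subprocess return code into a short status string."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative value means the child was killed by a signal.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


if __name__ == "__main__":
    print(describe(run_isolated(REPRO_SNIPPET)))
```

Running the load in a child process keeps the parent alive, so the result can be reported even when the child dies with a segmentation fault.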

@larroy
Contributor

larroy commented Jan 15, 2019

#13886

@larroy
Contributor

larroy commented Jul 17, 2019

Let's close this. I suggest reopening it with a backtrace that includes debug symbols if anyone still experiences difficulties.

@yuxihu closed this as completed Jul 17, 2019