zmq::encoder_base_t segmentation fault #2674
Comments
I don't think there are relevant changes since 4.2.1. Any chance you could build with debug symbols, run in gdb, and try to print the state? At the very least, which reads or writes are causing the segmentation violation?
Yes. I'll do that.
And there are no "debug" modes I'm afraid. Also the usual:
From the line number, the culprit is either the buffer it's writing into (unlikely, since that is allocated in the same class) or the one it's reading from.
This time it crashed with SIGABRT at zmq::tcp_write after 6.9h and 3,038,554 messages. This is some of the state information from gdb:
What else from gdb might be helpful? I'll leave it open.
Here is
@bluca
Here is a
This too points to memory corruption, from the manpage of send:
It would be useful to print the value of size_ and data_ from the zmq::tcp_write frame.
@bluca -
Could you try to just go up in the backtrace? If you are at the abort
I did that (up) but the variables were optimized out. When I attempted to look at the assembler (-) to get the values, the stack frame was lost. So I started to look for a way to add some debug output and may have found and fixed the issue in my code where it calls
You can also try to build without optimisations and with extra debugs (
Thanks a million! I'll do that next if the issue is not resolved.
Still crashing. Built with no optimizations and extra debug but still see
That's a basic sanity check on the message object, which is failing: https://github.com/zeromq/libzmq/blob/v4.2.1/src/msg.cpp#L51 All of this still points to memory corruption. I would suggest running the application through valgrind, or compiling it with GCC's address sanitizer, to check for buffer overflows and similar errors.
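To illustrate why a failing sanity check of this kind suggests memory corruption, here is a minimal sketch. It is not libzmq's code: the `Message` struct, the `kTypeMin`/`kTypeMax` bounds, and the function names are all hypothetical. The idea is that an object carries a tag that is only ever set to a small range of values, so a tag outside that range means something else wrote over the object.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical tag range; a correctly initialised object always
// carries a tag inside [kTypeMin, kTypeMax].
enum : unsigned char { kTypeMin = 101, kTypeMax = 105 };

// Hypothetical message layout: some payload bytes plus a type tag.
struct Message {
    unsigned char payload[32];
    unsigned char type;
};

// Initialise the message to a known-good state.
inline void init_message(Message &m) {
    std::memset(m.payload, 0, sizeof m.payload);
    m.type = kTypeMin;
}

// The sanity check: it can only fail if the tag was overwritten
// or the object was never initialised.
inline bool looks_valid(const Message &m) {
    return m.type >= kTypeMin && m.type <= kTypeMax;
}
```

A stray write that lands on the object (for example from a buffer overflow or a use-after-free elsewhere in the process) will usually leave the tag outside the valid range, which is why this kind of assertion tends to fire far from the code that actually caused the damage.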
Just in case, did you rule out physical memory corruption, e.g. old DRAM chips, power line (or PSU) noise, bad contacts, etc.? Is the RAM ECC? Do any other processes or subsystems, like the FS cache, behave oddly?
@bluca I'll see what GCC's address sanitizer and valgrind report. I've run valgrind on this code before, but not recently, and there have been many changes since. @jimklimov - this is a fairly new PC but I will try it on another.
Made some changes based on valgrind that may have resolved this issue. I'll call it fixed if it survives the weekend. There was a smart pointer to a struct containing a smart pointer to a buffer in a map, which seemed to be the issue when erasing a range of map entries. Thanks a million for the assistance!
Great, fingers crossed!
Segmentation fault at 41.7h. Made additional changes based on valgrind and running it again. I'm closing this issue since the trouble is most likely in my code.
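The actual code is not shown in this thread, so the following is only a sketch of the pattern described, under assumed types: `Entry`, `EntryMap`, `erase_below`, and `pin_buffer` are hypothetical names. Erasing a range from a `std::map` is safe in itself; the hazard is that any raw pointer into a buffer owned only by an erased entry dangles afterwards. Copying the inner `std::shared_ptr` before handing the buffer to other code keeps it alive across the erase.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>
#include <memory>
#include <vector>

// Hypothetical types; the real ones are not shown in the thread.
struct Entry {
    std::shared_ptr<std::vector<char>> buffer;
};

using EntryMap = std::map<std::uint64_t, std::shared_ptr<Entry>>;

// Erase every entry whose key is below `limit` and return how many
// were removed. Buffers owned only by those entries are freed here.
inline std::size_t erase_below(EntryMap &m, std::uint64_t limit) {
    auto last = m.lower_bound(limit);
    auto removed = static_cast<std::size_t>(std::distance(m.begin(), last));
    m.erase(m.begin(), last);
    return removed;
}

// Take shared ownership of an entry's buffer so that it outlives a
// later erase. Returns nullptr if the key is missing.
inline std::shared_ptr<std::vector<char>> pin_buffer(const EntryMap &m,
                                                     std::uint64_t key) {
    auto it = m.find(key);
    if (it == m.end() || !it->second) {
        return nullptr;
    }
    return it->second->buffer;  // copying the shared_ptr bumps the refcount
}
```

Code that still needs the bytes after the erase (for example, a send that has not completed) should hold the pointer returned by `pin_buffer` rather than a raw `data()` pointer taken from the map entry.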
Got a segmentation fault in zmq::encoder_base_t at encoder.hpp:127 after 25.9 hours of operation and 11,393,496 messages. I probably will not be able to provide a minimal reproducible example. Is there a debug or logging mode I can run zmq in to provide more information? Has anything been fixed in the code base between libzmq 4.2.1 and 4.2.2 that might address this? I didn't see anything that looked related in the release notes.
Environment
Coredump backtrace: