Socket monitor hangs if zmq_bind or zmq_setsockopt is failed in ::monitor #1315
Comments
Could be related to #1279
Happened again. The scenario is disconnecting and reconnecting REQ-REP. The monitor was stopped by calling the socket monitor with a NULL address; after that we called zmq_socket_monitor with a valid address, and again the bind failed and the call hung.
Looking for a workaround by removing the subscription to the monitor stopped event...
I am talking to myself :)
What error did you get from
It is not possible to check without modifying the internals, but I think so: rc == -1 and we go on to stop the monitor.
The reason could be the following. ZeroMQ closes connections asynchronously in a separate thread, so closing a connection and immediately binding to the same address can fail because the address is still in use. Try adding a small sleep after closing the monitor; it will probably help.
Thanks, I've done this already and will be writing a unit test tomorrow. Anyway, once the program is trapped in wait(), it cannot continue. I am also wondering why the send timeout was -1 in my case while sending the monitor event, so that it hung. I checked our application: it sets the send timeout to 10 seconds via setsockopt. I hope to work around this issue for myself by adding a sleep...
So I ran into this issue and did some digging. It turns out you can reliably reproduce it simply by reusing the same monitor address twice.
The actual hang is caused by stop_monitor trying to send an event on a socket for which bind has failed. An easy solution would be to simply not send an event when we fail to create or bind the monitor socket. I have a patch; I'll follow up with a pull request.
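The guard proposed above can be illustrated with a small toy model. This is not libzmq source: `ToyRegistry`, `ToyMonitor`, the `guard` flag and the `"MONITOR_STOPPED"` string are all hypothetical names invented for the sketch, and a send that would block forever is only counted rather than actually blocking. It assumes the failing path is "bind fails, then stop_monitor sends an event", as described in this thread.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy model only: none of these types are libzmq's. They mimic the control
// flow described in the thread so the guard can be seen in isolation.
struct ToyRegistry {
    std::set<std::string> bound;  // addresses currently in use

    // Returns 0 on success, -1 if the address is already bound.
    int bind(const std::string &addr) {
        return bound.insert(addr).second ? 0 : -1;
    }
    void unbind(const std::string &addr) { bound.erase(addr); }
};

struct ToyMonitor {
    ToyRegistry &registry;
    std::string addr;               // empty while no monitor socket is bound
    std::vector<std::string> sent;  // events delivered on the monitor socket
    int blocked_sends = 0;          // sends that would have waited forever

    explicit ToyMonitor(ToyRegistry &r) : registry(r) {}

    // Sending on a socket with no bound endpoint has nowhere to deliver to;
    // the model counts such a send instead of blocking.
    void send_event(const std::string &event) {
        if (addr.empty())
            ++blocked_sends;
        else
            sent.push_back(event);
    }

    // guard == false models the behaviour reported in the thread,
    // guard == true models the proposed fix.
    void stop_monitor(bool guard) {
        if (!guard || !addr.empty())
            send_event("MONITOR_STOPPED");
        if (!addr.empty()) {
            registry.unbind(addr);
            addr.clear();
        }
    }

    int monitor(const std::string &new_addr, bool guard) {
        assert(!new_addr.empty());
        if (registry.bind(new_addr) != 0) {
            stop_monitor(guard);  // error path: bind failed
            return -1;
        }
        addr = new_addr;
        return 0;
    }
};
```

In this model, a second `ToyMonitor` reusing an address that is already bound gets -1 from `monitor` in both modes, but only the unguarded mode records a blocked send; with the guard, the error path returns without touching the unbound socket.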
Hello,
Version: 4.0.4
RHEL: 6.1
Scenario:
Analysis:
At least, I could not find another explanation for why the user thread hangs. I apologize if the idea is wrong.
Stack:
Thread 1 (Thread 0x7f80d2e23740 (LWP 15523)):
#0 0x00007f80c82e5053 in poll () from /lib64/libc.so.6
#1 0x00007f80cb341056 in zmq::signaler_t::wait (this=0x1f6816360, timeout_=-1) at signaler.cpp:173
#2 0x00007f80cb33217e in zmq::mailbox_t::recv (this=0x1f6816300, cmd_=0x7fff786b5e20, timeout_=-1) at mailbox.cpp:72
#3 0x00007f80cb341954 in zmq::socket_base_t::process_commands (this=0x1f6816000, timeout_=, throttle_=false) at socket_base.cpp:872
#4 0x00007f80cb341d80 in zmq::socket_base_t::send (this=0x1f6816000, msg_=0x7fff786b5ed0, flags_=) at socket_base.cpp:724
#5 0x00007f80cb35575a in s_sendmsg (s_=0x1f6816000, msg_=0x7fff786b5ed0, flags_=2) at zmq.cpp:350
#6 0x00007f80cb3412d8 in zmq::socket_base_t::monitor_event (this=0x8013700, event_=, addr_="") at socket_base.cpp:1249
#7 0x00007f80cb34366a in zmq::socket_base_t::stop_monitor (this=0x8013700) at socket_base.cpp:1265
#8 0x00007f80cb343886 in zmq::socket_base_t::monitor (this=0x8013700, addr_=0x1f537b0a8 "inproc://monitor.sock.KROMBERG_RTA2ZMQ_REQ0", events_=2047) at socket_base.cpp:1133
Aleksei.