Drone blocks, when stream statistics is switched off in port configuration #234
Comments
@Ehlers Not able to repro this on Ubuntu or Windows. Busy cursor means Ostinato didn't get a response from Drone. Can you check what is happening on the Drone side? See if you can run |
Quite unfortunate that you can't reproduce it. As debugging this on my side will be very time-consuming for both of us, I suggest closing this issue. Maybe it's just my environment that is unusual, and I currently can't spend a lot of time on it. |
No problem. Though, if you could get me the drone side log that may help. |
No problem, see attached file. I only started Ostinato, enabled stream statistics, then disabled it. |
Two observations:
|
Well, I have compiled the drone with
|
That BT is most likely of the main thread which listens for incoming connections. The actual processing for a client/controller is done in a different thread. It is the latter that I suspect is getting stuck somewhere. Now the problem is that it might be difficult to isolate the specific thread in question because there are multiple threads in Drone. What might be easier is to put a breakpoint on The problem is somewhere in that function only (or some function called by that function) Thanks a bunch for your help! |
Here are my two GDB logs; the drone hangs in PcapRxStats::stop(). As I wrote previously, there is some (light) traffic coming into that interface. |
I think I found the reason for the blocking. I've added a debug message to PcapRxStats::run() to show me when (and with which status) pcap_next_ex returns. With libpcap v1.8.1-3 from Debian stretch, it returns only when at least one packet is available. It never returns with status 0 (timeout). On a normal pcap capture this is also an issue (see #215), but normally some packets are always floating around, so it just takes a while until one is received and the capture can be stopped. But with the capture filter in PcapRxStats::run(), it returns only when a statistics packet is received, which is not the case in my test setup. Therefore pcap_next_ex will never return. Here is my change to print the pcap_next_ex debug output:
|
I had the idea that the stop function could call some pcap function to unblock the pcap receive thread. But https://www.tcpdump.org/manpages/pcap_breakloop.3pcap.txt states:
|
Just tested with libpcap v1.6.2; that works fine, no blocking. For me this is a good workaround. Update: libpcap v1.7.4 works as well, so only the newest v1.8 seems problematic. |
@Ehlers Thank you so much for the detailed debugging and investigation. Yes, the workaround of using an older version of libpcap should suffice for now. But I do need to find a way to fix this for the future - the rub is to find a solution that works across Linux, Mac and Windows as I said in #215. Now that 0.9 is out, I'll put on my thinking cap to find a way. Thanks once again! |
On some platforms and/or some libpcap versions, libpcap doesn't support a timeout, which makes an interactive stop impossible. So we now use a UNIX signal to break out. Obviously this works only on *nix platforms, which includes macOS. For now the problem is not seen on Windows with WinPcap, so we should be fine. May need to revisit when we add Npcap support. Fixes #215, #234
Fixed by 64d1525 |
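For readers who want to see the general idea behind "use a UNIX signal to break out", here is a minimal sketch. It is not the code from the commit: all names (`stopRequested`, `rxLoop`, `stopBlockedReader`) are hypothetical, and a `read()` on an empty pipe stands in for a capture call that never times out. It assumes a POSIX platform and that SIGUSR1 is free to use.

```cpp
#include <atomic>
#include <cerrno>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

// Hypothetical names throughout; this is an illustration, not Ostinato's code.
static std::atomic<bool> stopRequested{false};
static std::atomic<bool> loopExited{false};

// Empty handler: its only job is to make the blocked call fail with EINTR.
static void onBreakSignal(int) {}

// Stand-in for the capture thread: read() on an empty pipe blocks
// indefinitely, like a capture call that never times out.
static void *rxLoop(void *arg)
{
    int fd = *static_cast<int *>(arg);
    char buf[64];
    while (!stopRequested.load()) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n < 0 && errno == EINTR)
            continue;           // interrupted by the signal: re-check the flag
        if (n <= 0)
            break;              // EOF or a real error
    }
    loopExited.store(true);
    return nullptr;
}

// Starts the blocked reader, then stops it. Returns true once the thread
// has been joined successfully.
bool stopBlockedReader()
{
    struct sigaction sa = {};
    sa.sa_handler = onBreakSignal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            // no SA_RESTART, so read() returns EINTR
    if (sigaction(SIGUSR1, &sa, nullptr) != 0)
        return false;

    int fds[2];
    if (pipe(fds) != 0)
        return false;

    stopRequested.store(false);
    loopExited.store(false);
    pthread_t tid;
    if (pthread_create(&tid, nullptr, rxLoop, &fds[0]) != 0)
        return false;

    usleep(50 * 1000);          // give the thread time to block in read()
    stopRequested.store(true);
    // Re-send until the loop exits, in case a signal lands between the
    // flag check and the read() call.
    while (!loopExited.load()) {
        pthread_kill(tid, SIGUSR1);
        usleep(1000);
    }
    int rc = pthread_join(tid, nullptr);
    close(fds[0]);
    close(fds[1]);
    return rc == 0;
}
```

The signal is sent to the specific thread with `pthread_kill`, so other threads are not interrupted. Whether a given libpcap capture call actually returns on EINTR depends on the platform and libpcap version, so treat this as a sketch of the mechanism only.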
Version 0.9 revision f32c50e on Debian 9.2 (Stretch)
Simple test:
Enabling works fine, but disabling port statistics leads to a blocked GUI (busy cursor remains active). The drone doesn't use CPU time in that situation, so it is not in an infinite loop.
The output during disabling of the stream statistics: