ARV_BUFFER_STATUS_TIMEOUT #25

Open
MaxBial opened this issue May 2, 2023 · 10 comments

Comments

@MaxBial

MaxBial commented May 2, 2023

I am using this GitHub repo with my "The Imaging Source" GigE cameras (DFK 33 GX249e, 2.3 MP, 48 fps).
I am getting:
[ WARN ] Frame error: ARV_BUFFER_STATUS_TIMEOUT

This happens when using 1 camera at 10 fps.
When using multiple cameras, I additionally get afterwards:
[ ERROR ] Control to aravis device lost.

When I decrease the frame rate to around 2-4 fps, everything runs fine.
I assume that only some minor adjustments to the settings/parameters in the code are needed.
I think I just have to increase or decrease the "Timeout-time" or "Buffer-size" parameter.
Maybe somebody has already faced this issue and could tell me where in the code, or how, to change those settings, if this is the issue.

Thanks

Best Regards

@Janphr

Janphr commented May 9, 2023

Hey,
I'm facing the same problem. Did you solve it?
Thanks!

@tonyromarock
Contributor

tonyromarock commented May 23, 2023

Hi @MaxBial,

did you rule out that this might be a bandwidth issue on your network?
We usually increase the MTU size to 9000 on the network card port (jumbo packets) to avoid dropped frames.

How many cameras are you trying to use in parallel?
It might be necessary to upgrade to a 10 Gbit Ethernet switch and a 10 Gbit network card if all camera streams arrive at your host PC over the same network card port.
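
A rough back-of-the-envelope estimate can help with the bandwidth question. The sketch below assumes 1 byte per pixel (e.g. an 8-bit Bayer or mono format, which is an assumption and not stated in the thread) and ignores packet header overhead. `stream_mbit_per_s` is a hypothetical helper, not part of camera_aravis.

```python
def stream_mbit_per_s(megapixels, fps, bytes_per_pixel=1.0):
    """Approximate payload bandwidth of one camera stream in Mbit/s.

    Assumption: Ethernet/IP/UDP/GVSP headers are ignored, so the real
    figure on the wire is somewhat higher.
    """
    bits_per_frame = megapixels * 1e6 * bytes_per_pixel * 8
    return bits_per_frame * fps / 1e6


# Numbers from the original post: 2.3 MP sensor, 10 fps vs. the 48 fps maximum.
for fps in (10, 48):
    print(f"{fps} fps: ~{stream_mbit_per_s(2.3, fps):.0f} Mbit/s per camera")
```

Under these assumptions a single camera at 10 fps needs about 184 Mbit/s, well below a 1 Gbit link, while 48 fps needs roughly 883 Mbit/s. A wider pixel format (for example 3 bytes per pixel) triples these figures, and several cameras on one port add up.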

In my experience, we didn't have much success in improving our throughput with the timeout-time and buffer-size parameters.
You could check whether giving each camera a different packet-delay improves things when using multiple cameras.
This is relevant if all cameras are triggered at the same time.

Let me know if that helped.

@boitumeloruf
Collaborator

Hi @MaxBial,

when do you get the '[ ERROR ] Control to aravis device lost'?

We recently encountered the same error. But in our case, this happens before any subscriber attaches to the node and, thus, before any large chunks of data are sent over the network. We have tried increasing the Timeout-Time, with no success. We haven't resolved the error yet, but we assume that it might have something to do with the routing, as the cameras are currently attached to a subnet with a Mesh-Wifi.

@bmegli

bmegli commented Jul 25, 2023

@MaxBial

[ WARN ] Frame error: ARV_BUFFER_STATUS_TIMEOUT

I am also flooded by those in the following scenario:

  • the transport layer (Ethernet) MTU was changed (e.g. to 9000)
  • the camera_aravis mtu was not changed
    • this effect can be observed with dynamic reconfigure of mtu

So if you change one of them, make sure you also change the other.
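
Following that advice, a small pre-launch check could compare the two values. This is a minimal sketch assuming a Linux host (the interface MTU is read from sysfs) and that the MTU configured for the driver is known to the caller, e.g. from the launch file. Both helper names are hypothetical, and the numbers are example values only.

```python
from pathlib import Path


def read_interface_mtu(interface, sysfs_root="/sys/class/net"):
    """Read the MTU of a Linux network interface from sysfs."""
    return int((Path(sysfs_root) / interface / "mtu").read_text().strip())


def mtu_mismatch(interface_mtu, driver_mtu):
    """Return a warning string if the two MTU settings differ, else None."""
    if interface_mtu != driver_mtu:
        return (f"interface MTU is {interface_mtu} but the driver is "
                f"configured for {driver_mtu}; change both together")
    return None


# Example with literal values: link raised to 9000, driver left at 1500.
print(mtu_mismatch(9000, 1500))
print(mtu_mismatch(9000, 9000))
```

On a real system the first argument would come from `read_interface_mtu("<your interface>")` instead of a literal.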


FWIW, I am on a different fork, but the code that is potentially causing those problems is shared.


"Maybe somebody already faced with issue and could tell where in the code or how to change those settings, if this is the issue." (@MaxBial)

"(DFK 33 GX249e - 2,3MP and 48 fps)." (@MaxBial)

Looking at the implementation of camera_aravis:

  • if processing a buffer takes too much time, you will get timeouts

Processing time is mainly:

  • pixel format conversion from GigE Vision/GenICam to ROS
  • ROS publishing
    • if you are pulling data over a compressed image_transport, the compression time adds up here

To diagnose, you may:

  • pull data at image_raw if you are using a compressed transport
  • comment out the pixel format conversion and check if it improves fps
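
One way to quantify this is to time the per-frame processing and compare it with the frame period (1/fps). The sketch below is generic Python, not camera_aravis code. `fits_frame_budget` and `fake_conversion` are hypothetical names, and the conversion is a stand-in that merely copies a buffer.

```python
import time


def fits_frame_budget(process_frame, frame, fps, repeats=20):
    """Time process_frame(frame) and compare it with the frame period.

    Returns (mean_seconds, budget_seconds, fits).
    """
    budget = 1.0 / fps
    start = time.perf_counter()
    for _ in range(repeats):
        process_frame(frame)
    mean = (time.perf_counter() - start) / repeats
    return mean, budget, mean < budget


def fake_conversion(frame):
    """Stand-in for pixel format conversion: copies the buffer byte by byte."""
    return bytes(b for b in frame)


frame = bytes(200_000)  # hypothetical small frame so the example runs quickly
mean, budget, fits = fits_frame_budget(fake_conversion, frame, fps=10)
print(f"mean {mean * 1e3:.2f} ms, budget {budget * 1e3:.0f} ms, fits: {fits}")
```

If the mean processing time approaches or exceeds the budget at the desired frame rate, timeouts of the kind described above are to be expected.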

@bmegli

bmegli commented Jul 27, 2023

The problem was diagnosed here:

At this point the cause is identified, but no solution is implemented yet.

@MaxBial
Author

MaxBial commented Jul 27, 2023

Back in the day I tried changing the "timeout-time" parameter in the code, along with the 2-3 other parameters at the same position in the code. Unfortunately it had only a small impact. For this reason I had to switch to a different camera with a different ROS driver implementation.

The things that I noticed when installing the requirements for the new camera were:
to set "Receive Thread Priority Override"
and
"Increase the ring buffer size using the ethtool command."
and
"Configure the interrupt moderation rate using the ethtool command."

These things were not explicitly required for the camera_aravis driver, but I guess maybe they should be?

Best Regards
Max

@bmegli

bmegli commented Jul 28, 2023

Transport-level optimizations and larger buffers help, but here the problem was that processing the received data blocked network communication from running.
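
The general idea of keeping reception running while frames are processed can be illustrated with a bounded queue between a receive loop and a worker thread. This is a generic Python sketch of that pattern under stated assumptions, not the fork's actual implementation. All names are hypothetical, and dropping frames when the worker falls behind is just one possible policy.

```python
import queue
import threading


def run_pipeline(frames, process, max_pending=4):
    """Receive frames on the calling thread, process them on a worker.

    If the worker falls behind, new frames are dropped instead of
    blocking the receive loop. Returns (results, dropped_count).
    """
    pending = queue.Queue(maxsize=max_pending)
    results = []

    def worker():
        while True:
            frame = pending.get()
            if frame is None:  # sentinel: no more frames
                break
            results.append(process(frame))

    thread = threading.Thread(target=worker)
    thread.start()

    dropped = 0
    for frame in frames:  # stands in for the network receive loop
        try:
            pending.put_nowait(frame)
        except queue.Full:
            dropped += 1  # keep receiving; never wait on processing

    pending.put(None)  # a blocking put is fine once reception is done
    thread.join()
    return results, dropped


results, dropped = run_pipeline(range(10), lambda f: f * 2, max_pending=16)
print(len(results), "processed,", dropped, "dropped")
```

The receive loop never waits for `process`, so slow conversion or publishing costs dropped frames rather than stalled reception.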

Solved in a fork today:


This fork is not easy to build on Ubuntu 20.04 + Noetic:

  • it depends on the latest aravis (a slightly different API)
    • the latest aravis doesn't build trivially from source on 20.04
      • as it depends on a recent Meson build system
        • which is not packaged in 20.04

But we released a packaged version of the latest aravis and camera_aravis for Noetic:


As a side note, the above fork has some breaking changes compared to this upstream.
I am not sure everything will work the same.

And finally, our fork will become stale soon, as I have implemented almost everything we needed.

Some last optimizations of pixel format conversions that we need are still pending.

@mersadsh

mersadsh commented Feb 2, 2024

Hi
I also have the same problem: "[ WARN] (cam1) Frame error: ARV_BUFFER_STATUS_TIMEOUT".
With an MTU of 2000 and a resolution of 2736x1824, I get images in RViz, but if I increase the resolution (e.g. to 5472x3648), it throws the buffer warning and there is no image.

Things that I have tried so far:

  • Increasing the MTU, both with "sudo ip link set dev enp0s31f6 mtu 9000" and with "GevSCPSPacketSize" in the launch file.
  • Increasing the socket buffer limits:
    sudo sysctl -w net.core.rmem_max=33554432
    sudo sysctl -w net.core.wmem_max=33554432
    sudo sysctl -w net.core.rmem_default=33554432
    sudo sysctl -w net.core.wmem_default=33554432

Also, if I increase the MTU above 2000, the 2736x1824 resolution doesn't work anymore either.
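
To put the numbers in this report side by side, the sketch below computes the frame size and the number of stream packets per frame for both resolutions and both packet sizes. It assumes 1 byte per pixel (the pixel format is not stated in the thread) and 36 bytes of IPv4 + UDP + GVSP header per packet. `frame_stats` is a hypothetical helper.

```python
import math


def frame_stats(width, height, packet_size, bytes_per_pixel=1, header_bytes=36):
    """Return (frame_bytes, packets_per_frame) for one image.

    Assumption: header_bytes = 20 (IPv4) + 8 (UDP) + 8 (GVSP) per packet.
    """
    frame_bytes = width * height * bytes_per_pixel
    payload = packet_size - header_bytes
    return frame_bytes, math.ceil(frame_bytes / payload)


for width, height in ((2736, 1824), (5472, 3648)):
    for packet_size in (2000, 9000):
        size, packets = frame_stats(width, height, packet_size)
        print(f"{width}x{height} @ packet size {packet_size}: "
              f"{size / 1e6:.1f} MB, {packets} packets per frame")
```

Under these assumptions the full-resolution frame is about 20 MB, four times the size of the smaller one, so the 33554432-byte (32 MiB) limit set above corresponds to fewer than two such frames.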

Thanks for your help in advance.

@tonyromarock
Contributor

Hi @mersadsh, did you try this using the fork mentioned by @bmegli?

Extend-Robotics#12

I can imagine this being an issue with the received data not being processed fast enough.

@mersadsh

mersadsh commented Feb 6, 2024

Yes, I also tried it with that fork.
I tried decreasing the FPS and still get the same warning for resolutions higher than 2736x1824.
