Some routes are not sent to peers when commands are sent too quick #736
Comments
Are you flushing STDOUT between each command? If not, this is likely the reason, and you need to do it. I should update the documentation to make this a requirement.
@thomas-mangin, I can see the route is added in store.py: insert_announced. Both dict_nlri and dict_sorted have the change. However, after the line below in the store.py:updates method, some routes are missing. I am trying to identify where they are getting dropped. Is any change made to dict_sorted after routes are inserted?
That's bad... I cannot look into it right now, but I will.
Thanks, @thomas-mangin. By the way, I am only using the IPv4 VPN unicast family.
Are all the updates announcements, or do you have some withdrawals?
This is happening during bootstrap, so there are only announcements.
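For readers unsure what flushing between commands looks like in practice, here is a minimal sketch of a child process that writes one command per line and flushes after each. The helper names `send_command` and `announce_all` and the optional `stream` parameter are assumptions made for illustration; the `announce route ... next-hop ...` text is only a simple example command, not the VPN syntax used in this report.

```python
import sys


def send_command(command, stream=None):
    """Write one command line and flush it immediately.

    `stream` defaults to sys.stdout, which is where a child process
    is expected to write its commands. The parameter exists only so
    the sketch can be exercised against an in-memory buffer.
    """
    stream = sys.stdout if stream is None else stream
    stream.write(command + "\n")
    # Without this flush the line may sit in the process's output
    # buffer instead of reaching the parent right away.
    stream.flush()


def announce_all(prefixes, next_hop, stream=None):
    """Send one announce command per prefix; return how many were sent."""
    count = 0
    for prefix in prefixes:
        send_command(f"announce route {prefix} next-hop {next_hop}", stream)
        count += 1
    return count
```

Flushing per command trades a little throughput for predictable delivery. As the rest of this thread shows, it did not turn out to be the cause of the dropped routes here.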
I found the issue. Incoming commands are added to RIB._modify_sorted through the insert_announced method. A worker takes all the routes from RIB._modify_sorted, constructs the UPDATE messages, and sends them. Because we push commands during bootstrap faster than the worker can construct the messages, the UPDATE messages leave out the commands that were added after the enumeration over dict_sorted.items() started. However, RIB._modify_sorted is cleared at the end of the enumeration, so the routes added after the enumeration started are dropped and never announced to the peers.
I have a fix that addresses this issue: I moved the clearing of RIB._modify_sorted to just after the reference is taken in dict_sorted. I will send out the pull request soon.
2017-10-20_22:13:52.12815 Fri, 20 Oct 2017 22:13:52 | INFO | 5803 | store | store:updates clearing _modify_nlri:290, _modify_sorted:11, entries in modified_sorted: 290
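The race described above can be reproduced in a few lines. The sketch below is a hypothetical, heavily simplified stand-in for the store, not ExaBGP's actual code: `SketchStore`, `updates_buggy`, `updates_fixed`, and `drain_with_late_insert` are names invented for illustration, and a generator is used so that an insert can be interleaved deterministically in the middle of the enumeration.

```python
class SketchStore:
    """Hypothetical, simplified stand-in for the pending-change store."""

    def __init__(self):
        self._modify_sorted = {}

    def insert_announced(self, key, route):
        self._modify_sorted[key] = route

    def updates_buggy(self):
        dict_sorted = self._modify_sorted
        # list() mimics Python 2, where dict.items() returned a snapshot.
        for _key, route in list(dict_sorted.items()):
            yield route
        # Cleared only after the enumeration: anything inserted while
        # the loop was running is discarded without being yielded.
        self._modify_sorted = {}

    def updates_fixed(self):
        dict_sorted = self._modify_sorted
        # Swap in a fresh dict right after taking the reference, so
        # late inserts land in the new dict and wait for the next pass.
        self._modify_sorted = {}
        for _key, route in list(dict_sorted.items()):
            yield route


def drain_with_late_insert(store, updates):
    """Drain `updates()` while a new route arrives mid-enumeration."""
    store.insert_announced("a", "route-a")
    sent = []
    for route in updates():
        sent.append(route)
        # Simulates a command arriving while the worker is still busy.
        store.insert_announced("late", "route-late")
    return sent, dict(store._modify_sorted)


buggy = SketchStore()
print(drain_with_late_insert(buggy, buggy.updates_buggy))
fixed = SketchStore()
print(drain_with_late_insert(fixed, fixed.updates_fixed))
```

With the buggy variant, the late route is neither sent nor left pending. With the fixed variant, it stays in `_modify_sorted` and would be picked up by the next call.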
@thomas-mangin, please take a look at #737.
Thank you very much, @ravikumar727, for tracing and fixing this bug. I checked master, and it seems the bug was fixed there during the large rewrite the RIB code got (a totally unexpected side effect).
Fixed with #737. Thank you.
@thomas-mangin, may I know if a version can be released with the fix? We need to patch our system with it.
Uploaded 3.4.21 to GitHub and PyPI.
ISSUE TYPE
OS
VERSION
ENVIRONMENT
CONFIGURATION
SUMMARY
We have an agent that runs as an ExaBGP child process. When the process starts, it bootstraps by fetching the route details from the route provider service. The agent receives around 15000 routes in 1-2 seconds and pushes them to ExaBGP as commands. I am observing that around 300 of the 15000 routes are not sent to the BGP peers.
I have looked at the RIB, and it has all the routes. I have instrumented the code in store.py::updates to check whether the update groups are missing those routes, and I couldn't find them missing there. With tcpdump, I have confirmed that these routes never leave the ExaBGP node on the wire.
When I add a 50-millisecond sleep between commands from the agent, ExaBGP handles them without dropping any routes.
May I know where else these routes could be dropped? Any pointers to the module that could potentially miss these routes would be great.
STEPS TO REPRODUCE
EXPECTED RESULTS
All the routes announced to ExaBGP should be announced to the BGP peers.
ACTUAL RESULTS
A few routes are not announced to the BGP peers.
IMPORTANCE