-
Notifications
You must be signed in to change notification settings - Fork 707
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
evpn: scope mac-mobility handling to MAC-VRF of the route #2805
Conversation
6379ff8
to
973bc58
Compare
Looks like the evpn test failed. |
@fujita indeed. Since you're here I would like for your input on the implementation. I am not sure about the behavior when one of the RTs of the route is for an IP-VRF and not just a MAC-VRF (i.e. a route when doing type-5 routing for e.g. a router's mac). I will fix the test, thanks! |
973bc58
to
c438a2c
Compare
@fujita the evpn tests have been fixed. there was indeed a logic error in my patch due to some place I missed accessing the destinations directly (TableManager directly writing to Table's Destinations without going through table, which is bad isolation). I moved accesses to Destination to Table itself. The graceful restart tests do fail which I don't understand since I did not touch this part of the code, especially since it's on an address family not affected by the patch (ipv4/6 unicast vs evpn). Those appear to be flaky since when running them locally they don't fail more often than they do. But they still do occasionally. |
This test ensures that a MAC can be announced by different peers as long as it is in different MAC-VRFs, not applying MAC-mobility algorithms. It does so by adding two additional VRFs for the new route to land in, and by cleaning up a bit the RDs and RTs used by the existing tests.
The mac mobility code used lookup tables to find the routes to the MAC address. It only considered the MAC itself, and not the associated MAC-VRF. This led to the complexity still being way too high when one MAC is present in all MAC-VRFs, like a virtual router or an anycast gateway. This patch now considers the MAC-VRF for a route by looking up its route targets, and considers them all if multiple are present. This way, the lookup for routes that will participate in mac-mobility will be limited to the MAC-VRF. This allows to not blow up the CPU when a MAC address is present in a lot of MAC-VRFs. The biggest behavioral change is now how mac-mobility is sharded. For example, it can now have the same MAC in multiple MAC-VRFs on different PEs. This is quite useful in highly multi-tenant environments where control on the MAC addresses is nigh impossible. This is a net fix of the behavior where MAC-VRFs cannot affect each other. For reference, FRRouting does scope Mac Mobility to the MAC-VRF. But the main reason this was done is to fix the quadratic complexity that comes when there are many times the same MAC in many different MAC-VRFs, by not needing to iterate through all the routes for other unrelated MAC-VRFs. A quick benchmark was done by adding type-2 routes through the gRPC API. For 20k routes, on my laptop (Ryzen 5 PRO 4650U): - without this patch * clean run: 228 seconds (3min 48s), starting around 1200 routes/s and decreasing (non linearily) to less than 50 routes/s * rerun a second time: 424 seconds (7min 4s), stuck at around 50 routes/s - with this patch, regardless of the whether the process is empty or not: 7.7s seconds, at 2700 routes/s the whole time It is to be noted that, without the patch, the whole route ingestion will be throttled by type-2 route ingestion.
c438a2c
to
7d2d020
Compare
rebased |
Thanks! |
The mac mobility code used lookup tables to find the routes to the MAC address. It only considered the MAC itself, and not the associated MAC-VRF. This led to the complexity still being way too high when one MAC is present in all MAC-VRFs, like a virtual router or an anycast gateway.
This patch now considers the MAC-VRF for a route by looking up its route targets, and considers them all if multiple are present. This way, the lookup for routes that will participate in mac-mobility will be limited to the MAC-VRF. This allows to not blow up the CPU when a MAC address is present in a lot of MAC-VRFs.
The biggest behavioral change is now how mac-mobility is sharded. For example, it can now have the same MAC in multiple MAC-VRFs on different PEs. This is quite useful in highly multi-tenant environments where control on the MAC addresses is nigh impossible. This is a net fix of the behavior where MAC-VRFs cannot affect each other. For reference, FRRouting does scope Mac Mobility to the MAC-VRF.
But the main reason this was done is to fix the quadratic complexity that comes when there are many times the same MAC in many different MAC-VRFs, by not needing to iterate through all the routes for other unrelated MAC-VRFs.
A quick benchmark was done by adding type-2 routes through the gRPC API. For 20k routes, on my laptop (Ryzen 5 PRO 4650U):
It is to be noted that, without the patch, the whole route ingestion will be throttled by type-2 route ingestion.
Note: this PR is both a performance and a correctness fix. It is built atop #2804 which is another correctness fix, hence the two commits. Please keep the comments for the first commit in the other pull request