Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mp info changes #16961

Merged
merged 5 commits into from
Oct 2, 2024
Merged

Mp info changes #16961

merged 5 commits into from
Oct 2, 2024

Conversation

donaldsharp
Copy link
Member

See last commit for full data. Effectively though, mp_info is expensive this reduces the cost by a decent amount.

bgpd/bgp_mpath.h Outdated Show resolved Hide resolved
@riw777 riw777 self-requested a review October 1, 2024 15:25
The mpath data structure has data that is only relevant
for the first mpath in the list.  It is not being used
anywhere else.  Let's document that a bit more.

Signed-off-by: Donald Sharp <[email protected]>
Test was confusing.  Add some useful data and clean up some debugs

Signed-off-by: Donald Sharp <[email protected]>
Currently bgp multipath has these properties:

a) mp_info may or may not be on a single path, based
upon path perturbations in the past.
b) mp_info->count started counting at 0( meaning 1 ).  As that the
bestpath path_info was never included in the count
c) The first mp_info in the list held the multipath data associated
with the multipath.  As such if you were at any other node that data
was not filled in.
d) As such the mp_info's that are not first on the list basically
were just pointers to the corresponding bgp_path_info that was in
the multipath.
e) On bestpath calculation, a linklist(struct linklist *) of bgp_path_info's was
created.
f) This linklist was passed in to a comparison function that took the
old mpinfo list and compared it item by item to the linklist and
doing magic to figure out how to create a new mp_info list.
g) the old mp_info and the link list had to be memory managed and
freed up.
h) BGP_PATH_MULTIPATH is only set on non bestpath nodes in the
multipath.

This is really complicated.  Let's change the algorithm to this:

a) When running bestpath, mark a bgp_path_info node that could be in the ecmp path as
BGP_PATH_MULTIPATH_NEW.
b) When running multipath, just walk the list of bgp_path_info's and if
it has BGP_PATH_MULTIPATH_NEW on it, decide if it is in BGP_MULTIPATH.
If we run out of space to put in the ecmp, clear the flag on the rest.
c) Clean up the counting of sometimes adding 1 to the mpath count.
d) Only allocate a mpath_info node for the bestpath.  Clean it up
when done with it.
e) remove the unneeded list management associated with the linklist and
the mp_list.

This greatly simplifies multipath computation for bgp and reduces memory
load for large scale deployments.

2 full feeds in work_queue_run prior:

    0      56367.471      1123    50193    493695    50362    493791         0         0          0    TE   work_queue_run

BGP multipath info            :  1941844     48   110780992  1941844 110780992

2 full feeds in work_queue_run after change:

    1      52924.931      1296    40837    465968    41025    487390         0         0          1    TE   work_queue_run

BGP multipath info            :   970860     32    38836880   970866  38837120

Aproximately 4 seconds of saved cpu time for convergence and ~75 mb
smaller run time.

Signed-off-by: Donald Sharp <[email protected]>
This function is no doing any work.  Let's remove.

Signed-off-by: Donald Sharp <[email protected]>
Copy link
Member

@ton31337 ton31337 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just one question to be sure I got the point. Did I understand this correctly that we get rid of managing bgp_path_info_mpath linked-list at all and just using best path's information for all the multi paths?

@ton31337 ton31337 added this to the 10.2 milestone Oct 2, 2024
@donaldsharp
Copy link
Member Author

yes exactly. The only mp_info allocated is for the bestpath bgp_path_info node.

@ton31337 ton31337 merged commit 9f9d240 into FRRouting:master Oct 2, 2024
11 checks passed
lguohan pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Dec 19, 2024
Why I did it
Added patches from FRR to support scale of BGP neighbors to 256/514. Below are the patches

Patch	FRR Pull request
0069-lib-nexthop-code-should-use-uint16_t-for-nexthop-cou.patch	FRRouting/frr#16967
0070-Allow-16-bit-size-for-nexthops.patch	FRRouting/frr#17023
0071-zebra-Only-notify-dplane-work-pthread-when-needed.patch	FRRouting/frr#17062
0072-Fix-up-improper-handling-of-nexthops-for-nexthop-tra.patch	FRRouting/frr#17076
0073-remove-in6addr-cmp.patch	FRRouting/frr#17312
0074-bgp-best-port-reordering.patch	FRRouting/frr#15572
0075-bgp-mp-info-changes.patch	FRRouting/frr#16961
0076-Optimizations-and-problem-fixing-for-large-scale-ecmp-from-bgp.patch	FRRouting/frr#17229
dgsudharsan added a commit to dgsudharsan/sonic-buildimage that referenced this pull request Dec 23, 2024
…net#21199)

Why I did it
Added patches from FRR to support scale of BGP neighbors to 256/514. Below are the patches

Patch	FRR Pull request
0069-lib-nexthop-code-should-use-uint16_t-for-nexthop-cou.patch	FRRouting/frr#16967
0070-Allow-16-bit-size-for-nexthops.patch	FRRouting/frr#17023
0071-zebra-Only-notify-dplane-work-pthread-when-needed.patch	FRRouting/frr#17062
0072-Fix-up-improper-handling-of-nexthops-for-nexthop-tra.patch	FRRouting/frr#17076
0073-remove-in6addr-cmp.patch	FRRouting/frr#17312
0074-bgp-best-port-reordering.patch	FRRouting/frr#15572
0075-bgp-mp-info-changes.patch	FRRouting/frr#16961
0076-Optimizations-and-problem-fixing-for-large-scale-ecmp-from-bgp.patch	FRRouting/frr#17229
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bgp master size/XXL tests Topotests, make check, etc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants