Skip to content

Commit

Permalink
bgpd: avoid clearing routes for peers that were never established
Browse files Browse the repository at this point in the history
Under heavy system load with many peers in passive mode and a large
number of routes, bgpd can enter an infinite loop. This occurs while
processing timeout BGP_OPEN messages, which prevents it from accepting
new connections. The following log entries illustrate the issue:
>bgpd[6151]: [VX6SM-8YE5W][EC 33554460] 3.3.2.224: nexthop_set failed, resetting connection - intf 0x0
>bgpd[6151]: [P790V-THJKS][EC 100663299] bgp_open_receive: bgp_getsockname() failed for peer: 3.3.2.224
>bgpd[6151]: [HTQD2-0R1WR][EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 3.3.2.224
... repeating

The issue occurs when bgpd handles a massive number of routes in the RIB
while receiving numerous BGP_OPEN packets. If bgpd is overloaded, it
fails to process these packets promptly, leading the remote peer to
close the connection and resend BGP_OPEN packets.

When bgpd eventually starts processing these timeout BGP_OPEN packets,
it finds the TCP connection closed by the remote peer, resulting in
"bgp_stop()" being called. For each timeout peer, bgpd must iterate
through the routing table, which is time-consuming and causes new
incoming BGP_OPEN packets to timeout, perpetuating the infinite loop.

To address this issue, the code is modified to check if the peer has
been established at least once before calling "bgp_clear_route_all()".
This ensures that routes are only cleared for peers that had a
successful session, preventing unnecessary iterations over the routing
table for peers that never established a connection.

With this change, BGP_OPEN timeout messages may still occur, but in the
worst case, bgpd will stabilize. Before this patch, bgpd could enter a
loop where it was unable to accpet any new connections.

Signed-off-by: Loïc Sang <[email protected]>
  • Loading branch information
Loïc Sang committed Jun 26, 2024
1 parent bf5f0f1 commit e0ae285
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion bgpd/bgp_fsm.c
Original file line number Diff line number Diff line change
Expand Up @@ -1241,7 +1241,7 @@ void bgp_fsm_change_status(struct peer_connection *connection,
/* Transition into Clearing or Deleted must /always/ clear all routes..
* (and must do so before actually changing into Deleted..
*/
if (status >= Clearing) {
if (status >= Clearing && (peer->established || peer == bgp->peer_self)) {
bgp_clear_route_all(peer);

/* If no route was queued for the clear-node processing,
Expand Down

0 comments on commit e0ae285

Please sign in to comment.