BGP BFD strict mode is not working #16186

Closed
2 tasks done
pbrisset opened this issue Jun 7, 2024 · 2 comments

pbrisset commented Jun 7, 2024

Description

Expectation: When BFD goes down or is set to admin down, the associated BGP session must go down and stay down until BFD is back up.

Outcome: Currently, the associated BGP session goes down and is then re-established.
 
Debugging:
BGP FSM outputs:
 
2024/06/07 16:50:53.747 BGP: [ZWCSR-M7FG9] 192.168.1.1 [FSM] Receive_NOTIFICATION_message (Established->Clearing), fd 25
2024/06/07 16:50:53.757 BGP: [ZWCSR-M7FG9] 192.168.1.1 [FSM] Clearing_Completed (Clearing->Idle), fd -1
2024/06/07 16:50:55.758 BGP: [ZQTB5-H8522] 192.168.1.1 [FSM] Timer (start timer expire).
2024/06/07 16:50:55.758 BGP: [ZWCSR-M7FG9] 192.168.1.1 [FSM] BGP_Start (Idle->Connect), fd -1
2024/06/07 16:50:55.758 BGP: [G0837-S7QES] 192.168.1.1 [FSM] Non blocking connect waiting result, fd 25
2024/06/07 16:50:55.759 BGP: [ZWCSR-M7FG9] 192.168.1.1 [FSM] TCP_connection_open (Active->OpenSent), fd 27
2024/06/07 16:50:55.759 BGP: [ZWCSR-M7FG9] 192.168.1.1 [FSM] TCP_connection_open (Connect->OpenSent), fd 25
2024/06/07 16:50:55.759 BGP: [ZWCSR-M7FG9] 192.168.1.1 [FSM] BGP_Stop (OpenSent->Idle), fd 27
2024/06/07 16:50:55.760 BGP: [ZWCSR-M7FG9] 192.168.1.1 [FSM] Receive_OPEN_message (OpenSent->OpenConfirm), fd 25
2024/06/07 16:50:55.760 BGP: [ZWCSR-M7FG9] 192.168.1.1 [FSM] Receive_KEEPALIVE_message (OpenConfirm->Established), fd 25
2024/06/07 16:50:55.761 BGP: [P3D3N-3277A] 192.168.1.1 [FSM] Timer (routeadv timer expire)
2024/06/07 16:50:56.911 BGP: [P3D3N-3277A] 192.168.1.1 [FSM] Timer (routeadv timer expire)
 
The BGP session goes from Established to Clearing to Idle.
A start timer is then launched. On expiry, BGP_Start is triggered. That function does not check the BFD state, nor any notification received from the peer router, that should keep the FSM in Idle. The only existing check is for PEER_FLAG_PASSIVE.
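
To make the missing check concrete, here is a minimal Python model of the start-timer decision. It is not bgpd code and not the actual patch: `Peer`, `on_start_timer_expiry`, and the `guard_bfd` switch are hypothetical names, and treating a passive peer as moving to Active, and Admin Down as the only blocking BFD state, are assumptions of the sketch.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class BgpState(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"


class BfdState(Enum):
    UP = "Up"
    DOWN = "Down"
    ADMIN_DOWN = "Admin Down"


@dataclass
class Peer:
    # Hypothetical stand-in for the per-neighbor state; not the bgpd struct.
    passive: bool = False
    bfd_state: Optional[BfdState] = None  # None means BFD is not configured


def on_start_timer_expiry(peer: Peer, guard_bfd: bool) -> BgpState:
    """Return the state the peer moves to when the start timer fires in Idle.

    guard_bfd=False models the reported behavior (only the passive flag is
    checked); guard_bfd=True models the expected behavior.
    """
    if guard_bfd and peer.bfd_state is BfdState.ADMIN_DOWN:
        # Assumed fix: refuse to start while BFD is administratively down.
        return BgpState.IDLE
    if peer.passive:
        # Assumption: a passive peer waits for the remote side to connect.
        return BgpState.ACTIVE
    return BgpState.CONNECT
```

With `guard_bfd=False`, a peer whose BFD session is Admin Down still leaves Idle, which matches the Idle->Connect transition in the log above.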

Here are the messages from BFD:

2024/06/07 16:50:53.746 BGP: [Q4BCV-6FHZ5] zclient_bfd_session_update: 0.0.0.0/32 -> 192.168.1.1/32 (interface torm11-eth0) VRF default(0) (CPI bit no): Admin Down
2024/06/07 16:50:53.746 BGP: [MKVHZ-7MS3V] bfd_session_status_update: neighbor 192.168.1.1 vrf default(0) bfd state Up -> Admin Down
2024/06/07 16:50:53.746 BGP: [QFMSE-NPSNN] zclient_bfd_session_update: sessions updated: 1

2024/06/07 16:50:53.747 BGP: [HZN6M-XRM1G] %NOTIFICATION(Hard Reset): received from neighbor 192.168.1.1 6/10 (Cease/BFD Down) 0 bytes
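
The symptom can also be checked mechanically from the debug log. The sketch below scans log lines for a BFD transition to Down or Admin Down followed by an FSM transition into Established. The regular expressions are written against the line formats quoted in this report only; `reestablished_after_bfd_down` and `SAMPLE` are hypothetical names, and other FRR versions may format these lines differently.

```python
import re

# Matches e.g. "[FSM] BGP_Start (Idle->Connect), fd -1"
FSM_RE = re.compile(r"\[FSM\] \S+ \((?P<old>\w+)->(?P<new>\w+)\)")
# Matches e.g. "bfd state Up -> Admin Down"
BFD_RE = re.compile(r"bfd state (?P<old>[\w ]+?) -> (?P<new>[\w ]+)$")


def reestablished_after_bfd_down(lines):
    """Return True if the FSM reaches Established while BFD is (Admin) Down."""
    bfd_down = False
    for raw in lines:
        line = raw.strip()
        bfd = BFD_RE.search(line)
        if bfd:
            # Track the most recent BFD state reported to bgpd.
            bfd_down = bfd.group("new") in ("Down", "Admin Down")
            continue
        fsm = FSM_RE.search(line)
        if fsm and bfd_down and fsm.group("new") == "Established":
            return True
    return False


# A few lines copied from the logs in this report, in timestamp order.
SAMPLE = [
    "2024/06/07 16:50:53.746 BGP: [MKVHZ-7MS3V] bfd_session_status_update: "
    "neighbor 192.168.1.1 vrf default(0) bfd state Up -> Admin Down",
    "2024/06/07 16:50:53.757 BGP: [ZWCSR-M7FG9] 192.168.1.1 [FSM] "
    "Clearing_Completed (Clearing->Idle), fd -1",
    "2024/06/07 16:50:55.758 BGP: [ZWCSR-M7FG9] 192.168.1.1 [FSM] "
    "BGP_Start (Idle->Connect), fd -1",
    "2024/06/07 16:50:55.760 BGP: [ZWCSR-M7FG9] 192.168.1.1 [FSM] "
    "Receive_KEEPALIVE_message (OpenConfirm->Established), fd 25",
]

if __name__ == "__main__":
    print(reestablished_after_bfd_down(SAMPLE))
```

The function assumes the lines are fed in timestamp order, so FSM and BFD messages from separate captures need to be merged and sorted first.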

Version

FRR 8.5.1

The problem also seems to be present in master.

How to reproduce

Using topotest, I configure BFD and BGP like this on back-to-back (B2B) devices:

bfd
 profile foo
 exit
 !
 peer 192.168.1.1 interface torm11-eth0
  profile foo
  receive-interval 299
 exit
 !
exit

router bgp 65011
 bgp router-id 192.168.100.15
 neighbor 192.168.1.1 remote-as external
 neighbor 192.168.1.1 bfd
 neighbor 192.168.1.1 bfd profile foo

Expected behavior

When adding the "shutdown" command to BFD "profile foo", I expect the BGP session to 192.168.1.1 to go down and remain down. Instead, it is only reset.
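
"Remain down" needs a hold period to be testable, because the session drops first in both the correct and the buggy case. A minimal sketch of such a check follows. `stays_down` is a hypothetical helper, not part of the topotest library, and `get_state` is assumed to be any callable that returns the current BGP state string for the neighbor (in a topotest it would wrap whatever query the test already uses).

```python
import time
from typing import Callable


def stays_down(
    get_state: Callable[[], str],
    hold_seconds: float,
    interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if get_state() never reports Established during the hold.

    The state is polled every `interval` seconds for roughly `hold_seconds`.
    `sleep` is injectable so the helper can be exercised without waiting.
    """
    checks = max(1, int(hold_seconds / interval))
    for _ in range(checks):
        if get_state() == "Established":
            return False
        sleep(interval)
    return True
```

In the logs above the session is back in Established about two seconds after it drops, so the hold period should be comfortably longer than that for the check to catch the bug.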

Actual behavior

As shown by the FSM state transitions in the description, the session is re-established about two seconds after it goes down.

Additional context

The only similar open issue I found is #14266, but it is not quite the same.

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.
@pbrisset pbrisset added the triage Needs further investigation label Jun 7, 2024
@ton31337 ton31337 added the bgp label Jun 10, 2024
@ton31337 ton31337 self-assigned this Jun 10, 2024
@ton31337 ton31337 added bug and removed triage Needs further investigation labels Jun 11, 2024
ton31337 added a commit to opensourcerouting/frr that referenced this issue Jun 11, 2024
If we do:

```
bfd
 profile foo
  shutdown
```

The session is dropped, but immediately established again because we don't
have a proper check on BFD.

If BFD is administratively shutdown, ignore starting the session.

Fixes: FRRouting#16186

Signed-off-by: Donatas Abraitis <[email protected]>
@ton31337
Member

Could you test this patch #16194?

@pbrisset
Author

> Could you test this patch #16194?

Thanks for the bugfix. It seems to work very well. Nice work!

@riw777 riw777 closed this as completed in 1fb48f5 Jun 18, 2024