Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zebra: Fix to avoid two Vrfs with same table ids #16350

Merged

Conversation

raja-rajasekar
Copy link
Contributor

During internal testing, when the following sequence is followed, two non default vrfs end up pointing to the same table-id

  • Initially vrf201 has table id 1002
  • ip link add dev vrf202 type vrf table 1002
  • ip link set dev vrf202 up
  • ip link set dev master vrf202

This will ideally lead to zebra exit since this is a misconfiguration as expected.

However if we perform a restart frr.service at this point, we end up having two vrfs pointing to same table-id and bad things can happen.

root@mlx-3700-20:mgmt:/var/log/frr# sudo vtysh -c "sh vrf" vrf mgmt id 37 table 1001
vrf vrf201 id 46 table 1002
vrf vrf202 id 59 table 1002 >>>>

Fix: in all cases of misconfiguration, exit zebra as expected.

Ticket :#3970414

@raja-rajasekar
Copy link
Contributor Author

Testing

Without Fix:

root@mlx-3700-20:mgmt:/var/log/frr# vtysh --help | grep version
Integrated shell for FRR (version 10.2-dev-MyOwnFRRVersion).
root@mlx-3700-20:mgmt:/var/log/frr# sudo vtysh -c "sh vrf"
vrf mgmt id 37 table 1001
vrf vrf201 id 46 table 1002


ip link add dev vrf202 type vrf table 1002
ip link set dev vrf202 up
ip link set dev <intrerface> master vrf202
This will trigger zebra exit as expected

root@mlx-3700-20:mgmt:/var/log/frr# sudo vtysh -c "sh vrf"
zebra is not running

root@mlx-3700-20:mgmt:/var/log/frr# date
Fri Jul  5 10:26:11 PM UTC 2024

root@mlx-3700-20:mgmt:/var/log/frr# cat raja.log | grep overlap
2024/07/05 22:25:39.822632 ZEBRA: [Q9Z6Y-66B7H][EC 4043309160] VRF vrf202 id 59 table id overlaps existing vrf vrf201(46), misconfiguration exiting


root@mlx-3700-20:mgmt:/var/log/frr# sudo systemctl restart frr

root@mlx-3700-20:mgmt:/var/log/frr# sudo systemctl status frr
● frr.service - FRRouting
     Loaded: loaded (/lib/systemd/system/frr.service; enabled; preset: enabled)
     Active: active (running) since Fri 2024-07-05 22:27:52 UTC; 6s ago
       Docs: https://frrouting.readthedocs.io/en/latest/setup.html
    Process: 830168 ExecStart=/usr/lib/frr/frrinit.sh start (code=exited, status=0/SUCCESS)
     Status: "FRR Operational"
      Tasks: 15 (limit: 18738)
     Memory: 32.0M
        CPU: 497ms
     CGroup: /system.slice/frr.service
             ├─830181 /usr/lib/frr/watchfrr -d -F datacenter zebra mgmtd bgpd staticd
             ├─830193 /usr/lib/frr/zebra -d -F datacenter
             ├─830198 /usr/lib/frr/mgmtd -d -F datacenter
             ├─830200 /usr/lib/frr/bgpd -d -F datacenter
             └─830207 /usr/lib/frr/staticd -d -F datacenter
……
root@mlx-3700-20:mgmt:/var/log/frr# sudo vtysh -c "sh vrf"
vrf mgmt id 37 table 1001
vrf vrf201 id 46 table 1002
vrf vrf202 id 59 table 1002 >>>> THIS IS WRONG.

With Fix:

root@mlx-3700-20:mgmt:/var/log/frr# sudo vtysh -c "sh vrf"
vrf mgmt id 37 table 1001
vrf vrf201 id 46 table 1002
root@mlx-3700-20:mgmt:/var/log/frr# sudo vtysh -c "sh vrf"
zebra is not running
root@mlx-3700-20:mgmt:/var/log/frr# date
Fri Jul  5 10:31:55 PM UTC 2024
root@mlx-3700-20:mgmt:/var/log/frr# cat raja.log | grep overlap
2024/07/05 22:31:38.087296 ZEBRA: [Q9Z6Y-66B7H][EC 4043309160] VRF vrf202 id 61 table id overlaps existing vrf vrf201(46), misconfiguration exiting

root@mlx-3700-20:mgmt:/var/log/frr# sudo systemctl restart frr
root@mlx-3700-20:mgmt:/var/log/frr# sudo systemctl status frr
● frr.service - FRRouting
     Loaded: loaded (/lib/systemd/system/frr.service; enabled; preset: enabled)
     Active: active (running) since Fri 2024-07-05 22:33:24 UTC; 40s ago
       Docs: https://frrouting.readthedocs.io/en/latest/setup.html
    Process: 836554 ExecStart=/usr/lib/frr/frrinit.sh start (code=exited, status=0/SUCCESS)
     Status: "restarting all"
      Tasks: 7 (limit: 18738)
     Memory: 22.8M
        CPU: 987ms
     CGroup: /system.slice/frr.service
             ├─836568 /usr/lib/frr/watchfrr -d -F datacenter zebra mgmtd bgpd staticd
             ├─837440 /usr/lib/frr/mgmtd -d -F datacenter
             ├─837443 /usr/lib/frr/bgpd -d -F datacenter
             └─837450 /usr/lib/frr/staticd -d -F datacenter

root@mlx-3700-20:mgmt:/var/log/frr# sudo vtysh -c "sh vrf"
zebra is not running

@raja-rajasekar raja-rajasekar force-pushed the rajasekarr/table_id_for_2vrf_3970414 branch from 4e94167 to bb033c2 Compare July 5, 2024 23:20
zebra/interface.c Outdated Show resolved Hide resolved
@ton31337
Copy link
Member

ton31337 commented Jul 7, 2024

@Mergifyio backport dev/10.1 stable/10.0 stable/9.1 stable/9.0

@ton31337 ton31337 added this to the 10.1 milestone Jul 7, 2024
Copy link

mergify bot commented Jul 7, 2024

backport dev/10.1 stable/10.0 stable/9.1 stable/9.0

✅ Backports have been created

@raja-rajasekar raja-rajasekar force-pushed the rajasekarr/table_id_for_2vrf_3970414 branch from bb033c2 to a80298a Compare July 7, 2024 19:36
@raja-rajasekar raja-rajasekar requested a review from ton31337 July 7, 2024 19:42
@raja-rajasekar
Copy link
Contributor Author

ci:rerun

1 similar comment
@raja-rajasekar
Copy link
Contributor Author

ci:rerun


if (strcmp(name, vrf->name)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vrf can be null, so... segfault is guaranteed here, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My apologies; our internal code has the vrf NULL check missing, so I directly did strcmp.
I added the check appropriately now.

@raja-rajasekar raja-rajasekar force-pushed the rajasekarr/table_id_for_2vrf_3970414 branch from a80298a to e03cba8 Compare July 9, 2024 15:09
@github-actions github-actions bot added size/M and removed size/S labels Jul 9, 2024
@raja-rajasekar
Copy link
Contributor Author

ci:rerun

During internal testing, when the following sequence is followed, two
non default vrfs end up pointing to the same table-id

 - Initially vrf201 has table id 1002
 - ip link add dev vrf202 type vrf table 1002
 - ip link set dev vrf202 up
 - ip link set dev <intrerface> master vrf202

This will ideally lead to zebra exit since this is a misconfiguration as
expected.

However if we perform a restart frr.service at this point, we end up
having two vrfs pointing to same table-id and bad things can happen.
This is because in the interface_vrf_change, we incorrectly check for
vrf_lookup_by_id() to evaluate if there is a misconfig. This works well
for a non restart case but not for the startup case.

root@mlx-3700-20:mgmt:/var/log/frr# sudo vtysh -c "sh vrf"
vrf mgmt id 37 table 1001
vrf vrf201 id 46 table 1002
vrf vrf202 id 59 table 1002 >>>>

Fix: in all cases of misconfiguration, exit zebra as expected.

Ticket :#3970414

Signed-off-by: Donald Sharp <[email protected]>

Signed-off-by: Rajasekar Raja <[email protected]>
@raja-rajasekar raja-rajasekar force-pushed the rajasekarr/table_id_for_2vrf_3970414 branch from e03cba8 to c77e157 Compare July 12, 2024 21:31
@ton31337 ton31337 merged commit 659741f into FRRouting:master Jul 14, 2024
11 checks passed
Jafaral added a commit that referenced this pull request Jul 15, 2024
zebra: Fix to avoid two Vrfs with same table ids (backport #16350)
ton31337 added a commit that referenced this pull request Jul 16, 2024
zebra: Fix to avoid two Vrfs with same table ids (backport #16350)
Jafaral added a commit that referenced this pull request Jul 25, 2024
zebra: Fix to avoid two Vrfs with same table ids (backport #16350)
@raja-rajasekar raja-rajasekar deleted the rajasekarr/table_id_for_2vrf_3970414 branch August 30, 2024 06:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants