BMPServer in topotests should not pkill -f bmpserver #17465

Open
2 tasks done
donaldsharp opened this issue Nov 19, 2024 · 0 comments

Description

The bmpserver shutdown does a pkill -f bmpserver, which kills every process whose command line matches that name. FRR topotests run multiple tests at the same time, and there are currently 2 tests that use the bmpserver. If they happen to be running at the same time and one test finishes before the other, the first test kills the second test's bmpserver, so the second test cannot finish properly.

lib/topogen.py:        self.run("pkill -f bmpserver")
sharpd@eva ~/f/t/topotests (more_found_connection_conversion_issues)> git grep add_bmp_server
bgp_bmp/test_bgp_bmp.py:    tgen.add_bmp_server("bmp1", ip="192.0.2.10", defaultRoute="via 192.0.2.1")
bgp_bmp_vrf/test_bgp_bmp_vrf.py:    tgen.add_bmp_server("bmp1", ip="192.0.2.10", defaultRoute="via 192.0.2.1")
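
To make the failure mode concrete, here is a minimal sketch of how pkill -f selects its targets: the pattern is matched as a regular expression against each process's full command line, with no notion of which test started the process. The process table, PIDs, paths, and the helper name pkill_f_matches are all hypothetical and only illustrate the matching rule; this is not code from topogen.py.

```python
import re


def pkill_f_matches(pattern, cmdlines):
    """Return the PIDs that a `pkill -f pattern` would select.

    With -f, the pattern is matched against the full command line,
    so any process mentioning the pattern anywhere is a target.
    """
    regex = re.compile(pattern)
    return [pid for pid, cmdline in cmdlines.items() if regex.search(cmdline)]


# Hypothetical process table while two BMP topotests run in parallel.
procs = {
    1001: "python3 /tmp/topotests/bgp_bmp/bmpserver.py -a 192.0.2.10",
    1002: "python3 /tmp/topotests/bgp_bmp_vrf/bmpserver.py -a 192.0.2.10",
    1003: "/usr/lib/frr/bgpd -N r1",
    1004: "tail -f bmp1/bmpserver.log",
}

# Both tests' servers match, and so does an unrelated process that
# merely has "bmpserver" somewhere in its arguments.
victims = pkill_f_matches("bmpserver", procs)
print(victims)
```

Whichever test reaches its shutdown first would therefore signal the other test's server as well.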

I repeatedly see bgp_bmp failing to run properly locally.

The failing test has this in its bmp1/bmpserver.log:

[2024-11-19 14:43:59] Got message type: <class 'bmp.BMPRouteMonitoring'> 84
[2024-11-19 14:43:59] Got message type: <class 'bmp.BMPRouteMonitoring'> 85
[2024-11-19 14:43:59] Finished dissecting data from ('192.0.2.1', 51660)
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.
[2024-11-19 14:43:59] Received signal 15, shutting down.

The exec.log has this at that time:

2024-11-19 14:43:59,059 DEBUG: r1: vtysh result:
        {
         "vrfId": 0,
         "vrfName": "default",
         "tableVersion": 5,
         "routerId": "192.168.0.1",
         "defaultLocPrf": 100,
         "localAS": 65501,
         "routes": {  "routeDistinguishers" : { "444:2" : { "172.31.0.15/32": [{"valid":true,"bestpath":true,"selectionReason":"First path received","pathFrom":"external","prefix":"172.31.0.15","prefixLen":32,"network":"172.31.0.15/32","version":5,"metric":0,"weight":0,"peerId":"192.168.0.2","path":"65502","origin":"IGP","nexthops":[{"ip":"192.168.0.2","hostname":"r2","afi":"ipv4","used":true}]}]
         }  }  }  }
2024-11-19 14:43:59,059 DEBUG: topo: 'router_json_cmp' succeeded after 0.01 seconds
2024-11-19 14:43:59,059 DEBUG: topo: 'router_json_cmp' polling started (interval 1 secs, maximum 30 tries)
2024-11-19 14:43:59,059 DEBUG: r1: vtysh command => 'show bgp ipv6 vpn json'
2024-11-19 14:43:59,059 DEBUG: r1: cmd_status("/bin/bash -c 'vtysh  -c '"'"'show bgp ipv6 vpn json'"'"' 2>/dev/null'")
2024-11-19 14:43:59,073 DEBUG: r1:
        stdout: ...

pkill is run at this time:

2024-11-19 14:44:31,563 DEBUG: bmp1: cmd_status("/bin/bash -c 'pkill -f bmpserver'")

Version

latest master

How to reproduce

Run multiple bmp tests at the same time.

Expected behavior

One bmp test should not kill another bmp test's mojo.

Actual behavior

mojo killed

Additional context

No response

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.
@donaldsharp donaldsharp added the triage Needs further investigation label Nov 19, 2024
pguibert6WIND pushed a commit to pguibert6WIND/frr that referenced this issue Nov 20, 2024
Multiple BMP tests can run in parallel but, when one instance ends,
it kills the BMP server process of all BMP tests.

Save the PID of a BMP server and only kill it at the end.

Link: FRRouting#17465
Fixes: 875511c ("topotests: add basic bmp collector")
Signed-off-by: Louis Scalbert <[email protected]>
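
The approach described in the commit message (remember the PID of the server you started and signal only that PID) can be sketched as follows. This is an illustration under stated assumptions, not the actual FRR change: the ServerHandle class is hypothetical, and a sleeping Python process stands in for the real bmpserver.

```python
import signal
import subprocess
import sys


class ServerHandle:
    """Hypothetical per-test wrapper that owns exactly one server process."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(argv)
        self.pid = self.proc.pid  # saved so shutdown targets only this process

    def stop(self, timeout=5):
        # Signal our own PID instead of pattern-matching on a name.
        self.proc.send_signal(signal.SIGTERM)
        try:
            self.proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            # Escalate only for this process if it ignores SIGTERM.
            self.proc.kill()
            self.proc.wait()


# Stand-in for a long-running server; not the real bmpserver command line.
DUMMY = [sys.executable, "-c", "import time; time.sleep(60)"]

first = ServerHandle(DUMMY)   # server of the test that finishes first
second = ServerHandle(DUMMY)  # server of a test that is still running

first.stop()
first_exited = first.proc.poll() is not None
second_alive = second.proc.poll() is None  # unaffected by first.stop()

second.stop()  # clean up the stand-in process
```

Because each handle signals only the PID it recorded at start, the order in which parallel tests finish no longer matters.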
sougata-github-nvidia pushed a commit to sougata-github-nvidia/frr that referenced this issue Dec 16, 2024