
[BUG] Extremely slow "add policy route" #4822

Open
zsxsoft opened this issue Dec 12, 2024 · 14 comments
Labels
bug (Something isn't working) · performance (Anything that can make Kube-OVN faster) · subnet · vpc

Comments

@zsxsoft

zsxsoft commented Dec 12, 2024

Kube-OVN Version

v1.12.28

Kubernetes Version

v1.31.2

Operation-system/Kernel Version

TencentOS Server 4.2
6.6.47-12.tl4.x86_64

Description

This issue covers two problems.

I have a cluster with 10 nodes, 260 subnets in 1 vpc, ~5k ports. Today, I discovered that my ovs-ovn on some nodes was killed due to OOM. Therefore, I increased the memory limit and restarted the kube-ovn-controller.

Then I found that the Work Queue Latency remained at a very high level (>10 min).
[screenshot: Work Queue Latency panel showing >10 min latency]

I noticed in the logs that the controller was continuously performing "add policy route" operations at a very slow pace (approximately 1-3 seconds per entry). This is the first problem.

[screenshot: controller logs of slow "add policy route" entries]

I understand that after restarting the KubeOVN controller, it needs to traverse all 10 nodes and 260 subnets. I expected the number of add policy route operations to be ~2600.

[root@vm-master-1 a]# cat 2.log | grep 'add policy route' | wc -l
3558

However, after waiting for a long time, I found that the actual number far exceeded it, and there appeared to be a large number of duplicate operations (same node, same subnet, executed twice).

[root@vm-master-1 a]# cat 2.log | grep 'add policy route' | grep 'net.a'
I1212 17:05:13.323328       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.1_ip4, action reroute, extrenalID map[node:node-1 subnet:net-a vendor:kube-ovn]
I1212 17:11:55.754750       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.2_ip4, action reroute, extrenalID map[node:node-2 subnet:net-a vendor:kube-ovn]
I1212 17:14:03.134696       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.4_ip4, action reroute, extrenalID map[node:node-4 subnet:net-a vendor:kube-ovn]
I1212 17:19:21.932002       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.3_ip4, action reroute, extrenalID map[node:node-3 subnet:net-a vendor:kube-ovn]
I1212 17:21:50.341122       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.vm.master.1_ip4, action reroute, extrenalID map[node:vm-master-1 subnet:net-a vendor:kube-ovn]
I1212 17:23:18.262599       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.9_ip4, action reroute, extrenalID map[node:node-9 subnet:net-a vendor:kube-ovn]
I1212 17:31:00.166875       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.5_ip4, action reroute, extrenalID map[node:node-5 subnet:net-a vendor:kube-ovn]
I1212 17:33:28.833554       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.8_ip4, action reroute, extrenalID map[node:node-8 subnet:net-a vendor:kube-ovn]
I1212 17:34:44.164926       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.1_ip4, action reroute, extrenalID map[node:node-1 subnet:net-a vendor:kube-ovn]
I1212 17:34:46.367902       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.2_ip4, action reroute, extrenalID map[node:node-2 subnet:net-a vendor:kube-ovn]
I1212 17:34:49.141808       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.3_ip4, action reroute, extrenalID map[node:node-3 subnet:net-a vendor:kube-ovn]
I1212 17:34:51.828600       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.vm.master.1_ip4, action reroute, extrenalID map[node:vm-master-1 subnet:net-a vendor:kube-ovn]
I1212 17:34:54.537169       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.4_ip4, action reroute, extrenalID map[node:node-4 subnet:net-a vendor:kube-ovn]
I1212 17:34:57.445022       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.8_ip4, action reroute, extrenalID map[node:node-8 subnet:net-a vendor:kube-ovn]
I1212 17:35:00.533108       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.5_ip4, action reroute, extrenalID map[node:node-5 subnet:net-a vendor:kube-ovn]
I1212 17:35:03.406251       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.6_ip4, action reroute, extrenalID map[node:node-6 subnet:net-a vendor:kube-ovn]
I1212 17:35:06.634137       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.7_ip4, action reroute, extrenalID map[node:node-7 subnet:net-a vendor:kube-ovn]
I1212 17:35:09.738121       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.9_ip4, action reroute, extrenalID map[node:node-9 subnet:net-a vendor:kube-ovn]
I1212 17:36:34.646594       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.6_ip4, action reroute, extrenalID map[node:node-6 subnet:net-a vendor:kube-ovn]
I1212 17:44:43.345692       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.7_ip4, action reroute, extrenalID map[node:node-7 subnet:net-a vendor:kube-ovn]

Now I'm unable to create new subnets, so I plan to wait overnight and check again the next day to see if the operations have completed. If more information is needed, please contact me.

Steps To Reproduce

/

Current Behavior

/

Expected Behavior

/

@zsxsoft added the "bug" label Dec 12, 2024
@zsxsoft changed the title from "[BUG] Creating multiple" to "[BUG] Extremely slow 'add policy route'" Dec 12, 2024

dosubot bot commented Dec 12, 2024

I found a similar closed issue that might be relevant to your problem: underlay subnet has an unexpected policy route. This issue was resolved by a pull request that fixed the logical router policy for the default subnet with logical gateway enabled [1].

Regarding the duplicate "add policy route" operations you're experiencing, Kube-OVN has a mechanism to handle policy routes and prevent duplicate entries. It uses functions like diffPolicyRouteWithExisted and diffPolicyRouteWithLogical to compare existing policy routes with target routes, determining which routes need to be added or deleted. This ensures that only necessary policy routes are added or removed, effectively handling duplicate entries by maintaining a single valid entry for each route [2][3].
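
A minimal sketch of that diff idea (illustrative Go, not the actual diffPolicyRouteWithExisted implementation): compare existing and target routes by key and return what to add and what to delete.

package main

import "fmt"

// diffPolicyRoutes compares the installed routes with the desired ones and
// returns the sets to add and to delete; duplicate desired entries collapse
// into a single key.
func diffPolicyRoutes(existing, target []string) (toAdd, toDel []string) {
	existingSet := make(map[string]struct{}, len(existing))
	for _, e := range existing {
		existingSet[e] = struct{}{}
	}
	targetSet := make(map[string]struct{}, len(target))
	for _, t := range target {
		if _, ok := targetSet[t]; ok {
			continue // duplicate in the desired list, keep a single entry
		}
		targetSet[t] = struct{}{}
		if _, ok := existingSet[t]; !ok {
			toAdd = append(toAdd, t) // desired but not yet installed
		}
	}
	for _, e := range existing {
		if _, ok := targetSet[e]; !ok {
			toDel = append(toDel, e) // installed but no longer desired
		}
	}
	return toAdd, toDel
}

func main() {
	add, del := diffPolicyRoutes(
		[]string{"ip4.src == $net.a.node.1_ip4", "ip4.src == $net.a.node.2_ip4"},
		[]string{"ip4.src == $net.a.node.2_ip4", "ip4.src == $net.a.node.3_ip4"},
	)
	fmt.Println("add:", add, "del:", del)
}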

If you're seeing a large number of duplicate operations, it might be worth checking if these functions are being correctly utilized in your setup or if there might be a specific configuration issue causing this behavior.


@zsxsoft
Author

zsxsoft commented Dec 12, 2024

Checking the Work Queue Depth, I found the Subnet queue is decreasing very slowly:

18:12:00 AddSubnet 243
18:24:00 AddSubnet 242
18:32:30 AddSubnet 233
18:41:00 AddSubnet 242 (why did it go back up?)
18:52:30 AddSubnet 243 (huh?)
18:54:00 AddSubnet 240
18:59:30 AddSubnet 226 (decreasing now)
19:08:00 AddSubnet 202
19:14:00 AddSubnet 184

[screenshot: Work Queue Depth graph]

@oilbeater added the "performance" label Dec 13, 2024
@zsxsoft
Author

zsxsoft commented Dec 17, 2024

Any idea why "add policy route" is so slow?

I1212 17:11:55.754750       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.2_ip4, action reroute, extrenalID map[node:node-2 subnet:net-a vendor:kube-ovn]
I1212 17:14:03.134696       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.4_ip4, action reroute, extrenalID map[node:node-4 subnet:net-a vendor:kube-ovn]
I1212 17:19:21.932002       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.3_ip4, action reroute, extrenalID map[node:node-3 subnet:net-a vendor:kube-ovn]


dosubot bot commented Dec 17, 2024

The slow "add policy route" operation in Kube-OVN is primarily due to two issues:

  1. Execution Time: Each policy route addition takes 1-3 seconds, which significantly slows down the processing of the entire queue.

  2. Duplicate Operations: There are duplicate operations for the same node and subnet combinations, leading to more operations than expected. This is evident from the logs where the same node and subnet combinations are processed multiple times.

These factors contribute to the slow decrease in the "Work Queue Depth" for "Subnet," with occasional increases due to re-processing after events like a controller restart following an Out of Memory (OOM) issue [1].


@oilbeater
Collaborator

@zsxsoft This may be related to the logic for adding policies. When adding a policy, the kube-ovn-controller needs to check whether the policy already exists by performing a long string match. In your environment, each policy addition requires approximately 260 × 10 such checks, which amounts to a total of 260 × 260 × 10 checks.
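
A minimal sketch of why that becomes quadratic (illustrative Go, not the actual Kube-OVN code): each insert scans every policy installed so far, so building up n policies costs O(n²) string comparisons.

package main

import "fmt"

// addPolicyRoute mimics the existence check described above: every insert
// performs one string match against each already-installed policy.
func addPolicyRoute(existing []string, match string) []string {
	for _, e := range existing { // linear scan per insert
		if e == match {
			return existing // duplicate, skip
		}
	}
	return append(existing, match)
}

func main() {
	var policies []string
	for subnet := 0; subnet < 260; subnet++ {
		for node := 0; node < 10; node++ {
			match := fmt.Sprintf("ip4.src == $subnet%d.node%d_ip4", subnet, node)
			policies = addPolicyRoute(policies, match)
		}
	}
	// 2600 policies installed, ~3.4 million comparisons along the way.
	fmt.Println(len(policies))
}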

@zsxsoft
Author

zsxsoft commented Dec 19, 2024

@oilbeater
I checked this:

if err := c.addPolicyRouteToVpc(

It seems that each time addPolicyRouteToVpc is executed, a listLogicalRouterPoliciesByFilter call is required. I found that running kubectl ko nbctl lr-policy-list ovn-cluster | wc -l is quite slow, so I guess listLogicalRouterPoliciesByFilter is slow too.
Despite the presence of a whereCache, I suspect that each add operation may invalidate this cache. Given that addPolicyRoute needs to run count(node) × count(subnet) times, it might be beneficial to refactor the process to batch these API calls, roughly as sketched below.
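
A rough sketch of what batching could look like (hypothetical listPolicies/addPolicies callbacks standing in for the real OVN NB client calls, not Kube-OVN's actual API):

package main

import "fmt"

// batchAddPolicyRoutes lists the router's policies once, diffs in memory,
// and submits all missing policies together, instead of doing one
// list-and-add round trip per policy route.
func batchAddPolicyRoutes(
	listPolicies func() ([]string, error),
	addPolicies func([]string) error,
	wanted []string,
) error {
	existing, err := listPolicies() // single read for the whole batch
	if err != nil {
		return err
	}
	seen := make(map[string]struct{}, len(existing))
	for _, p := range existing {
		seen[p] = struct{}{}
	}
	var missing []string
	for _, w := range wanted {
		if _, ok := seen[w]; !ok {
			missing = append(missing, w)
		}
	}
	if len(missing) == 0 {
		return nil
	}
	return addPolicies(missing) // single write, ideally one transaction
}

func main() {
	err := batchAddPolicyRoutes(
		func() ([]string, error) { return []string{"ip4.src == $net.a.node.1_ip4"}, nil },
		func(ps []string) error { fmt.Println("adding:", ps); return nil },
		[]string{"ip4.src == $net.a.node.1_ip4", "ip4.src == $net.a.node.2_ip4"},
	)
	if err != nil {
		fmt.Println(err)
	}
}

This keeps the cost at one list per router per reconcile instead of one list per policy, which is what appears to dominate here.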

@zsxsoft
Author

zsxsoft commented Dec 19, 2024

I see #4538; is that patch useful? I didn't measure the time.

@oilbeater
Collaborator

@zsxsoft it should help. I tested v1.14.0 with this patch and v1.12.28 without this patch in a cluster with 5 nodes and 250 subnets.

For v1.14.0, a listLogicalRouterPoliciesByFilter call takes about 10 ms; in v1.12.28 it takes about 600 ms.

@zsxsoft
Author

zsxsoft commented Dec 20, 2024

@oilbeater Oh, I see; I read the wrong version of the code. I cherry-picked a6f13a6 into v1.12.30 with the help of b76c044, and "add policy route" is fast now.

But it still took 27 minutes to create 100 subnets (roughly 16 seconds per subnet). Is that expected?

[screenshot: subnet creation timing]

@zsxsoft
Author

zsxsoft commented Dec 20, 2024

Log:

It seems there is a delay of roughly 10-15 seconds before each ResetLogicalSwitchAclSuccess event (a quick way to measure these gaps is sketched after the logs):

I1220 12:35:30.629704       7 event.go:377] Event(v1.ObjectReference{Kind:"Subnet", Namespace:"", Name:"net-a", UID:"a0c58e11-f526-419b-b991-8a27a87d40c0", APIVersion:"kubeovn.io/v1", ResourceVersion:"7983760", FieldPath:""}): type: 'Normal' reason: 'ValidateLogicalSwitchSuccess' 
I1220 12:35:30.662584       7 subnet.go:2378] add common policy route for router: ovn-cluster, match ip4.dst == 192.168.130.0/24, action allow, externalID map[subnet:net-a vendor:kube-ovn]
I1220 12:35:30.674329       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.5_ip4, action reroute, extrenalID map[node:node.5 subnet:net-a vendor:kube-ovn]
I1220 12:35:30.713788       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.8_ip4, action reroute, extrenalID map[node:node.8 subnet:net-a vendor:kube-ovn]
I1220 12:35:30.724851       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.0_ip4, action reroute, extrenalID map[node:node.0 subnet:net-a vendor:kube-ovn]
I1220 12:35:30.755934       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.1_ip4, action reroute, extrenalID map[node:node.1 subnet:net-a vendor:kube-ovn]
I1220 12:35:31.028653       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.6_ip4, action reroute, extrenalID map[node:node.6 subnet:net-a vendor:kube-ovn]
I1220 12:35:31.113869       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.7_ip4, action reroute, extrenalID map[node:node. subnet:net-a vendor:kube-ovn]
I1220 12:35:31.125554       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.9_ip4, action reroute, extrenalID map[node:node.9 subnet:net-a vendor:kube-ovn]
I1220 12:35:31.136399       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.2_ip4, action reroute, extrenalID map[node:node.2 subnet:net-a vendor:kube-ovn]
I1220 12:35:31.212118       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.3_ip4, action reroute, extrenalID map[node:node.3 subnet:net-a vendor:kube-ovn]
I1220 12:35:31.223307       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.vm.master.1_ip4, action reroute, extrenalID map[node:vm-master-1 subnet:net-a vendor:kube-ovn]
I1220 12:35:43.510052       7 event.go:377] Event(v1.ObjectReference{Kind:"Subnet", Namespace:"", Name:"net-a", UID:"a0c58e11-f526-419b-b991-8a27a87d40c0", APIVersion:"kubeovn.io/v1", ResourceVersion:"7983760", FieldPath:""}): type: 'Normal' reason: 'ResetLogicalSwitchAclSuccess' 


I1220 12:35:43.574351       7 subnet.go:350] format subnet net-b, changed false
I1220 12:35:43.574396       7 vpc.go:147] handle status update for vpc ovn-cluster
W1220 12:35:43.583090       7 warnings.go:70] metadata.finalizers: "kube-ovn-controller": prefer a domain-qualified finalizer name to avoid accidental conflicts with other finalizer writers
I1220 12:35:44.611513       7 event.go:377] Event(v1.ObjectReference{Kind:"Subnet", Namespace:"", Name:"net-b", UID:"a1659119-253b-4084-87c4-2f995037968f", APIVersion:"kubeovn.io/v1", ResourceVersion:"7983854", FieldPath:""}): type: 'Normal' reason: 'ValidateLogicalSwitchSuccess' 
I1220 12:35:44.670915       7 subnet.go:2378] add common policy route for router: ovn-cluster, match ip4.dst == 192.168.246.0/24, action allow, externalID map[subnet:net-b vendor:kube-ovn]
I1220 12:35:44.681741       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.b.node.2_ip4, action reroute, extrenalID map[node:node.2 subnet:net-b vendor:kube-ovn]
I1220 12:35:44.718989       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.b.node.3_ip4, action reroute, extrenalID map[node:node.3 subnet:net-b vendor:kube-ovn]
I1220 12:35:44.752259       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.b.vm.master.1_ip4, action reroute, extrenalID map[node:vm-master-1 subnet:net-b vendor:kube-ovn]
I1220 12:35:44.762959       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.b.node.6_ip4, action reroute, extrenalID map[node:node.6 subnet:net-b vendor:kube-ovn]
I1220 12:35:44.773720       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.b.node.7_ip4, action reroute, extrenalID map[node:node. subnet:net-b vendor:kube-ovn]
I1220 12:35:44.784415       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.b.node.9_ip4, action reroute, extrenalID map[node:node.9 subnet:net-b vendor:kube-ovn]
I1220 12:35:44.830172       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.b.node.1_ip4, action reroute, extrenalID map[node:node.1 subnet:net-b vendor:kube-ovn]
I1220 12:35:44.853651       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.b.node.5_ip4, action reroute, extrenalID map[node:node.5 subnet:net-b vendor:kube-ovn]
I1220 12:35:44.864214       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.b.node.8_ip4, action reroute, extrenalID map[node:node.8 subnet:net-b vendor:kube-ovn]
I1220 12:35:44.874536       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.b.node.0_ip4, action reroute, extrenalID map[node:node.0 subnet:net-b vendor:kube-ovn]
I1220 12:35:59.670445       7 event.go:377] Event(v1.ObjectReference{Kind:"Subnet", Namespace:"", Name:"net-b", UID:"a1659119-253b-4084-87c4-2f995037968f", APIVersion:"kubeovn.io/v1", ResourceVersion:"7983854", FieldPath:""}): type: 'Normal' reason: 'ResetLogicalSwitchAclSuccess' 


I1220 12:35:59.772088       7 subnet.go:350] format subnet net-e, changed false
I1220 12:35:59.772125       7 vpc.go:147] handle status update for vpc ovn-cluster
W1220 12:35:59.782419       7 warnings.go:70] metadata.finalizers: "kube-ovn-controller": prefer a domain-qualified finalizer name to avoid accidental conflicts with other finalizer writers
I1220 12:36:00.783064       7 event.go:377] Event(v1.ObjectReference{Kind:"Subnet", Namespace:"", Name:"net-e", UID:"a6366b08-a76a-4beb-aa4f-1163813a3312", APIVersion:"kubeovn.io/v1", ResourceVersion:"7983881", FieldPath:""}): type: 'Normal' reason: 'ValidateLogicalSwitchSuccess' 
I1220 12:36:00.852294       7 subnet.go:2378] add common policy route for router: ovn-cluster, match ip4.dst == 192.168.193.0/24, action allow, externalID map[subnet:net-e vendor:kube-ovn]
I1220 12:36:00.863058       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.e.node.8_ip4, action reroute, extrenalID map[node:node.8 subnet:net-e vendor:kube-ovn]
I1220 12:36:00.874105       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.e.node.0_ip4, action reroute, extrenalID map[node:node.0 subnet:net-e vendor:kube-ovn]
I1220 12:36:00.919294       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.e.node.1_ip4, action reroute, extrenalID map[node:node.1 subnet:net-e vendor:kube-ovn]
I1220 12:36:00.954065       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.e.node.5_ip4, action reroute, extrenalID map[node:node.5 subnet:net-e vendor:kube-ovn]
I1220 12:36:00.965210       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.e.node.7_ip4, action reroute, extrenalID map[node:node. subnet:net-e vendor:kube-ovn]
I1220 12:36:01.013636       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.e.node.9_ip4, action reroute, extrenalID map[node:node.9 subnet:net-e vendor:kube-ovn]
I1220 12:36:01.024934       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.e.node.2_ip4, action reroute, extrenalID map[node:node.2 subnet:net-e vendor:kube-ovn]
I1220 12:36:01.057751       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.e.node.3_ip4, action reroute, extrenalID map[node:node.3 subnet:net-e vendor:kube-ovn]
I1220 12:36:01.068185       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.e.vm.master.1_ip4, action reroute, extrenalID map[node:vm-master-1 subnet:net-e vendor:kube-ovn]
I1220 12:36:01.113917       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.e.node.6_ip4, action reroute, extrenalID map[node:node.6 subnet:net-e vendor:kube-ovn]
I1220 12:36:14.362929       7 event.go:377] Event(v1.ObjectReference{Kind:"Subnet", Namespace:"", Name:"net-e", UID:"a6366b08-a76a-4beb-aa4f-1163813a3312", APIVersion:"kubeovn.io/v1", ResourceVersion:"7983881", FieldPath:""}): type: 'Normal' reason: 'ResetLogicalSwitchAclSuccess' 


I1220 12:36:14.432461       7 subnet.go:350] format subnet net-c, changed false
I1220 12:36:14.432530       7 vpc.go:147] handle status update for vpc ovn-cluster
W1220 12:36:14.441528       7 warnings.go:70] metadata.finalizers: "kube-ovn-controller": prefer a domain-qualified finalizer name to avoid accidental conflicts with other finalizer writers
I1220 12:36:15.443150       7 event.go:377] Event(v1.ObjectReference{Kind:"Subnet", Namespace:"", Name:"net-c", UID:"3cc359b0-2a6e-4413-ae5d-29874086afe5", APIVersion:"kubeovn.io/v1", ResourceVersion:"7983939", FieldPath:""}): type: 'Normal' reason: 'ValidateLogicalSwitchSuccess' 
I1220 12:36:15.471941       7 subnet.go:2378] add common policy route for router: ovn-cluster, match ip4.dst == 192.168.112.0/24, action allow, externalID map[subnet:net-c vendor:kube-ovn]
I1220 12:36:15.510883       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.c.node.1_ip4, action reroute, extrenalID map[node:node.1 subnet:net-c vendor:kube-ovn]
I1220 12:36:15.522615       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.c.node.5_ip4, action reroute, extrenalID map[node:node.5 subnet:net-c vendor:kube-ovn]
I1220 12:36:15.555810       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.c.node.8_ip4, action reroute, extrenalID map[node:node.8 subnet:net-c vendor:kube-ovn]
I1220 12:36:15.567646       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.c.node.0_ip4, action reroute, extrenalID map[node:node.0 subnet:net-c vendor:kube-ovn]
I1220 12:36:15.579729       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.c.node.2_ip4, action reroute, extrenalID map[node:node.2 subnet:net-c vendor:kube-ovn]
I1220 12:36:15.618937       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.c.node.3_ip4, action reroute, extrenalID map[node:node.3 subnet:net-c vendor:kube-ovn]
I1220 12:36:15.651881       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.c.vm.master.1_ip4, action reroute, extrenalID map[node:vm-master-1 subnet:net-c vendor:kube-ovn]
I1220 12:36:15.663021       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.c.node.6_ip4, action reroute, extrenalID map[node:node.6 subnet:net-c vendor:kube-ovn]
I1220 12:36:15.672286       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.c.node.7_ip4, action reroute, extrenalID map[node:node. subnet:net-c vendor:kube-ovn]
I1220 12:36:15.709975       7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.c.node.9_ip4, action reroute, extrenalID map[node:node.9 subnet:net-c vendor:kube-ovn]
I1220 12:36:29.324412       7 event.go:377] Event(v1.ObjectReference{Kind:"Subnet", Namespace:"", Name:"net-c", UID:"3cc359b0-2a6e-4413-ae5d-29874086afe5", APIVersion:"kubeovn.io/v1", ResourceVersion:"7983939", FieldPath:""}): type: 'Normal' reason: 'ResetLogicalSwitchAclSuccess' 
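
For reference, a quick way to quantify those gaps (a hypothetical log-parsing sketch, assuming klog-style timestamps like I1220 12:35:30.629704; run it as go run gaps.go < controller.log):

package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
	"time"
)

// Reads a kube-ovn-controller log from stdin and prints, per subnet, the
// delay between its ValidateLogicalSwitchSuccess and
// ResetLogicalSwitchAclSuccess events.
func main() {
	start := map[string]time.Time{}
	sc := bufio.NewScanner(os.Stdin)
	for sc.Scan() {
		line := sc.Text()
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		ts, err := time.Parse("15:04:05.000000", fields[1]) // klog time field
		if err != nil {
			continue
		}
		name := "" // subnet name from the event line, e.g. Name:"net-a"
		if i := strings.Index(line, `Name:"`); i >= 0 {
			rest := line[i+len(`Name:"`):]
			if j := strings.Index(rest, `"`); j >= 0 {
				name = rest[:j]
			}
		}
		switch {
		case strings.Contains(line, "ValidateLogicalSwitchSuccess"):
			start[name] = ts
		case strings.Contains(line, "ResetLogicalSwitchAclSuccess"):
			if s, ok := start[name]; ok {
				fmt.Printf("%s: %s\n", name, ts.Sub(s))
			}
		}
	}
}

On the log excerpt above this reports roughly 12-15 s per subnet, which is consistent with 27 minutes for 100 subnets.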

@oilbeater
Collaborator

ACLs may have the same problem as listLogicalRouterPoliciesByFilter. Can you give an example of your subnet YAML? I can try to reproduce it in my cluster.

@zsxsoft
Author

zsxsoft commented Dec 20, 2024

apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: net-a
spec:
  acls:
  - action: drop
    direction: from-lport
    match: ip.dst == 192.168.103.2
    priority: 1021
  cidrBlock: 192.168.103.0/24
  default: false
  dhcpV4Options: lease_time=3600,router=192.168.103.1,server_id=192.168.103.1,mtu=1400
  enableDHCP: true
  enableLb: true
  excludeIps:
  - 192.168.103.1
  gateway: 192.168.103.1
  gatewayNode: ""
  gatewayType: distributed
  mtu: 1400
  natOutgoing: true
  policyRoutingPriority: 3368
  policyRoutingTableID: 3368
  private: false
  protocol: IPv4
  provider: ovn
  vpc: ovn-cluster

@oilbeater
Collaborator

I am unable to reproduce this issue in the cluster I’m working on. Do you have a large number of ACL rules configured in your subnet? After reviewing the code, the only suspicious part I found in the logs is this: subnet.go, line 872.

@zsxsoft
Author

zsxsoft commented Dec 23, 2024

@oilbeater No, only one ACL is applied, but every subnet contains it. I have a lot of SecurityGroups (~300); could that be a factor?
