OCPBUGS-55962: Inter advertised UDN isolation configurable #2569
pperiyasamy wants to merge 2 commits into openshift:master
Conversation
Force-pushed from b8124a9 to 2f9dcd9.
Force-pushed from b8fad75 to b09dee2.
@pperiyasamy: This pull request references Jira Issue OCPBUGS-55962, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug.
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
kyrtapz left a comment:
Looks good overall.
I think you are missing a check in `configureAdvertisedNetworkIsolation` as well.
```go
// UDNLooseIsolation allows communication between two advertised UDN networks.
UDNLooseIsolation string = "loose"
// UDNSecureIsolation drops communication between two advertised UDN networks.
UDNSecureIsolation string = "secure"
```
NIT: I would remove this variable since it is not used anywhere.
```go
	return "", ""
}

// IsLooseUDNIsolation returns true if two UDN networks are not configured to be
```
I would mention that this regards pod-to-pod isolation on advertised networks. The host->udn isolation is still in place with this PR.
updated the comment.
Force-pushed from b09dee2 to 45d72b1.
ok @kyrtapz, thought it would be no harm having those address sets lying there. but anyway added check now for
/retest

/lgtm
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: kyrtapz, pperiyasamy
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files. Approvers can indicate their approval by writing /approve in a comment.
```go
if util.IsLooseUDNIsolation() {
	klog.Infof("Skip creating global advertised networks addressset in loose UDN isolation mode")
	return nil
}
```
Shouldn't we actually ensure the address set is not there if we start with loose UDN isolation?
My assumption was that moving ovn-k to loose mode would only be "supported" when there is no advertised networks. If that is not the case I agree we need to add the necessary cleanup.
yes, that was my assumption as well, AFAIU this is a stopgap arrangement for testing loose UDN isolation mode in a lab. This commit will be removed once it's implemented in upstream with a proper UDN isolation specific API.
If this is an assumption, can we pass it by @trozet in case he is aware of something else?
So toggling this knob won't do anything for networks that were already advertised. It would only affect networks that became advertised after.
I think my current opinion is that we should do this upstream with a global config flag to enable/disable "RoutedUDNIsolation", default enabled. In the future we might make this a per RA thing, where it might be "UDNIsolation": default strict, maybe another mode is "AllowExternalRouting".
I think the code should sync the current state if the flag is flipped on or off. So if someone toggles the flag on an existing cluster, it should remove the address set, ACL, whatever that was configured to enforce isolation.
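The sync-on-toggle behavior proposed above could be sketched roughly as follows. All names here are hypothetical stand-ins, not the actual ovn-kubernetes reconcile code; in reality the enforcement objects are OVN address sets and tier-0 ACLs rather than booleans:

```go
package main

import "fmt"

// isolationState stands in for the OVN constructs (global advertised-networks
// address set, tier-0 drop ACLs) that enforce isolation between advertised
// UDN networks.
type isolationState struct {
	addressSetExists bool
	dropACLExists    bool
}

// syncRoutedUDNIsolation reconciles the cluster toward the configured mode.
// With isolation enabled it (re)creates the enforcement constructs; with it
// disabled it removes any that a previous run left behind, so flipping the
// flag on an existing cluster leaves no stale state.
func syncRoutedUDNIsolation(s *isolationState, isolationEnabled bool) {
	if isolationEnabled {
		s.addressSetExists = true
		s.dropACLExists = true
		return
	}
	// Flag flipped off on a live cluster: clean up what was configured.
	s.addressSetExists = false
	s.dropACLExists = false
}

func main() {
	s := &isolationState{addressSetExists: true, dropACLExists: true}
	syncRoutedUDNIsolation(s, false) // operator disables RoutedUDNIsolation
	fmt.Println(s.addressSetExists, s.dropACLExists)
}
```

The key property is that the sync is idempotent in both directions, matching the suggestion that toggling the flag should also clean up state that was created while isolation was enabled.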
@trozet @jcaamano agree to your thoughts, here is upstream PR as suggested above: ovn-kubernetes/ovn-kubernetes#5276. PTAL.
I have two concerns with the upstream approach:
- Lack of time: we are going to start backporting to 4.19 as soon as we lift the feature gate in 4.20, which hopefully is any day now, and I am not sure we can start doing it without this in place. Unless we change plans, of course.
- Changing the behavior of the config flag down the line: the isolation we really want to have configurable is the one for traffic that egresses the cluster. This current implementation is a big hammer that just isolates all traffic regardless of whether it egresses the cluster or not. When we get the implementation right, we would change the behavior of the config flag. That can be confusing in general.
So the idea was doing a downstream temporary implementation while we get the upstream implementation right.
```go
if util.IsLooseUDNIsolation() {
	klog.Infof("The network %s is configured with loose isolation mode, skip deleting tier-0 drop ACL rule",
		bnc.GetNetworkName())
	return nil
}
```
Shouldn't we actually ensure the ACL is not there if we start with loose UDN isolation?
By default, ovn-k isolates advertised UDN networks from each other, but there is a requirement to disable isolation so that BGP routing functionality can be tested between different UDN networks. Hence this commit consumes the UDN_ISOLATION_MODE env variable, and the isolation mode can be determined accordingly. By default it uses secure mode to isolate the networks, and it can be overridden by CNO via a config map. Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
The loose UDN isolation option may be rolled out later, when there are already existing BGP-advertised networks in place, so this commit cleans up the associated tier-0 pass and drop ACLs belonging to existing networks and also deletes the global advertised networks address set. Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
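The cleanup described in this commit (per-network tier-0 ACLs, then the global advertised-networks address set) could be outlined like this. This is an illustrative sketch with hypothetical names and string placeholders for OVN objects, not the real implementation:

```go
package main

import "fmt"

// cleanupAdvertisedNetworkIsolation removes the tier-0 pass and drop ACLs of
// each already-advertised network, then the global advertised-networks address
// set, so that enabling loose isolation on a live cluster leaves no stale
// enforcement state behind. It returns the identifiers it removed, in order.
func cleanupAdvertisedNetworkIsolation(advertisedNetworks []string) []string {
	var removed []string
	for _, network := range advertisedNetworks {
		// Per-network ACLs first; they reference the address set.
		removed = append(removed, "pass-acl/"+network, "drop-acl/"+network)
	}
	// Global address set last, once nothing references it anymore.
	removed = append(removed, "addrset/advertised-networks")
	return removed
}

func main() {
	fmt.Println(cleanupAdvertisedNetworkIsolation([]string{"udn-red", "udn-blue"}))
}
```

Removing the referencing ACLs before the address set they point at mirrors the usual dependency order when tearing down OVN objects.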
Force-pushed from 45d72b1 to a10b7b3.
New changes are detected. LGTM label has been removed.
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle stale. If this issue is safe to close now please do so with /close.

/lifecycle stale
PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle rotten. If this issue is safe to close now please do so with /close.

/lifecycle rotten
@pperiyasamy: The following tests failed, say /retest to rerun all failed tests:
Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Rotten issues close after 30d of inactivity. Reopen the issue by commenting /reopen.

/close
@openshift-bot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
No description provided.