[wip] June 1st downstream merge part 2#1693
[wip] June 1st downstream merge part 2#1693tssurya wants to merge 59 commits intoopenshift:masterfrom
Conversation
which mean reset, not ignore. When acls are updated on initial sync (by handling add events) if namespace log levels were reset, acls should also be updated with nil Severity. Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
intention from implementation. Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
Add new BuildACLFromDbIDs function, that should be the only one used in the end of acls indexes update. Add 2 owner Types for global and namespaced multicast resources. Generate ACL name based on dbIDs, don't include owner information to save some symbols. Add (objectIDs *DbObjectIDs) StringNoOwner() for this purpose. Add cleanup function for disabled multicast support and namespaced multicast cleanup. Move noneMatch to the gress_policy.go, where it is used. Add tests for multicast acl sync and cleanup. Rename address_set_syncer package to external_ids_syncer, add acl syncer sub-package. Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
1. Use dbIDs with new NetpolNodeOwnerType 2. Add aclSync for allow from node ACL 3. Rework AllowFromNode ACL unit tests to run everything for ipv4 and ipv6 Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
Move gress-related constants from policy.go to gress_policy.go. Use old externalIDs for syncs only. policy test "correctly retries recreating a network policy with the same name" last expected data was updated, since newly generated ACLs Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
Add NetpolNamespaceOwnerType for namespace-wide default deny acls. Add acl_sync PrimaryID check to make sure in case more than 1 ACLs have the same primaryID, only of them will be updated. This also eliminates the need for "deleting a network policy that failed half-way through creation succeeds" test, which is based on a "found multiple results for provided predicate" condition. Replace tests "stale ACLs should be cleaned up or updated at startup via syncNetworkPolicies" and "ACLs with long names and run syncNetworkPolicies" with the new "reconciles an existing networkPolicy updating stale ACLs with long names", and a part that deletes stale ARP ACLs is now updated and tested as a part of acl_sync. Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
Add EgressFirewallOwnerType for egress firewall ACLs. Update syncEgressFirewall to use passed egressFirewalls objects for cleanup, create updateEgressFirewallACLsDbIndex function that updates old formatted ACLs. Update egressfirewall_test.go, move a part of setup to BeforeEach. Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
1. rename BuildACLFromDbIndex to BuildACL, since it is the only method that should be used now. 2. Rename getACLMatchFromACLDir to getACLMatch, remove stale version. 3. update isEquivalentACL to checkACLPrimaryID, that matches on primaryID until we get client indexes. 4. Add acls.md doc explaining all acls that are used by ovn-k and their dependencies with examples 5. Update multicast docs Add missing links to README.md Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
Now Egress firewall ACLs are named "EF:<namespace>:<priority>" default deny netpol ACLs "NP:<namespace>:<direction>" gress ACLs "NP:<policyNamespace>:<policyName>:<direction>:<gressIdx>" Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
The current util.NextIP() function returns the wrong IP for IPv6 address with leading zeros. We can directly use the NextIP() function in vendor containernetworking/plugins instead. Signed-off-by: Yun Zhou <yunz@nvidia.com>
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
Acls new indexes
Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
…once Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
…t that. Spotted in upstream ovn-org/ovn CI when running against ovn versions <=22.09 which don't support component templates: https://github.com/ovn-org/ovn/actions/runs/4628617882 Reported error: failed to sync chassis: error: failed to get template var list: error: Wrong parameter type (*nbdb.ChassisTemplateVar): Model not found in Database Model Fixes: 4b3475a ("services: Use OVN template load balancers for NodePort services.") Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Asserting on `netexec` response to not containing errors does not allow using retries, as the test would fail during the first attempt. Signed-off-by: Andrea Panattoni <apanatto@redhat.com>
services: Don't try to list/cleanup templates when OVN doesn't suppor…
Minor performance fixes
Use PrimaryID as a client index for ACL.
For unknown reasons, during the kube rebase, we just decided to flip from sorting the list of endpoints to not sorting it? : 3850450#diff-d3aa58d9b58a0a09264f072df46ab01d0501eb508c4656411ae2dc1ac68fb3c4L579 The allocateLBNodePorts feature depends on the list being sorted, else our logic is skewed and the random probability model we use to generate iptables gets skewed up as well. So during endpoints update and such we will behave badly and have left over DNATs towards pods that don't exist anymore which will lead to black-holing the traffic. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
Fix the formatting of the log messages (%w is only valid when creating new errors, not when logging) and update the master test suite name since it hasn't been only gateway init operations for a long time. Signed-off-by: Dan Williams <dcbw@redhat.com>
the fakeAddresset code was originally created so that during testing we could directly access and figure out what ip addresses where added to addressSets because we could not access an ovn northbound database. Since the introduction of libovsdb we can use the fake database that is provided for us instread of the the fakeAddressSetFactory and fakeAddressSet Signed-off-by: Jacob Tanenbaum <jtanenba@redhat.com>
Fix e2e etp=local flake happening in LGW mode lanes
The compact mode means running ovnkube master and node in the same process. The OVN-K runs in this configuration in Microshift. Signed-off-by: Peng Liu <pliu@redhat.com>
Use a single node cluster with compact-mode enabled to mimic the ovnkube setup in Microshift. Signed-off-by: Peng Liu <pliu@redhat.com>
Signed-off-by: Peng Liu <pliu@redhat.com>
Signed-off-by: Peng Liu <pliu@redhat.com>
NextIP returns the wrong IP for IPv6 address with leading zeros
…tion Remove the no-longer-used node host subnet annotation to allow the default network controller's node code to proceed. Also redo how the default network controller counts host subnets to make sure we actually have the right number of each IP family, rather than just counting numbers. Because the allocation handling was split out into the ClusterManager, the network controller's addNode() no longer has the guarantees of a correct set of annotations before it runs, as it and the cluster manager run in parallel. Signed-off-by: Dan Williams <dcbw@redhat.com>
Users can specify IPV4 and IPv6 gateway nexthop for dual-stack cluster with '--gateway-nexthop=<ipv4_addr>,<ipv6_addr>' Signed-off-by: Peng Liu <pliu@redhat.com>
Signed-off-by: Andrea Panattoni <apanatto@redhat.com>
When a node gets a new IP addresses, node loadbalancers should be updated to serve NodePort services on the new address. Update node_tracker.go to track both `host-addresses` and `l3-gateway-config` IP addresses, as they are used in different ways for building load balancers. Add End2End test to control-plane suite. Update load balancer unit tests to include a multi address node. Signed-off-by: Andrea Panattoni <apanatto@redhat.com>
During the kube 1.26 bump the sets[type].List() call was deprecated and replaced with unsortedList(). This could cause some flakes as we depend on the order of the set in some cases. This fix updates the unsortedList calls to use the sets List function Signed-off-by: Ben Pickard <bpickard@redhat.com> s s
clustermanager: update node annotations on dual->single stack conversion
Make --gateway-nexthop support dual-stack
Revert predicate search optimization 3369d38 as predicate search should not bee needed anymore Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
Use loadbalancer.Name as client index
Convert unsortedList() to sets.List()
Start to remove the fakeAddressSet
If the interface specified with '--gateway-interface' is already a bridge port, use the bridge name as the gateway interface. Signed-off-by: Peng Liu <pliu@redhat.com>
Scalable Functions (SFs) are "lightweight" Virtual Functions (VFs), meaning they are netdevices that implement representor model and can be used as container network interfaces. Unlike of VFs, SFs are Auxiliary devices and located on the respective system bus. Change the current code base to also accomodate the use of SF Auxiliary devices instead of only VF PCI devices for containers. Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Signed-off-by: Alin-Gabriel Serdean <aserdean@ovn.org> Signed-off-by: Hareesh Puthalath <hareeshp@nvidia.com>
For NodePort Services with ExternalTrafficPolicy = Local, incoming connections shouldn't be DNATted to masqueradeIP. Signed-off-by: Andrea Panattoni <apanatto@redhat.com>
Serve NodePort services on secondary IP addresses
Detect bridge name when 'gateway-interface' is specified
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Temporary metallb fix
…nto june-1st-downstream-merge-part-2 Conflicts: go-controller/pkg/libovsdb/libovsdb.go
In CI we are observing:
E0531 19:52:52.309098 1 obj_retry.go:627] Failed to update *v1.Node, old=ip-10-0-133-36.ec2.internal, new=ip-10-0-133-36.ec2.internal, error: error creating gateway for node ip-10-0-133-36.ec2.internal: failed to init shared interface gateway: failed to sync stale SNATs on node ip-10-0-133-36.ec2.internal: unable to fetch podIPs for pod openshift-multus/network-metrics-daemon-cr75v: pod openshift-multus/network-metrics-daemon-cr75v: no pod IPs found
I0531 19:52:52.309185 1 event.go:285] Event(v1.ObjectReference{Kind:"Node", Namespace:"", Name:"ip-10-0-133-36.ec2.internal", UID:"65797716-b2a8-43fc-a216-8154fecee781", APIVersion:"v1", ResourceVersion:"203498", FieldPath:""}): type: 'Warning' reason: 'ErrorUpdatingResource' error creating gateway for node ip-10-0-133-36.ec2.internal: failed to init shared interface gateway: failed to sync stale SNATs on node ip-10-0-133-36.ec2.internal: unable to fetch podIPs for pod openshift-multus/network-metrics-daemon-cr75v: pod openshift-multus/network-metrics-daemon-cr75v: no pod IPs found
which is happening because pod is scheduled but IP allocation
has not happened and since this sync is called from watch nodes,
node add does not succeed, so we have a chicken-egg problem
where watch pods can't start.
In reality we don't care about pods that don't have
an IP because there is nothing to cleanup for that pod in
that case.
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
(cherry picked from commit 91e8a3b)
(cherry picked from commit fe63418)
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: tssurya The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com> (cherry picked from commit d556bde)
|
/test e2e-aws-ovn-shared-to-local-gateway-mode-migration |
|
/test e2e-aws-ovn-windows |
|
@tssurya: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
closed in favour of #1692 |
- What this PR does and why is it needed
- Special notes for reviewers
- How to verify it
- Description for the changelog