[DownstreamMerge] Downstream merge 2-1-22#940
openshift-merge-robot merged 108 commits into openshift:master
The two metrics:
- metricDBE2eTimestamp
- probe_interval
both make nb/sbctl calls to the OVN DBs. To ensure that only the master process makes ovsdb client connections, it makes sense to move them to the master's set of metrics.
This commit also renames the `probe_interval` metric to `northd_probe_interval` to make it a bit easier to parse.
Signed-off-by: astoycos <astoycos@redhat.com>
To make egressIPs and externalgws compatible, we re-add the SNAT to nodeIP when we delete the pod if disableSNATMultipleGWs is true. Before doing so, we need to check whether the pod still exists, because delLogicalPort is called before deletePodEgressIPAssignment. Without this check, stale SNATs are left behind.
Also enable the CI job pipeline for disableSNAT=true (this is the default from OCP 4.10).
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
Check if pod exists before re-adding SNAT
Gather process metrics from the host OVS process using the default pidfile locations. Signed-off-by: Dan Williams <dcbw@redhat.com>
ovn-kubernetes/ovn-kubernetes@1b94cbb broke Hybrid Overlay (HO), specifically the creation of the MAC binding for the HO logical port, because we never passed a valid sbClient to the HO controller. The unit tests were also broken, in the sense that no test ensured that all the needed OVN objects are actually created from scratch. These objects are:
- The HO Logical Switch Port
- The HO Logical Router Policy
- The HO MAC binding
This commit fixes the above problems, along with other HO problems found while fixing the tests.
Signed-off-by: astoycos <astoycos@redhat.com>
Fix Hybrid Overlay
Corrects formatting error in ovsargs. Fixes: openshift#2726 Signed-off-by: Hareesh Puthalath <hareeshp@nvidia.com>
Fix log message for failed commands in pokeEndpointHostname. Signed-off-by: Andreas Karis <ak.karis@gmail.com>
CI: Fix log message for failed commands in pokeEndpointHostname
When traffic directed to a service of type LoadBalancer reaches the nodes, it is not redirected to the service's cluster IPs. This redirection is implemented for externalIP and NodePort services, but not for LoadBalancer services. All the other logic is in place. This change enables integration with MetalLB when the traffic reaches the node from an interface other than breth0.
Also adjust the unit tests related to LoadBalancer services to look only at LoadBalancer IPs. In that scenario, externalIPs are not set.
Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
Use absolute paths instead of relative paths so that kind.sh can be run from any directory. Signed-off-by: Andreas Karis <ak.karis@gmail.com>
Create iptables NAT rules also for loadbalancer services
Signed-off-by: Girish Moodalbail <gmoodalbail@nvidia.com>
Signed-off-by: Girish Moodalbail <gmoodalbail@nvidia.com>
The current regex does not correctly match log messages that have a single quote in the first word, for example:
- can't
- couldn't
This commit does not fix the regex yet, since the fix was not straightforward.
Signed-off-by: Girish Moodalbail <gmoodalbail@nvidia.com>
Delete and wait for namespaces inside Context.AfterEach with f.DeleteNamespace(f.Namespace.Name). This makes sure that we actually wait for the namespaces to be deleted and prevents host network pods from hogging ports for too long.
Signed-off-by: Andreas Karis <ak.karis@gmail.com>
CI: Wait on namespace deletion for host networked test pods
Rename ensure election timeout unit test to TestEnsureElectionTimeout and refactor it in preparation for new unit tests to come for OvnDbManager. Signed-off-by: Andreas Karis <ak.karis@gmail.com>
In preparation for new test cases, make all OvnDbManager methods mockable. Signed-off-by: Andreas Karis <ak.karis@gmail.com>
Implement unit tests for ensureLocalRaftServerID, ensureClusterRaftMembership, resetRaftDB Signed-off-by: Andreas Karis <ak.karis@gmail.com>
ovndbmanager: Implement unit tests for missing functions
kind.sh: Use absolute paths instead of relative paths
Move nb/sbctl metrics to master
On every pod add we assemble a new slice with all of the gatewayInfos. This slice had no preallocated capacity, so appends repeatedly reallocated and copied the underlying array. Allocate at least some predictable capacity up front to avoid this.
Signed-off-by: Tim Rozet <trozet@redhat.com>
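The preallocation idea can be sketched as follows. This is a minimal illustration and not the code from this commit: the `gatewayInfo` struct and the `collectGateways` helper are hypothetical stand-ins, and the only real API used is Go's built-in `make` and `append`.

```go
package main

import "fmt"

// gatewayInfo is a hypothetical stand-in for the real per-gateway record.
type gatewayInfo struct {
	ips []string
}

// collectGateways (hypothetical helper) merges several gateway lists into one
// slice. It reserves the full capacity up front, so the appends below never
// need to reallocate and copy the backing array.
func collectGateways(sources ...[]gatewayInfo) []gatewayInfo {
	total := 0
	for _, s := range sources {
		total += len(s)
	}
	out := make([]gatewayInfo, 0, total)
	for _, s := range sources {
		out = append(out, s...)
	}
	return out
}

func main() {
	a := []gatewayInfo{{ips: []string{"172.0.1.1"}}}
	b := []gatewayInfo{{ips: []string{"172.0.1.2"}}, {ips: []string{"172.0.1.3"}}}
	all := collectGateways(a, b)
	fmt.Println(len(all), cap(all))
}
```

When the exact total is not known in advance, a reasonable lower bound passed to `make` should still save the first few reallocations.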
Adds some checking to ensure user-provided IPs are correct, as well as to detect any cache issues. Changes include:
- Ensure that the exgw namespace annotation does not contain duplicate IPs
- Ensure on exgw pod addition that there is not already another pod with the same IP
- If the exgw pod cache becomes corrupt with duplicate IPs, emit a warning during pod add
Signed-off-by: Tim Rozet <trozet@redhat.com>
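A duplicate-IP check of the kind listed above might look like the sketch below. It is illustrative only: `findDuplicateIP` is a hypothetical name, not a function from this PR, and the input is assumed to be a plain list of IP strings such as one parsed from an annotation.

```go
package main

import (
	"fmt"
	"net"
)

// findDuplicateIP (hypothetical helper) returns the first IP that appears more
// than once in raw, or "" if all IPs are unique. IPs are compared in parsed,
// canonical form, so two spellings of the same IPv6 address still collide.
func findDuplicateIP(raw []string) (string, error) {
	seen := make(map[string]struct{}, len(raw))
	for _, s := range raw {
		ip := net.ParseIP(s)
		if ip == nil {
			return "", fmt.Errorf("invalid IP %q", s)
		}
		key := ip.String()
		if _, ok := seen[key]; ok {
			return key, nil
		}
		seen[key] = struct{}{}
	}
	return "", nil
}

func main() {
	dup, err := findDuplicateIP([]string{"172.0.1.1", "172.0.1.2", "172.0.1.1"})
	fmt.Println(dup, err)
}
```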
When pods are added to the cache as exgws for a namespace, only the pod's name is used as the key. This breaks a scenario where 2 pods with the same name are serving as exgws for the same namespace. Consider this example:
1. app pod is created in ns foo
2. exgwAPod is created in ns exgw1 (172.0.1.1), serving ns foo
3. exgwAPod is created in ns exgw2 (172.0.1.2), serving ns foo
In the above example, the app pod will only have one ECMP route, for 172.0.1.2, because the cache is keyed only on pod name.
Signed-off-by: Tim Rozet <trozet@redhat.com>
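The collision in this example can be reproduced with two small maps. The sketch below is an assumption-laden illustration: the `exgwPod` and `podKey` types and both helper functions are hypothetical, and keying on namespace plus name is shown as one possible way to avoid the collision, not necessarily the change made in this commit.

```go
package main

import "fmt"

// exgwPod is a hypothetical record for a pod acting as an external gateway.
type exgwPod struct {
	namespace, name, ip string
}

// podKey is a hypothetical composite key of namespace and pod name.
type podKey struct {
	namespace, name string
}

// cacheByName keys only on the pod name, so same-named pods in different
// namespaces overwrite each other.
func cacheByName(pods []exgwPod) map[string]string {
	cache := make(map[string]string, len(pods))
	for _, p := range pods {
		cache[p.name] = p.ip
	}
	return cache
}

// cacheByNamespacedName keys on namespace and name, so both pods are kept.
func cacheByNamespacedName(pods []exgwPod) map[podKey]string {
	cache := make(map[podKey]string, len(pods))
	for _, p := range pods {
		cache[podKey{p.namespace, p.name}] = p.ip
	}
	return cache
}

func main() {
	pods := []exgwPod{
		{"exgw1", "exgwAPod", "172.0.1.1"},
		{"exgw2", "exgwAPod", "172.0.1.2"},
	}
	fmt.Println(len(cacheByName(pods)), len(cacheByNamespacedName(pods)))
}
```

With the name-only key, only the 172.0.1.2 entry survives, which matches the single ECMP route described in the commit message.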
Host -> svc (ETP=local) backed by OVN pods does not work in SGW because we add the DNAT rule that converts the NP to CIP before it hits the LB on the GR.
Current flow, which is wrong:
1) traffic from host gets DNAT-ed to clusterIP svc using iptables
2) traffic sent to br-ex
3) hits the GR load balancer
4) gets DNAT-ed to the backend pods
5) depending on whether the pod is on the same node or a different one, the packet is either delivered to the pod locally, or it passes via geneve tunnel to the destination node where the pod lives.
Technically, if the backends are not local to the node, the request should get rejected.
This PR removes the iptables DNAT rule towards CIP if the traffic is of ETP=local type. Instead it adds the DNAT rule towards masqueradeIP, which we already do for LGW mode. With that, the packet is sent from the host into OVN via mp0 and hits the node-local-switch LB, preserving sourceIP.
New flow:
1) NP/EIP/LIP traffic from host hits the PRE-ROUTING chain, where it gets DNAT-ed to masqueradeIP:NP.
2) Then it hits OVN-KUBE-SNAT-MGMTPORT, where its src-IP is preserved.
3) enters OVN via mp0, hits the load balancer on the switch
4) gets DNAT-ed to the backend pods if they exist locally on the node, else gets rejected.
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
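The rule selection described in this commit message can be summarized in a few lines. This is only a sketch of the decision, not the iptables code from the PR: the `svcInfo` struct, the `dnatTarget` function, and all addresses and ports in `main` are hypothetical placeholders.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// svcInfo is a hypothetical, simplified view of a NodePort-style service.
type svcInfo struct {
	clusterIP string
	port      int
	nodePort  int
	etpLocal  bool // true when externalTrafficPolicy is Local
}

// dnatTarget (hypothetical helper) picks the DNAT destination for host traffic:
// ETP=local services go to masqueradeIP:nodePort so that the packet enters OVN
// via the management port; all other services go to clusterIP:port.
func dnatTarget(s svcInfo, masqueradeIP string) string {
	if s.etpLocal {
		return net.JoinHostPort(masqueradeIP, strconv.Itoa(s.nodePort))
	}
	return net.JoinHostPort(s.clusterIP, strconv.Itoa(s.port))
}

func main() {
	// Placeholder values, chosen only for illustration.
	svc := svcInfo{clusterIP: "10.96.0.10", port: 80, nodePort: 30080, etpLocal: true}
	fmt.Println(dnatTarget(svc, "169.254.0.1"))
}
```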
Multiple ExGW cache validation/improvements
Signed-off-by: Flavio Fernandes <flaviof@redhat.com>
Drop defer statement to make function reserveJoinLRPIPs more readable. Signed-off-by: Andreas Karis <ak.karis@gmail.com>
/retest-required Please review the full test history for this PR and help us cut down flakes.
@trozet: The following test failed, say
/label qe-approved
Passes unit tests