Skip to content

Comments

Bug 2092579: [DownstreamMerge] 6-1-22#1121

Closed
flavio-fernandes wants to merge 34 commits intoopenshift:masterfrom
flavio-fernandes:merge-6-1-22
Closed

Bug 2092579: [DownstreamMerge] 6-1-22#1121
flavio-fernandes wants to merge 34 commits intoopenshift:masterfrom
flavio-fernandes:merge-6-1-22

Conversation

@flavio-fernandes
Copy link
Contributor

jcaamano and others added 30 commits May 3, 2022 15:05
Transactions is an all or nothing thing so we just need one wait per
transaction to check against duplicates.

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Running the LB predicate that matches on name takes a long time if the
LB table has many LBs. For example, looking up ~40 LBs in a table with
~200k rows took aproximately 3s.

The service controller has a second level cache and knows which LBs need
to be added and which need to be updated. Avoid this lookup for LBs that
are to be added.

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
We've seen instances where the migration initially failed because the db
wasn't ready yet. This change adds some retrying to the migration to
handle that situation more gracefully and efficient, as failing this a
couple of times currently results in getting the pod into crashlooop
backoff.

A sample transient error for initial failures looks like e.G. this:
```
F0511 04:31:57.928725       1 ovndbmanager.go:44] NBDB Upgrade failed: %!w(*fmt.wrapError=&{failed to get schema version for NBDB, stderr: "ovsdb-client: transaction returned error: {\"details\":\"get_schema request specifies database OVN_Northbound which is not yet available because it has not completed joining its cluster\",\"error\":\"database not available\"}\n", error: OVN command '/usr/bin/ovsdb-client -t 10 get-schema-version unix:/var/run/ovn/ovnnb_db.sock OVN_Northbound' failed: exit status 1 0xc00042a160})
```

Signed-off-by: Alvaro Aleman <alvaroaleman@users.noreply.github.com>
Update libovsdb to commit:
  50ec17900991efba29734b2d11afe3dfc9ee02e0

Signed-off-by: Andreas Karis <ak.karis@gmail.com>
Extend ACL logging to EgressFirewall objects.

Signed-off-by: Andreas Karis <ak.karis@gmail.com>
`exportloopref` golangci linter finds pointers exported from within
for-range loop, see https://github.com/kyoh86/exportloopref for details.

The linter does expose 2 violations, the one in `pkg/ovn/pods.go` is a
real bug.

Signed-off-by: xqu <xqu@nvidia.com>
Concurrent create transactions can result in the
cache being stale for a very short period.
Adding a wait for logical router policies create
transactions should prevent duplicate policies in these cases
where the predicate does not find the policy in time.

Signed-off-by: Ori Braunshtein <obraunsh@redhat.com>
Add wait for logical router policies
…cl-3

EgressFirewall: Enable OVN-Kubernetes logging for egress firewall
Signed-off-by: Yun Zhou <yunz@nvidia.com>
This commit ensures node name is added into EgressIPStatusItem so that
executeCloudPrivateIPConfigOps removes the egress ip from cloud config
store even though assignment didn't actually happen on the node.

Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
addLogicalPort() looks up the LSP early and then looks it up again
if the pod had no IP/MAC annotations. This is kinda pointless and
just costs time and lock contention in libovsdb since if the port
was found early we can just use that one to get the MAC/IPs.

Since the only other user of GetPortAddresses() was the hybrid
overlay code move the port lookup into the hybrid overlay code
and just keep the address extraction code common.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Add node name into egress ip status for the removal
Direction of the ACLs built for multicast egress policy was always
"from-lport", so we cannot determine if we need to update egress ACL
direction based on the first egress ACL that found.

Signed-off-by: Yun Zhou <yunz@nvidia.com>
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
…into-pod-creation

Perf: Combine perPodSNAT add and LSP add into same transact
Remove time.Tick usages since "without a way to shut it down the
underlying Ticker cannot be recovered by the garbage collector; it "leaks"."
[https://pkg.go.dev/time#Tick]

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
1. change time.After() usage in loops to once-created NewTicker.
time.After() will create a new timer instance on every iteration and is
not efficient: "The underlying Timer is not recovered by the garbage
collector until the timer fires. If efficiency is a concern,
use NewTimer instead and call Timer.Stop if the timer is no longer
needed." [https://pkg.go.dev/time#After]
2. make sure to Stop() all tickers to release associated resources.
[https://pkg.go.dev/time#NewTicker]

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
KIND 0.13.0 or later is required for k8s 1.24.0 bump, as explained in https://github.com/kubernetes-sigs/kind/releases. Taking latest stable KIND release: 0.14.0

Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
Bump KIND version to 0.14.0 to prepare for k8s 1.24.0 bump
update all egress ACLs' direction to "from-lport"
Signed-off-by: Zenghui Shi <zshi@redhat.com>
Fix lflow-cache-limit-kb ovs external-id
Using ovs-dpctl can cause severe issues when there is a version mismatch between ovs-dpctl and ovs-vswitchd.
To address that use ovs-appctl dpctl/* since it provides the same data that is in the same format.

Signed-off-by: Patryk Diak <pdiak@redhat.com>
dcbw and others added 3 commits June 1, 2022 09:56
Use ovs-appctl dpctl/* instead of ovs-dpctl
Recent retry logic was a little too harsh on deletion when the port
is not present in the annotation.

Closes: openshift#2973

Signed-off-by: Flavio Fernandes <flaviof@redhat.com>
pods: deleteLogicalPort should not fail if port is already deleted
@openshift-ci openshift-ci bot added bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Jun 1, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 1, 2022

@flavio-fernandes: This pull request references Bugzilla bug 2092579, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.11.0) matches configured target release for branch (4.11.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @anuragthehatter

Details

In response to this:

Bug 2092579: [DownstreamMerge] 6-1-22

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 1, 2022

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: flavio-fernandes
To complete the pull request process, please assign jacobtanenbaum after the PR has been reviewed.
You can assign the PR to them by writing /assign @jacobtanenbaum in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Signed-off-by: Flavio Fernandes <flaviof@redhat.com>
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 2, 2022

@flavio-fernandes: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-openstack-ovn 44f4bb6 link false /test e2e-openstack-ovn
ci/prow/e2e-metal-ipi-ovn-ipv6 44f4bb6 link true /test e2e-metal-ipi-ovn-ipv6
ci/prow/e2e-vsphere-windows 44f4bb6 link false /test e2e-vsphere-windows
ci/prow/e2e-aws-ovn-upgrade 44f4bb6 link true /test e2e-aws-ovn-upgrade
ci/prow/okd-e2e-gcp-ovn 44f4bb6 link false /test okd-e2e-gcp-ovn
ci/prow/e2e-vsphere-ovn 44f4bb6 link false /test e2e-vsphere-ovn
ci/prow/e2e-metal-ipi-ovn-dualstack 44f4bb6 link true /test e2e-metal-ipi-ovn-dualstack
ci/prow/e2e-aws-ovn-windows 44f4bb6 link true /test e2e-aws-ovn-windows

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@andreaskaris
Copy link
Contributor

I thought that after FF, we only did downstream cherry-picks? I've got a 4.12 feature in there (the egressfw logging)

Release Cycle Timeline
OpenShift SDN’s release cycle operates with a feature freeze window. During initial release cycle development, all changes are pulled downstream via a git merge of upstream. This includes features as well as fixes. Post feature freeze, only fixes will be pulled downstream. This is done via git cherry-picking rather than merge.

https://docs.google.com/document/d/11Ax17QT0FB_OMV9_ifwWX3ppoLlc3OujjAikZP3J1v4/edit#heading=h.jp6iz3u39y1l

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 2, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 2, 2022

@flavio-fernandes: PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@flavio-fernandes
Copy link
Contributor Author

Please ignore. We are not doing any downstream merges until 4.12 branch is created.
Closing.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 2, 2022

@flavio-fernandes: This pull request references Bugzilla bug 2092579. The bug has been updated to no longer refer to the pull request using the external bug tracker.

Details

In response to this:

Bug 2092579: [DownstreamMerge] 6-1-22

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@flavio-fernandes flavio-fernandes deleted the merge-6-1-22 branch June 2, 2022 14:24
@flavio-fernandes
Copy link
Contributor Author

I thought that after FF, we only did downstream cherry-picks? I've got a 4.12 feature in there (the egressfw logging)

Release Cycle Timeline
OpenShift SDN’s release cycle operates with a feature freeze window. During initial release cycle development, all changes are pulled downstream via a git merge of upstream. This includes features as well as fixes. Post feature freeze, only fixes will be pulled downstream. This is done via git cherry-picking rather than merge.

https://docs.google.com/document/d/11Ax17QT0FB_OMV9_ifwWX3ppoLlc3OujjAikZP3J1v4/edit#heading=h.jp6iz3u39y1l

yup; my bad! 🤦

danwinship pushed a commit to danwinship/ovn-kubernetes-1 that referenced this pull request Jun 6, 2022
…cceeded

Scalability: Delete logical ports for completed pods
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.

Projects

None yet

Development

Successfully merging this pull request may close these issues.