OCPBUGS-11453: [release-4.10] Batch potentially big transaction on egress firewall ACLs migration#1641

Merged
openshift-merge-robot merged 3 commits into openshift:release-4.10 from npinaeva:ocpbugs-11453 on Apr 20, 2023

Conversation

@npinaeva
Contributor

Backport of #1629
Had to make some changes, this one requires a proper review

@openshift-ci
Contributor

openshift-ci bot commented Apr 14, 2023

@npinaeva: No Bugzilla bug is referenced in the title of this pull request.
To reference a bug, add 'Bug XXX:' to the title of this pull request and request another bug refresh with /bugzilla refresh.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Apr 14, 2023
@openshift-ci-robot
Contributor

@npinaeva: This pull request references Jira Issue OCPBUGS-11453, which is valid. The bug has been moved to the POST state.

6 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.10.z) matches configured target version for branch (4.10.z)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)
  • dependent bug Jira Issue OCPBUGS-11110 is in the state Verified, which is one of the valid states (VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE))
  • dependent Jira Issue OCPBUGS-11110 targets the "4.11.z" version, which is one of the valid target versions: 4.11.0, 4.11.z
  • bug has dependents

Requesting review from QA contact:
/cc @huiran0826

The bug has been updated to refer to the pull request using the external bug tracker.

@npinaeva
Contributor Author

/retest-required

@jcaamano
Contributor

@npinaeva any reason to skip the batch test?

@npinaeva
Contributor Author

@npinaeva any reason to skip the batch test?

Because the tests were written for int, and now Batch only works for ACLs, so it would require rewriting the whole test. So I thought I could cheat and say we know the function works based on the tests in the previous versions :P

@jcaamano
Contributor

The batching & tests could be []interface{} based instead and retain some of its generic nature?
That might be interesting as we might end up backporting other batching.

@npinaeva
Contributor Author

The batching & tests could be []interface{} based instead and retain some of its generic nature? That might be interesting as we might end up backporting other batching.

Yeah I considered this option, but that would require copying the slice twice to convert types back and forth, so I thought we would just copy the Batch function for other types in the future if needed.
We can go the []interface{} way, or try to rewrite tests for acls (which is not that difficult), both approaches have their pros and cons, wdyt @jcaamano ?
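
For illustration, here is a minimal sketch of the []interface{} option and the two copies it implies for a typed caller. `ACL` is a hypothetical stand-in for `nbdb.ACL`, and `BatchAny` and `deleteInBatches` are made-up names, not functions from the repository. It sticks to Go 1.17 features (no type parameters).

```go
package main

import (
	"errors"
	"fmt"
)

// ACL is a hypothetical stand-in for nbdb.ACL, reduced to one field.
type ACL struct {
	UUID string
}

// BatchAny splits data into chunks of at most batchSize elements and calls
// eachFn on every chunk, stopping at the first error.
func BatchAny(batchSize int, data []interface{}, eachFn func([]interface{}) error) error {
	if batchSize < 1 {
		return errors.New("batchSize must be at least 1")
	}
	for start := 0; start < len(data); start += batchSize {
		end := start + batchSize
		if end > len(data) {
			end = len(data)
		}
		if err := eachFn(data[start:end]); err != nil {
			return err
		}
	}
	return nil
}

// deleteInBatches shows the two conversions a typed caller has to perform.
func deleteInBatches(acls []ACL, batchSize int, del func([]ACL) error) error {
	// First copy: []ACL -> []interface{}.
	generic := make([]interface{}, len(acls))
	for i := range acls {
		generic[i] = acls[i]
	}
	return BatchAny(batchSize, generic, func(chunk []interface{}) error {
		// Second copy: []interface{} -> []ACL, done once per chunk.
		typed := make([]ACL, 0, len(chunk))
		for _, item := range chunk {
			acl, ok := item.(ACL)
			if !ok {
				return fmt.Errorf("unexpected element type %T", item)
			}
			typed = append(typed, acl)
		}
		return del(typed)
	})
}

func main() {
	acls := []ACL{{UUID: "a"}, {UUID: "b"}, {UUID: "c"}, {UUID: "d"}, {UUID: "e"}}
	_ = deleteInBatches(acls, 2, func(chunk []ACL) error {
		fmt.Println(len(chunk), "ACLs in this transaction")
		return nil
	})
}
```

The batching loop stays reusable for any element type, but every typed caller pays for the conversion in both directions and needs a runtime type assertion.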

@jcaamano
Contributor

jcaamano commented Apr 19, 2023

I see. The only other thing I can think of is a batch function that gives you the indexes of the array you need to work with:

// the meat goes here
func batch(batchSize int, dataSize int, eachFn func(from, to int)) error

// adapter function per type that uses the previous function
func BatchACLs(batchSize int, data []nbdb.ACL, eachFn func([]nbdb.ACL) error) error {

Then we can just test batch.
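
A hedged sketch of how that index-based split could look in full. `ACL` is a hypothetical stand-in for `nbdb.ACL`, and one assumption differs from the signatures above: `eachFn` returns an error here so that a failing chunk can stop the loop.

```go
package main

import (
	"errors"
	"fmt"
)

// ACL is a hypothetical stand-in for nbdb.ACL.
type ACL struct {
	UUID string
}

// batch walks the half-open index ranges [from, to) that cover dataSize
// elements in steps of batchSize. It knows nothing about element types,
// so it can be tested with plain integers.
func batch(batchSize, dataSize int, eachFn func(from, to int) error) error {
	if batchSize < 1 {
		return errors.New("batchSize must be at least 1")
	}
	for from := 0; from < dataSize; from += batchSize {
		to := from + batchSize
		if to > dataSize {
			to = dataSize
		}
		if err := eachFn(from, to); err != nil {
			return err
		}
	}
	return nil
}

// BatchACLs is the thin per-type adapter: it only turns index ranges
// into sub-slices, which share the backing array and copy nothing.
func BatchACLs(batchSize int, data []ACL, eachFn func([]ACL) error) error {
	return batch(batchSize, len(data), func(from, to int) error {
		return eachFn(data[from:to])
	})
}

func main() {
	acls := make([]ACL, 5)
	_ = BatchACLs(2, acls, func(chunk []ACL) error {
		fmt.Println("chunk of", len(chunk))
		return nil
	})
}
```

With this layout only `batch` carries logic worth testing; each adapter is a single slicing call.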

The default transaction timeout is 10 seconds, it can be reached
when we delete all egress firewall acls during migration to port groups
from switches.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
(cherry picked from commit 1896e16)
(cherry picked from commit 7fb527e)
(cherry picked from commit 88ecd8b)

Conflicts:
	go-controller/pkg/ovn/egressfirewall.go -
egressFirewallACLPriorityKey is not used in 4.11, because logging
for egress firewall is not implemented

(cherry picked from commit 5a64c5b)

Conflicts:
	go-controller/pkg/ovn/egressfirewall.go
Update Batch to be typed, since generics are not available in Go 1.17
Update Batch tests to use ACLs instead of ints.
Use RemoveACLsFromAllSwitches instead of
RemoveACLsFromLogicalSwitchesWithPredicate, use nbdb.ACL instead of
*nbdb.ACL
first argument is nil.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
(cherry picked from commit 11283d6)
(cherry picked from commit 74f95e9)
(cherry picked from commit 8ce4aa4)
(cherry picked from commit 6173c7b)
stale acls.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
(cherry picked from commit 81acdc2)
(cherry picked from commit 34eb562)

Conflict; egressfirewall.go - update apimachinery.sets to the previous
version

(cherry picked from commit 0bc8f14)
(cherry picked from commit dc46aa9)

Conflicts:
	go-controller/pkg/ovn/egressfirewall.go
had to move the fix to libovsdbops/switch
@npinaeva
Contributor Author

I see. The only other thing I can think of is a batch function that gives you the indexes of the array you need to work with:

// the meat goes here
func batch(batchSize int, dataSize int, eachFn func(from, to int)) error

// adapter function per type that uses the previous function
func BatchACLs(batchSize int, data []nbdb.ACL, eachFn func([]nbdb.ACL) error) error {

Then we can just test batch.

I didn't want to introduce any new logic (which switching to indexes would require); I prefer to stay as close to the initial implementation as possible when it comes to backporting. So the most straightforward way for now, I would say, is to just make Batch work on a pre-defined type. I updated the tests, since I believe it is not that many changes. Lmk what you think.
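
As a rough illustration of a Batch fixed to one element type, the sketch below keeps the slice-in, chunk-out shape and only pins the type. It is not the merged code: `ACL` is a hypothetical stand-in for `nbdb.ACL`, and the error text is an assumption.

```go
package main

import "fmt"

// ACL is a hypothetical stand-in for nbdb.ACL.
type ACL struct {
	UUID string
}

// Batch is fixed to []ACL because Go 1.17 has no type parameters.
// It calls eachFn on consecutive chunks of at most batchSize elements
// and returns the first error it sees.
func Batch(batchSize int, data []ACL, eachFn func([]ACL) error) error {
	if batchSize < 1 {
		return fmt.Errorf("batchSize should be > 0, got %d", batchSize)
	}
	for start := 0; start < len(data); {
		end := start + batchSize
		if end > len(data) {
			end = len(data)
		}
		if err := eachFn(data[start:end]); err != nil {
			return err
		}
		start = end
	}
	return nil
}

func main() {
	// Seven ACLs with a batch size of 3 give chunks of 3, 3 and 1, so each
	// delete transaction stays small instead of covering all ACLs at once.
	acls := make([]ACL, 7)
	for i := range acls {
		acls[i].UUID = fmt.Sprintf("acl-%d", i)
	}
	err := Batch(3, acls, func(chunk []ACL) error {
		fmt.Printf("deleting %d ACLs, first is %s\n", len(chunk), chunk[0].UUID)
		return nil
	})
	if err != nil {
		fmt.Println("batch failed:", err)
	}
}
```

Supporting another element type later would mean copying the function, which is the trade-off accepted above for the backport.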

@jcaamano
Contributor

/lgtm
/approve
/label backport-risk-assessed

@openshift-ci openshift-ci bot added the backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. label Apr 19, 2023
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 19, 2023
@openshift-ci
Contributor

openshift-ci bot commented Apr 19, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jcaamano, npinaeva

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 19, 2023
@huiran0826
Contributor

/label qe-approved
/label cherry-pick-approved

@openshift-ci openshift-ci bot added qe-approved Signifies that QE has signed off on this PR cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. labels Apr 20, 2023
@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD f84b8d0 and 2 for PR HEAD e0b95a5 in total

@npinaeva
Contributor Author

/retest-required

@openshift-ci
Contributor

openshift-ci bot commented Apr 20, 2023

@npinaeva: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn-serial | e0b95a5 | link | false | /test e2e-aws-ovn-serial |
| ci/prow/e2e-vsphere-windows | e0b95a5 | link | false | /test e2e-vsphere-windows |
| ci/prow/okd-e2e-gcp-ovn | e0b95a5 | link | false | /test okd-e2e-gcp-ovn |
| ci/prow/e2e-vsphere-ovn | e0b95a5 | link | false | /test e2e-vsphere-ovn |

Full PR test history. Your PR dashboard.

@openshift-merge-robot openshift-merge-robot merged commit 7215581 into openshift:release-4.10 Apr 20, 2023
@openshift-ci-robot
Contributor

@npinaeva: Jira Issue OCPBUGS-11453: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-11453 has been moved to the MODIFIED state.

@npinaeva npinaeva deleted the ocpbugs-11453 branch April 20, 2023 16:19
@openshift-merge-robot
Contributor

Fix included in accepted release 4.10.0-0.nightly-2023-04-21-212037

@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

This PR has been included in build ose-ovn-kubernetes-container-v4.10.0-202305011254.p0.g7215581.assembly.stream for distgit ose-ovn-kubernetes.
All builds following this will include this PR.
