Skip to content

Conversation

@mdbooth
Copy link

@mdbooth mdbooth commented Oct 2, 2023

This is a rebase on to current upstream CAPO main branch.
This is a merge with ours strategy of:

  • a rebase on to current upstream CAPO main branch
  • the previous history

i.e. It is a rebase which also explicitly declares itself a successor to the previous history.

I identified downstream carry patches with:

$ git log --no-merges HEAD ^upstream/main

There were only 2:

However, several other patches were also returned which were backports to the upstream release-0.6 branch. We probably should not pull upstream release branches to main, but we will need a strategy for our own release branches.

For now I have renamed all carry patches to have a CARRY: prefix and added a couple more to add vendoring.

mdbooth and others added 30 commits February 1, 2023 11:36
This avoids an extremely surprising write-after-read consistency bug
when later returning after making subsequent object changes.

The issue stems from the fact that we don't write to the same place we
read from. We read from a cache: a shared informer which is populated
from a watch running against the apiserver. However we write directly to
the apiserver rather than writing through the cache.

If we patch the watched object without returning, while we continue to
execute the change will be written to the api server and will propagate
to our own shared informer. This change will result in a new reconcile
being queued for this version of the object which now has a Finalizer
but no other changes.

Taking the case of the machine controller we will then go on to create a
server and we will patch the object with a providerID and return. On
return we will write this change to the apiserver which will eventually
propagate to our shared informer and result in another reconcile.
However our previous patch has already propagated, so we will be
*immediately* called with the *old version* of the object. If we've been
careful this will hopefully be no more than inefficient, but can lead to
very hard to debug errors.

The safe way to patch objects when reconcile is called from an informer
is to always return after patching an object which will directly result
in another reconcile. We also *must not* set Requeue in the returned
Result as this will result in being called again with the stale object.
When patching an object we must return and wait for the informer to see
the change and call the reconciler again.
…abel

🐛 Fix Tilt by adding CAPO label in tilt-provider.json
…ources

✨ Add Tags to API-Loadbalancer resources
…patch

🐛 Return from reconciler after adding finalizer
…ertificates

📖 Add documentation about --ca-cert flag
…hercloud

🌱 Bump gophercloud to v1.2.0
Works round some go mod dependency issues upgrading to v1.51.1, and is
in line with what CAPI does.
While there, remove some legacy skipped paths and rename skipped_dirs to
reflect that it actually skips paths (including ensure-golangci-lint.sh,
which is a file, not a directory).
Download golangci-lint instead of building it
Failure of boilerplate.py did not cause the failure of
verify-boilerplate.sh.
Also document the behaviour of each.
The resource and machine tickers are responsible for fetching logs every
5 and 10 seconds respectively during a test run. Each fetch overwrites
the previous one. These are problematic because they cause failure state
to be overwritten by logs captured during deletion. That is, if a test
ends in failure, we continue to capture these logs regularly during
cleanup. This causes the test failure state to be overwritten by the
state during cleanup.

We already capture these resources after each test regardless of success
or failure, so regular capture during the test is redundant. This change
simply removes the tickers. We still get the same logs when we execute
DumpSpecResourcesAndCleanup() after each test.

Additionally, if there is an error during cleanup we optionally capture
that state too, but to a separate directory.
Remove the resource and machine tickers from e2e tests
Fix "internal ip doesn't exist (yet)" in e2e logs
Uplift golang to 1.19.6 and x/net to 0.7.0 due
https://osv.dev/vulnerability/GO-2023-1571
CAPO uses any ZA it finds, rather than checking whether the ZA is
actually available, meaning you can get into a state where it's
impossible to provision a cluster.  This allows clients to filter based
on AZ availability before scheduling.
🐛 Fix Provisioning to Unavailable AZs
🐛 Switch to "4" instead of "ipip" for rules
fix: fix typo of worker rules and controller rules
…ue_specs_gophercloud

✨ Support value specs for Ports
This adds additional information on how to create a test environment
similar to the one used during continuous integration.

Signed-off-by: Wolodja Wentland <[email protected]>
EmilienM and others added 2 commits September 27, 2023 14:11
CAPO doesn't use vendoring, so let's ignore that directory from git.
That will prevent someone from committing it.
@mdbooth mdbooth changed the title CAPO v0.8.0 Merge CAPO v0.8.0 Oct 2, 2023
@mdbooth
Copy link
Author

mdbooth commented Oct 2, 2023

/retest

@mdbooth mdbooth changed the title Merge CAPO v0.8.0 Rebase on to CAPO main Oct 3, 2023
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 3, 2023
@stephenfin
Copy link

Can't speak to the failing test, but this looks like a straight forward reset to upstream main:

$ git remote -v
origin  [email protected]:openshift/cluster-api-provider-openstack.git (fetch)
origin  [email protected]:openshift/cluster-api-provider-openstack.git (push)
shiftstack      [email protected]:shiftstack/cluster-api-provider-openstack.git (fetch)
shiftstack      [email protected]:shiftstack/cluster-api-provider-openstack.git (push)
upstream        [email protected]:kubernetes-sigs/cluster-api-provider-openstack.git (fetch)
upstream        [email protected]:kubernetes-sigs/cluster-api-provider-openstack.git (push)
$ git log shiftstack/capo-v0.8.0 --oneline
5f50f580 (shiftstack/capo-v0.8.0) CARRY: go mod vendor
1415c4d1 CARRY: Don't ignore vendor directories
878a986d CARRY: Add OCP CI config
5ec07227 CARRY: Downstream OWNERS
c73d55dd (upstream/main) Merge pull request #1690 from shiftstack/gitvendoring
...

Uses strategy `ours` to replace previous HEAD.
Copy link
Member

@mandre mandre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a straight copy of upstream's main branch, with openshift's needed commits clearly marked as CARRY patches.

❯ git log origin/capo-v0.8.0..k8s/main --oneline
(no result)
❯ git log k8s/main..origin/capo-v0.8.0 --oneline
5f50f580 (HEAD, origin/capo-v0.8.0) CARRY: go mod vendor
1415c4d1 CARRY: Don't ignore vendor directories
878a986d CARRY: Add OCP CI config
5ec07227 CARRY: Downstream OWNERS

@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 3, 2023
@mdbooth
Copy link
Author

mdbooth commented Oct 3, 2023

I thought the test failure was just missing vendoring due to interaction with .gitignore, but it's more than that. I thought I had previously reproduced locally but I wasn't testing properly. To reproduce you need to explicitly set GOFLAGS:

$ make test GOFLAGS=-mod=vendor
GOBIN=/home/mbooth/src/openshift/cluster-api-provider-openstack/hack/tools/bin ./scripts/go_install.sh sigs.k8s.io/controller-runtime/tools/setup-envtest setup-envtest v0.0.0-20221201045826-d9912251cd81
rm: cannot remove '/home/mbooth/src/openshift/cluster-api-provider-openstack/hack/tools/bin/setup-envtest*': No such file or directory
go: sigs.k8s.io/controller-runtime/tools/[email protected]: cannot query module due to -mod=vendor
make: *** [Makefile:228: /home/mbooth/src/openshift/cluster-api-provider-openstack/hack/tools/bin/setup-envtest-v0.0.0-20221201045826-d9912251cd81] Error 1

Copy link
Member

@mandre mandre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Looking at the diff between the upstream's main branch and this, we only have the necessary openshift bits.

$ git diff k8s/main...origin/capo-v0.8.0 --name-only -- . ':!vendor' ':!hack/tools/vendor'
.ci-operator.yaml
.gitignore
Dockerfile.rhel
OWNERS
OWNERS_ALIASES

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 3, 2023
@mandre
Copy link
Member

mandre commented Oct 3, 2023

/override ci/prow/test
/approve

@openshift-merge-robot
Copy link

@mdbooth: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/unit 5b7d797 link false /test unit

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-ci
Copy link

openshift-ci bot commented Oct 3, 2023

@mandre: Overrode contexts on behalf of mandre: ci/prow/test

Details

In response to this:

/override ci/prow/test
/approve

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci
Copy link

openshift-ci bot commented Oct 3, 2023

@mdbooth: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/unit 5b7d797 link false /test unit

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-ci
Copy link

openshift-ci bot commented Oct 3, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mandre

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 3, 2023
@openshift-ci openshift-ci bot merged commit 3e1c561 into openshift:main Oct 3, 2023
@mdbooth mdbooth deleted the capo-v0.8.0 branch October 3, 2023 15:21
@mdbooth
Copy link
Author

mdbooth commented Oct 3, 2023

The setup-envtest failure should be fixed by this upstream PR: kubernetes-sigs#1707

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.