Skip to content

SPLAT-1129: adding failure domain awareness for vSphere#228

Merged
openshift-merge-bot[bot] merged 9 commits intoopenshift:mainfrom
rvanderp3:cpms-vsphere-failure-domain
Nov 27, 2023
Merged

SPLAT-1129: adding failure domain awareness for vSphere#228
openshift-merge-bot[bot] merged 9 commits intoopenshift:mainfrom
rvanderp3:cpms-vsphere-failure-domain

Conversation

@openshift-ci openshift-ci bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 26, 2023
@openshift-ci openshift-ci bot requested review from damdo and elmiko July 26, 2023 22:39
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 8, 2023
@openshift-ci openshift-ci bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 8, 2023
@rvanderp3 rvanderp3 force-pushed the cpms-vsphere-failure-domain branch 2 times, most recently from 21a03f8 to 9831732 Compare August 9, 2023 14:35
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 9, 2023
@rvanderp3 rvanderp3 marked this pull request as draft August 10, 2023 13:41
@rvanderp3
Copy link
Contributor Author

converting to draft to prevent unnecessary CI runs.

@rvanderp3 rvanderp3 force-pushed the cpms-vsphere-failure-domain branch from 9831732 to ae64fda Compare August 10, 2023 14:32
@rvanderp3
Copy link
Contributor Author

/test lint

@rvanderp3 rvanderp3 force-pushed the cpms-vsphere-failure-domain branch from 042f8fd to b7e9d28 Compare August 10, 2023 18:00
@rvanderp3
Copy link
Contributor Author

/test lint
/test unit

@rvanderp3 rvanderp3 force-pushed the cpms-vsphere-failure-domain branch 2 times, most recently from 728f955 to 1777d77 Compare August 11, 2023 14:19
@rvanderp3
Copy link
Contributor Author

/test fmt
/test generate
/test lint
/test unit
/test vendor
/test vet

@rvanderp3
Copy link
Contributor Author

/test images

@rvanderp3 rvanderp3 force-pushed the cpms-vsphere-failure-domain branch from 719486c to a6d2e97 Compare August 14, 2023 16:04
@rvanderp3 rvanderp3 changed the title WIP: adding support for vSphere SPLAT-1129: WIP: adding support for vSphere Aug 14, 2023
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Aug 14, 2023
@openshift-ci-robot
Copy link

openshift-ci-robot commented Aug 14, 2023

@rvanderp3: This pull request references SPLAT-1129 which is a valid jira issue.

Details

In response to this:

Associated PRs:

Pre-merge testing:

  1. Build this branch and push to a repository that the eventual cluster can access
  2. Create a new release image for the cluster which configures the cluster-control-plane-machine-set-operator image to point to the built image
oc adm release new  --from-release "<source release image>" --name "<new version>" --to-image "<new release image>" cluster-control-plane-machine-set-operator="<built image>"
  1. Perform an installation of that release image
  2. Change the generated control plane machineset active from inactive

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@rvanderp3 rvanderp3 force-pushed the cpms-vsphere-failure-domain branch 6 times, most recently from 6aedea0 to 2a6bacf Compare August 15, 2023 18:24
@jcpowermac
Copy link
Contributor

/payload-job periodic-ci-openshift-release-master-nightly-4.15-e2e-vsphere-ovn-serial

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 20, 2023

@jcpowermac: trigger 1 job(s) for the /payload-(job|aggregate) command

  • periodic-ci-openshift-release-master-nightly-4.15-e2e-vsphere-ovn-serial

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/0b745ac0-87b1-11ee-934c-9f87e50880b6-0

@jcpowermac
Copy link
Contributor

/test e2e-vsphere-multi-zone-operator e2e-vsphere-operator

.PHONY: lint
lint: ## Run golangci-lint over the codebase.
$(call ensure-home, ${GOLANGCI_LINT} run ./... --timeout 5m)
$(call ensure-home, GOGC=50 ${GOLANGCI_LINT} run ./... --timeout 5m --new-from-rev=${PULL_BASE_SHA} -v)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the linter still work locally with these changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it does.

Client: mgr.GetClient(),
Scheme: mgr.GetScheme(),
Namespace: managedNamespace,
VSphereCPMSFeatureGateEnabled: vSphereCPMSFeatureGateEnabled,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably put the feature gate accessor as a member of the reconciler struct so that we can keep this around once this feature gate is gone/so it can be re-used by other features

api-approved.openshift.io: https://github.com/openshift/api/pull/1112
exclude.release.openshift.io/internal-openshift-hosted: "true"
include.release.openshift.io/self-managed-high-availability: "true"
release.openshift.io/feature-set: Default
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to make sure the TechPreview and Custom versions of this are included as well

return nil, fmt.Errorf("unable to generate control plane machine set spec: %w", err)
}
case configv1.VSpherePlatformType:
if !r.VSphereCPMSFeatureGateEnabled {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should move the feature gate accessor Enabled call to here IMO

Comment on lines 203 to 205
if cpmsFailureDomain.Platform == "" {
return nil, errNoFailureDomains
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we actually get here? Is there ever a generation on AWS/Azure/GCP without failure domains?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it doesn't appear we can, just removed this.

Comment on lines +174 to +176
var cpmsFailureDomain machinev1.FailureDomains

var err error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the linter complain if you remove this empty line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it does with 'declarations should never be cuddled'

return nil, fmt.Errorf("%w: %sFailureDomain{}", errUnsupportedPlatform, machineFailureDomains[0].Type())
cpmsFailureDomain, err := buildPlatformFailureDomains(platformType, failureDomains)
if err != nil {
return nil, err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error should be wrapped

if cpmsFailureDomain.Platform == "" {
return nil, errNoFailureDomains
if cpmsFailureDomain == nil && err == nil {
return nil, nil //nolint:nilnil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this ok to be nil nil? Can you add a quick comment to explain it? Can we even get a nil nil? Looking at buildPlatformFailureDomains I don't see it as possible to return nil nil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are correct, removing that statement

}
}

return machinev1.VSphereFailureDomain{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unresolved conversation thread

@rvanderp3 rvanderp3 force-pushed the cpms-vsphere-failure-domain branch from e08d75f to 1add572 Compare November 20, 2023 21:46
@rvanderp3
Copy link
Contributor Author

/hold

placing hold for additional work on comments

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 20, 2023
@rvanderp3
Copy link
Contributor Author

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 27, 2023
@rvanderp3
Copy link
Contributor Author

/test unit

@rvanderp3 rvanderp3 force-pushed the cpms-vsphere-failure-domain branch 2 times, most recently from 663d8b4 to 805d05e Compare November 27, 2023 17:16
@rvanderp3 rvanderp3 force-pushed the cpms-vsphere-failure-domain branch from 805d05e to 3f8ed21 Compare November 27, 2023 17:22
@JoelSpeed
Copy link
Contributor

/approve
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 27, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 27, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JoelSpeed

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 27, 2023
@openshift-merge-bot openshift-merge-bot bot merged commit 800e341 into openshift:main Nov 27, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 27, 2023

@rvanderp3: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-vsphere-operator 3f8ed21 link false /test e2e-vsphere-operator
ci/prow/e2e-vsphere-multi-zone-operator 3f8ed21 link false /test e2e-vsphere-multi-zone-operator
ci/prow/e2e-vsphere-ovn-etcd-scaling 3f8ed21 link false /test e2e-vsphere-ovn-etcd-scaling
ci/prow/e2e-azure-ovn-etcd-scaling 3f8ed21 link false /test e2e-azure-ovn-etcd-scaling

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Copy link

[ART PR BUILD NOTIFIER]

This PR has been included in build ose-cluster-control-plane-machine-set-operator-container-v4.15.0-202311272131.p0.g800e341.assembly.stream for distgit ose-cluster-control-plane-machine-set-operator.
All builds following this will include this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants

Comments