
🐛 Handle errors returned by GetInstanceStatusByName in machine controller#2086

Merged
k8s-ci-robot merged 1 commit intokubernetes-sigs:mainfrom
zioc:fix-get-instance
May 16, 2024

Conversation

@zioc zioc commented May 16, 2024

What this PR does / why we need it:

Errors returned by GetInstanceStatusByName were ignored until now, which could lead the controller to attempt to recreate a server under specific circumstances (for example, if the Nova API always returns 404 errors; see the detailed explanation in the related issue).

These errors shouldn't be ignored; otherwise we may conclude that the server does not exist when in fact the Nova API is misbehaving.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):

Fixes #2085

TODOs:

  • squashed commits
  • if necessary:
    • includes documentation
    • adds unit tests

These errors were ignored until now, which could lead the controller to attempt
to recreate an existing instance under rare circumstances.
@k8s-ci-robot k8s-ci-robot requested review from EmilienM and mdbooth May 16, 2024 08:30
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 16, 2024
@k8s-ci-robot

Hi @zioc. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

netlify bot commented May 16, 2024

Deploy Preview for kubernetes-sigs-cluster-api-openstack ready!

Name Link
🔨 Latest commit c633c55
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-cluster-api-openstack/deploys/6645c428e5da7a0008d4afb5
😎 Deploy Preview https://deploy-preview-2086--kubernetes-sigs-cluster-api-openstack.netlify.app

@mdbooth mdbooth left a comment

This is an excellent robustification, thanks!

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mdbooth

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 16, 2024
mdbooth commented May 16, 2024

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 16, 2024
mdbooth commented May 16, 2024

/cherry-pick release-0.10

@k8s-infra-cherrypick-robot

@mdbooth: once the present PR merges, I will cherry-pick it on top of release-0.10 in a new PR and assign it to you.


huxcrux commented May 16, 2024

/lgtm

Awesome fix 🙏

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 16, 2024
@k8s-ci-robot k8s-ci-robot merged commit 046d6b7 into kubernetes-sigs:main May 16, 2024
@k8s-infra-cherrypick-robot

@mdbooth: new pull request created: #2087


openStackMachine.SetFailure(capierrors.UpdateMachineError, errors.New("virtual machine no longer exists"))
return nil, nil
}
instanceStatus, err = computeService.GetInstanceStatusByName(openStackMachine, openStackMachine.Name)

@EmilienM EmilienM May 16, 2024

One thing I thought you would fix too is:

https://github.com/zioc/cluster-api-provider-openstack/blob/821a1a2ef25ac615db5fb26379eb0c4b947ad284/pkg/cloud/services/compute/instance.go#L827-L829

I don't like the fact that we don't return an error if more than one server with the same name is found. This is error-prone and can lead to cluster issues.

cc @mdbooth for an opinion on ^

Good point, and we have an existing pattern for it.


Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Errors returned by GetInstanceStatusByName should be handled in machine controller.

6 participants