CORS-3437: infra/capi: add provisioning timeout#8307
openshift-merge-bot[bot] merged 2 commits into openshift:master
|
@patrickdillon: This pull request references CORS-3437 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set. |
|
/test altinfra-e2e-aws-ovn altinfra-e2e-azure-capi-ovn altinfra-e2e-nutanix-capi-ovn altinfra-e2e-vsphere-capi-ovn altinfra-e2e-openstack-capi-ovn altinfra-e2e-gcp-capi-ovn |
I'm going to remove these timeouts from the hooks and keep the timeout caps only on provisioning with the CAPI system. We don't need to handle hook failures with timeouts; instead, we can leave that up to the hook implementors. CAPI provisioning, on the other hand, needs a timeout to prevent the controllers from spinning endlessly.
Adds a 15m timeout to the infrastructure provisioning and machine provisioning stages of CAPI, so that the controllers do not spin indefinitely in the case of a failure. 15m is an arbitrary value; the timeout should balance giving the resources ample time to provision against making users wait too long if something goes wrong.
c3a1feb to 9335bf4
|
/test altinfra-e2e-aws-ovn altinfra-e2e-azure-capi-ovn altinfra-e2e-nutanix-capi-ovn altinfra-e2e-vsphere-capi-ovn altinfra-e2e-openstack-capi-ovn altinfra-e2e-gcp-capi-ovn |
|
/cc |
|
Nice! |
r4f4 left a comment
How do you feel about adding:
untilTime := time.Now().Add(timeout)
timezone, _ := untilTime.Zone()
logrus.Infof("Waiting up to %v (until %v %s) for infrastructure to provision...", timeout, untilTime.Format(time.Kitchen), timezone)
Too much information?
9335bf4 to eab21fd
|
/test altinfra-e2e-aws-ovn |
|
All comments incorporated. |
r4f4 left a comment
LGTM. Will wait for the linting fix before tagging.
|
Looking good: |
eab21fd to ecb2293
Adds timers to each stage of CAPI infrastructure provisioning. The elapsed times are logged when the install completes and can be used as a guide if we need to change the provisioning timeouts.
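One way such per-stage timers could be structured is sketched below. This is an assumption for illustration only: the stageTimer type, its methods, and the stage label used in the usage note are hypothetical and do not reflect the installer's actual timer package.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// stageTimer records how long each named stage took.
// Hypothetical type, for illustration only.
type stageTimer struct {
	started map[string]time.Time
	elapsed map[string]time.Duration
}

func newStageTimer() *stageTimer {
	return &stageTimer{
		started: map[string]time.Time{},
		elapsed: map[string]time.Duration{},
	}
}

// Start marks the beginning of a stage.
func (s *stageTimer) Start(stage string) {
	s.started[stage] = time.Now()
}

// Stop adds the time since Start to the stage's total.
// Stopping a stage that was never started is a no-op.
func (s *stageTimer) Stop(stage string) {
	start, ok := s.started[stage]
	if !ok {
		return
	}
	s.elapsed[stage] += time.Since(start)
	delete(s.started, stage)
}

// Summary returns one "stage: duration" line per finished stage,
// sorted by stage name and rounded to the nearest second.
func (s *stageTimer) Summary() []string {
	names := make([]string, 0, len(s.elapsed))
	for name := range s.elapsed {
		names = append(names, name)
	}
	sort.Strings(names)

	lines := make([]string, 0, len(names))
	for _, name := range names {
		lines = append(lines, fmt.Sprintf("%s: %v", name, s.elapsed[name].Round(time.Second)))
	}
	return lines
}
```

A caller would wrap each stage in Start("Machine Provisioning") and Stop("Machine Provisioning"), then log the Summary lines once the install completes. Comparing those durations against the 15m cap shows how much headroom each stage has.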
ecb2293 to 4e2c8f6
|
fixed linter and reworked stage names a little: "InfrastructureReady" would not be obvious to users. /test altinfra-e2e-aws-ovn altinfra-e2e-azure-capi-ovn altinfra-e2e-nutanix-capi-ovn altinfra-e2e-vsphere-capi-ovn altinfra-e2e-openstack-capi-ovn altinfra-e2e-gcp-capi-ovn |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: r4f4 |
|
@patrickdillon: The following tests failed, say |
|
[ART PR BUILD NOTIFIER] This PR has been included in build ose-installer-altinfra-container-v4.16.0-202404291018.p0.g9c8cfd4.assembly.stream.el9 for distgit ose-installer-altinfra. |
Implements a basic safeguard so that provisioning does not spin indefinitely. A timeout (currently 15m) is set for each provisioning stage.