Enable spot instances for AWS masters #51664
2uasimojo wants to merge 1 commit into openshift:master
|
/hold for testing |
force-pushed from ccc7d19 to 89f6ca9
|
/pj-rehearse Sure, let's try five at random. |
|
@2uasimojo: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
@2uasimojo, If the problem persists, please contact Test Platform. |
mtulio left a comment
Hey Eric, overall looks good. I have some questions about Machine and CPMS manifest changes.
Are you planning to create a new job that uses the installer altinfra (CAPI) and sets the SPOT_MASTERS variable?
Let's trigger an existing job to validate the spot on workers:
/pj-rehearse pull-ci-openshift-machine-api-provider-aws-release-4.16-e2e-aws
    echo "Spot instances for masters can only be used with CAPI installs."
    exit 1
  fi
  manifests="${dir}/cluster-api/machines/10_inframachine_*.yaml"
We may need to glob the three machine and CPMS manifests too.
What about creating one list?
Added CPMS for the reason @r4f4 points out below; but leaving out the Machine manifests for the reason I point out below :)
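The "one list" idea from this thread could look roughly like the sketch below. It is only an illustration: `collect_spot_manifests` is a hypothetical helper name, and the CPMS glob (`openshift/*control-plane-machine-set*.yaml`) is an assumed location that is not taken from this PR. Only the `10_inframachine_*.yaml` path comes from the diff above. Machine manifests are deliberately left out, matching the reply.

```shell
# Hypothetical helper: print one manifest path per line for the files
# that would receive spot settings. $1 is the install directory.
collect_spot_manifests() {
  _dir="$1"
  for _f in "${_dir}"/cluster-api/machines/10_inframachine_*.yaml \
            "${_dir}"/openshift/*control-plane-machine-set*.yaml; do
    # An unmatched glob is left as a literal string; skip it.
    [ -e "${_f}" ] || continue
    printf '%s\n' "${_f}"
  done
}
```

Because the globs sit outside the quotes in the `for` list, they are expanded at loop time, and the existence check keeps an unmatched pattern from being passed on as a bogus file name.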
ci-operator/step-registry/ipi/install/install/ipi-install-install-commands.sh
|
Triggering a capi job: |
|
@r4f4: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
- documentation: "Use AWS Spot Instances for worker nodes. Set to 'true' to opt into spot instances. Explicitly set to 'false' to opt out. Leave unset for the default, which may change."
+ documentation: "Use AWS Spot Instances for *worker* nodes. Set to 'true' to opt into spot instances. Explicitly set to 'false' to opt out. Leave unset for the default, which may change."
+ - name: SPOT_MASTERS
+   default: "false"
What about
- default: "false"
+ default: "true"
plus a /hold so we can rehearse this change for a while with capi/aws jobs and fix any issues? Then we can flip it back to false before merge and add the SPOT_MASTERS=true to the capi jobs definitions.
|
/pj-rehearse pull-ci-openshift-machine-api-provider-aws-release-4.16-e2e-aws |
|
@mtulio: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
Not sure why this is happening (haven't dug in yet): |
@2uasimojo Oh, I know this one because I made the same mistake. Here https://github.com/openshift/release/pull/51664/files#diff-65cce809dc5ddaa91711f6709acd41e5c24b735c073708e60f10aae28faf8712R511 you're saving the |
Cool, thanks. I suspected that might be it, but I don't understand how yq can tell the difference by the time shell parsing is done. 🤷 |
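On the question of how yq can tell the difference: a program only sees the arguments the shell hands it, and the shell does not expand glob characters inside double quotes. The sketch below assumes the issue was a quoted glob like the `manifests=` line in the diff (the comment above is cut off, so that is a guess). `count_args` and the temp files are made up for the demonstration.

```shell
# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

demo_dir="$(mktemp -d)"
touch "${demo_dir}/10_inframachine_a.yaml" "${demo_dir}/10_inframachine_b.yaml"

# The pattern is stored literally; a variable assignment never expands globs.
manifests="${demo_dir}/10_inframachine_*.yaml"

# Quoted use: the callee receives a single argument, the literal pattern.
quoted_count="$(count_args "${manifests}")"

# Unquoted use: the shell expands the pattern into file names before the call.
unquoted_count="$(count_args ${manifests})"

echo "quoted=${quoted_count} unquoted=${unquoted_count}"
rm -rf "${demo_dir}"
```

So a tool invoked with `"${manifests}"` is asked to open a file whose name contains a literal `*`, while the unquoted form hands it the real paths.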
force-pushed from 89f6ca9 to c707fdb
Well, it turns out it is difficult or impossible to add an empty dict using yq v3. I tried |
|
/pj-rehearse pull-ci-openshift-machine-api-provider-aws-release-4.16-e2e-aws |
|
@2uasimojo: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
force-pushed from c707fdb to 37f1da9
|
Let's try a capi job: |
force-pushed from fb51d21 to f594928
|
[REHEARSALNOTIFIER]
A total of 15375 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs.
|
/pj-rehearse ack
Rehearsals done via #51721
/hold cancel
|
@2uasimojo: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/assign @patrickdillon |
|
Issues in openshift/release go stale after 30d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
|
Stale issue in openshift/release rot after 15d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle rotten |
|
/remove-lifecycle rotten |
|
/approve
/hold
Feel free to remove this hold as you see fit; I'm not sure whether you want to do what is discussed in the comment.
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: 2uasimojo, mtulio, patrickdillon, r4f4
|
This work has merged as part of #51721 |
|
PR needs rebase.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
|
/close |
|
@r4f4: Closed this PR.
Add a new variable for the AWS IPI flows, $SPOT_MASTERS. When using CAPI installs (featureGates[].ClusterAPIInstall=true) this can be set to 'true' to inject spotMarketOptions into master machine manifests.

The existing $SPOT_INSTANCES variable is unchanged: as before, it only results in worker nodes using spot instances. (We may at some point wish to rename this to $SPOT_WORKERS for clarity.)

NOTE: Spot instances are unreliable. Using them may cause additional flakes in your tests.

Needed by RFE-5545
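
As a rough sketch of the gating this description implies (not the actual step script): `should_patch_spot_masters` and its second argument are hypothetical, and the real step would have to detect `ClusterAPIInstall` from the install config itself. The error message is the one shown in the review diff.

```shell
# Hypothetical helper. Arguments:
#   $1: value of SPOT_MASTERS ("true" opts in; anything else is treated as off)
#   $2: "true" if the install uses CAPI (assumed to be detected elsewhere)
# Returns 0 to patch master manifests, 1 to skip, 2 for an unsupported combination.
should_patch_spot_masters() {
  if [ "${1:-false}" != "true" ]; then
    return 1
  fi
  if [ "${2:-false}" != "true" ]; then
    echo "Spot instances for masters can only be used with CAPI installs." >&2
    return 2
  fi
  return 0
}

# Usage: the second argument is hard-coded here purely for illustration.
if should_patch_spot_masters "${SPOT_MASTERS:-false}" "true"; then
  echo "would inject spotMarketOptions into master manifests"
fi
```

Keeping the default at 'false' means jobs that never set the variable take the skip path, so only jobs that opt in are exposed to spot interruptions.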