Skip to content

HIVE-2476: OpenStack.MachineSets(): Allow nil Replicas#8227

Merged
openshift-merge-bot[bot] merged 1 commit intoopenshift:masterfrom
2uasimojo:HIVE-2476/osp-msets-nil-replicas
Apr 5, 2024
Merged

HIVE-2476: OpenStack.MachineSets(): Allow nil Replicas#8227
openshift-merge-bot[bot] merged 1 commit intoopenshift:masterfrom
2uasimojo:HIVE-2476/osp-msets-nil-replicas

Conversation

@2uasimojo
Copy link
Member

@2uasimojo 2uasimojo commented Apr 2, 2024

A nil pointer exception occurs when the OpenStack MachineSets() func is called with Replicas==nil, as it would be when hive is attempting to generate autoscaling MachineSets -- we set the replica count later on after we find out how many MachineSets to spread the autoscaling min/max across.

Add a safeguard to default to zero in this scenario.

Needed by HIVE-2476

@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Apr 2, 2024

@2uasimojo: This pull request references HIVE-2476 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.16.0" version, but no target version was set.

Details

In response to this:

A nil pointer exception occurs when the OpenStack MachineSets() func is called with Replicas==nil, as it would be when hive is attempting to generate autoscaling MachineSets -- we set the replica count later on after we find out how many MachineSets to spread the autoscaling min/max across.

Add a safeguard to default to zero in this scenario.

Needed by HIVE-2476

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 2, 2024
@openshift-ci openshift-ci bot requested review from gryf and stephenfin April 2, 2024 21:59
@2uasimojo
Copy link
Member Author

/assign @EmilienM
/cc @pierreprinetti

Now that I'm resurrecting hive's unit tests around OpenStack MachinePools, this has come to light -- and is in fact the subject of the referenced card.

The provenance of this goes through #7380 I believe.

This one I can work around on the hive side if necessary by making Replicas = &0 rather than nil; but other providers have a similar safeguard in place, so I wanted to at least propose that we do the same here.

Copy link
Member

@pierreprinetti pierreprinetti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

It's very interesting for us to know that our code is used in ways we didn't expect. We might need to drop some assumptions tied to the Installer itself.

I have a comment, but otherwise this LGTM!

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 3, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pierreprinetti

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 3, 2024
@2uasimojo
Copy link
Member Author

It's very interesting for us to know that our code is used in ways we didn't expect. We might need to drop some assumptions tied to the Installer itself.

We have a long-standing wishlist item to get rid of our vendoring of installer entirely. It would massively reduce our git footprint as well as making ongoing maintenance way easier. Unfortunately it would be a pretty massive undertaking, and I don't have high hopes we'll actually get around to it ever :(

But yeah, in short, we call into installer code for two main things:

  • Day 2 MachinePools -- the stuff that's been coming across your desk lately.
  • Destroy. For historical reasons we don't save the whole metadata.json after install time, but rather carry specific fields through piecemeal; and then we build and run a destroyer directly from your go code. Where this tends to bite us is if installer adds metadata.json fields that are then required by the destroyer (else infra resources are leaked) -- we need to know about them and incorporate them explicitly.

@pierreprinetti
Copy link
Member

Destroy. For historical reasons we don't save the whole metadata.json after install time, but rather carry specific fields through piecemeal; and then we build and run a destroyer directly from your go code. Where this tends to bite us is if installer adds metadata.json fields that are then required by the destroyer (else infra resources are leaked) -- we need to know about them and incorporate them explicitly.

Indeed, in our CI tooling we call openshift-install destroy cluster with a homebrewed metadata.json but we haven't changed it in ages in OpenStack.

A nil pointer exception occurs when the OpenStack MachineSets() func is
called with `Replicas==nil`, as it would be when hive is attempting to
generate autoscaling MachineSets -- we set the replica count later on
after we find out how many MachineSets to spread the autoscaling min/max
across.

Add a safeguard to default to zero in this scenario.

Needed by HIVE-2476
@pierreprinetti
Copy link
Member

/lgtm
👍

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 5, 2024
@pierreprinetti
Copy link
Member

/retest-required

@openshift-ci-robot
Copy link
Contributor

/retest-required

Remaining retests: 0 against base HEAD ac3ac89 and 2 for PR HEAD eea5219 in total

@2uasimojo
Copy link
Member Author

/retest-required

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 5, 2024

@2uasimojo: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-openstack-nfv-intel eea5219 link false /test e2e-openstack-nfv-intel
ci/prow/e2e-openstack-proxy eea5219 link false /test e2e-openstack-proxy

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit 304af67 into openshift:master Apr 5, 2024
@2uasimojo 2uasimojo deleted the HIVE-2476/osp-msets-nil-replicas branch April 5, 2024 20:24
@openshift-bot
Copy link
Contributor

[ART PR BUILD NOTIFIER]

This PR has been included in build ose-installer-altinfra-container-v4.16.0-202404051443.p0.g304af67.assembly.stream.el8 for distgit ose-installer-altinfra.
All builds following this will include this PR.

2uasimojo added a commit to 2uasimojo/hive that referenced this pull request Apr 5, 2024
...to pick up openshift/installer#8227 which
should fix the nil pointer exception described in the referenced card.

HIVE-2476
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants