Contributor

imain commented Oct 2, 2019

Extend the baremetal platform fields to include provisioning information
for the pod started by the Machine API Operator. This is a work in progress
and I would appreciate feedback on this approach.

openshift-ci-robot added the do-not-merge/work-in-progress and size/M labels Oct 2, 2019
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: imain
To complete the pull request process, please assign wking
You can assign the PR to them by writing /assign @wking in a comment when ready.


Contributor Author

imain commented Oct 2, 2019

To test with dev-scripts: openshift-metal3/dev-scripts#818

The RHCOS image URL shouldn't be provided via the installConfig; it's generated from data/data/rhcos.json (e.g. see ac89074).

To add more detail, I think you'll need to use a similar approach to https://github.com/openshift/installer/blob/master/pkg/asset/cluster/tfvars.go#L77, where "github.com/openshift/installer/pkg/asset/rhcos" gets imported. You then add new(rhcos.Image) to the Dependencies of this asset, and to the parents.Get call in the Generate function.
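
The dependency wiring described here could look roughly like the sketch below. It is self-contained, so `Asset`, `Parents`, `Image`, `exampleManifest`, and `generate` are simplified stand-ins invented for illustration, not the installer's real `pkg/asset` or `rhcos` types, and the image URL is a placeholder.

```go
package main

import (
	"fmt"
	"reflect"
)

// Asset and Parents are simplified stand-ins for the installer's asset
// framework (assumption: the real types differ in detail).
type Asset interface {
	Dependencies() []Asset
	Generate(Parents) error
}

// Parents holds already-generated assets, keyed by concrete type.
type Parents map[reflect.Type]Asset

// Add records generated assets so that dependants can fetch them.
func (p Parents) Add(assets ...Asset) {
	for _, a := range assets {
		p[reflect.TypeOf(a)] = a
	}
}

// Get copies each stored asset into the matching pointer argument.
func (p Parents) Get(assets ...Asset) {
	for _, a := range assets {
		stored, ok := p[reflect.TypeOf(a)]
		if !ok {
			panic(fmt.Sprintf("asset %T was not declared as a dependency", a))
		}
		reflect.ValueOf(a).Elem().Set(reflect.ValueOf(stored).Elem())
	}
}

// Image stands in for rhcos.Image; the URL below is a placeholder.
type Image string

func (i *Image) Dependencies() []Asset { return nil }

func (i *Image) Generate(Parents) error {
	*i = "https://example.invalid/rhcos.qcow2"
	return nil
}

// exampleManifest is a hypothetical asset that needs the image URL.
type exampleManifest struct {
	ImageURL string
}

// Dependencies declares the image asset so it is generated first.
func (m *exampleManifest) Dependencies() []Asset {
	return []Asset{new(Image)}
}

// Generate reads the image from its parents instead of the install-config.
func (m *exampleManifest) Generate(parents Parents) error {
	image := new(Image)
	parents.Get(image)
	m.ImageURL = string(*image)
	return nil
}

// generate resolves direct dependencies, then generates the asset itself.
func generate(a Asset) error {
	parents := Parents{}
	for _, dep := range a.Dependencies() {
		if err := dep.Generate(parents); err != nil {
			return err
		}
		parents.Add(dep)
	}
	return a.Generate(parents)
}

func main() {
	m := &exampleManifest{}
	if err := generate(m); err != nil {
		panic(err)
	}
	fmt.Println(m.ImageURL)
}
```

The point of the pattern is that the consuming asset never sees a user-supplied URL: it only declares the image as a dependency and reads whatever that asset generated.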

As above, I don't think we should add the RHCOS URL here; instead we'll reference the platform.Image.

If this approach is acceptable, we'll need to add actual validation for all of these, similar to what I did in #2320.

Contributor Author

Yes, just a POC. The tricky part is that a lot of these values don't work with the validations, e.g. the provisioning IP is in the form of a CIDR, and the interface doesn't exist so it errors out, etc.

hardys left a comment

I think adding the install-config values is OK except for the RHCOS image, and perhaps we can default some of the *IP values based on the provisioningCIDR (but that can be a follow-up).

What I'd really like to see is how we then wire these values into the CRD or configmap for the MAO?

hardys commented Oct 3, 2019

/label platform/baremetal

openshift-ci-robot added the platform/baremetal label Oct 3, 2019

hardys commented Oct 3, 2019

OK, so overall this looks good. We'll need feedback from the core installer team about the approach of using the infrastructure manifest to provide this data to the MAO, but it seems reasonable to me.

As a next step I'd suggest we fix the RHCOS image issue mentioned, defer adding the install-config validation, and create a corresponding MAO PR which demonstrates how this data will be used, then we can ask for more detailed feedback from the core reviewers.

hardys commented Oct 3, 2019

Also we need defaults for all the new install-config variables in pkg/types/baremetal/defaults/platform.go

metal3ci commented Oct 3, 2019

Build SUCCESS, see build http://10.8.144.11:8080/job/dev-tools/1192/

metal3ci commented Oct 3, 2019

Build SUCCESS, see build http://10.8.144.11:8080/job/dev-tools/1193/

Member

What's the format of this? Something like "172.22.0.1-172.22.0.10"? For validation, it might be easier if it's two separate fields, a start and an end.

Member

stbenjam commented Oct 4, 2019

In general, the approach looks fine to me, but I think the fields need validation. The installer team is definitely going to ask for that. CIDR fields can be validated with https://golang.org/pkg/net/#ParseCIDR
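
A minimal sketch of that check, using only the standard library. The `validateCIDR` name and the sample values are assumptions made for illustration and are not taken from the installer.

```go
package main

import (
	"fmt"
	"net"
)

// validateCIDR reports an error if s is not a valid CIDR string such as
// "172.22.0.0/24". The function name is hypothetical.
func validateCIDR(s string) error {
	if _, _, err := net.ParseCIDR(s); err != nil {
		return fmt.Errorf("invalid CIDR %q: %w", s, err)
	}
	return nil
}

func main() {
	for _, s := range []string{"172.22.0.0/24", "172.22.0.1", "not-a-cidr"} {
		fmt.Printf("%-15s valid=%t\n", s, validateCIDR(s) == nil)
	}
}
```

Note that `net.ParseCIDR` also accepts an address with host bits set, such as "172.22.0.3/24", and returns both the IP and the enclosing network.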

Contributor Author

imain commented Oct 4, 2019

> In general, the approach looks fine to me, but I think the fields need validation. The installer team is definitely going to ask for that. CIDR fields can be validated with https://golang.org/pkg/net/#ParseCIDR

Awesome, thanks for that. It was the approach I wanted to check before diving in any further. Yeah, we could do a start/end for the DHCP range too and test that it's in the network CIDR.
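
One way such a start/end check might look, as a sketch only. `validateDHCPRange` and its parameters are hypothetical names, and the addresses are example values.

```go
package main

import (
	"bytes"
	"fmt"
	"net"
)

// validateDHCPRange checks that start and end are valid IPs inside cidr and
// that start does not come after end. All names here are hypothetical.
func validateDHCPRange(cidr, start, end string) error {
	_, network, err := net.ParseCIDR(cidr)
	if err != nil {
		return fmt.Errorf("invalid provisioning network CIDR %q: %w", cidr, err)
	}
	startIP, endIP := net.ParseIP(start), net.ParseIP(end)
	if startIP == nil {
		return fmt.Errorf("invalid DHCP range start %q", start)
	}
	if endIP == nil {
		return fmt.Errorf("invalid DHCP range end %q", end)
	}
	if !network.Contains(startIP) || !network.Contains(endIP) {
		return fmt.Errorf("DHCP range %s-%s is outside %s", start, end, network)
	}
	// To16 normalises IPv4 and IPv6 to the same length before comparing.
	if bytes.Compare(startIP.To16(), endIP.To16()) > 0 {
		return fmt.Errorf("DHCP range start %s is after end %s", start, end)
	}
	return nil
}

func main() {
	fmt.Println(validateDHCPRange("172.22.0.0/24", "172.22.0.10", "172.22.0.100"))
	fmt.Println(validateDHCPRange("172.22.0.0/24", "172.22.1.10", "172.22.1.100"))
}
```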

metal3ci commented Oct 4, 2019

Build SUCCESS, see build http://10.8.144.11:8080/job/dev-tools/1194/

openshift-ci-robot added the size/L label and removed the size/M label Oct 11, 2019
imain force-pushed the metal3-config branch 2 times, most recently from a4b403f to 0d1903c, October 11, 2019 04:48

I think this replaces the hard-coded default for ClusterProvisioningIP, and we should probably also calculate BootstrapProvisioningIP?
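
Calculating such defaults from the provisioning CIDR could be sketched as below. `hostAt` is a hypothetical helper, and the offsets used in `main` (2 and 3) are arbitrary illustrations, not values confirmed by this PR.

```go
package main

import (
	"fmt"
	"net"
)

// hostAt returns the n-th address counted from the network address of cidr.
// It is a hypothetical helper written for this sketch.
func hostAt(cidr string, n int) (net.IP, error) {
	_, network, err := net.ParseCIDR(cidr)
	if err != nil {
		return nil, err
	}
	if n < 0 {
		return nil, fmt.Errorf("negative host index %d", n)
	}
	base := network.IP.Mask(network.Mask)
	ip := make(net.IP, len(base))
	copy(ip, base)
	// Add n to the address byte by byte from the right, carrying overflow.
	carry := n
	for i := len(ip) - 1; i >= 0 && carry > 0; i-- {
		sum := int(ip[i]) + carry
		ip[i] = byte(sum % 256)
		carry = sum / 256
	}
	if carry > 0 || !network.Contains(ip) {
		return nil, fmt.Errorf("host index %d is outside %s", n, network)
	}
	return ip, nil
}

func main() {
	const cidr = "172.22.0.0/24" // example provisioning network
	// Offsets are illustrative assumptions only.
	bootstrapIP, _ := hostAt(cidr, 2)
	clusterIP, _ := hostAt(cidr, 3)
	fmt.Println("bootstrap:", bootstrapIP, "cluster:", clusterIP)
}
```

Deriving both addresses from one CIDR means a user only has to override the network, while explicit values in the install-config can still take precedence.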

hardys left a comment

Looks good, though I think a few changes are needed. @imain @sadasu, if you'd like I can take care of them. Would you prefer to give me push access to the fork, or shall I create a new PR from my own repo?

hardys commented Oct 24, 2019

We should also tag this as related to #2091 - we also need to either fix #2251 or update the templating of the startironic.sh script to consume these new values instead of hard-coding

This duplicates the existing ClusterProvisioningIP interface so I'll remove it

cc @sadasu let me know if you'd prefer to move this to the MAO

Contributor

I have modified https://github.com/openshift/api/pull/480/files to accept provisioningIP and provisioningNetworkCIDR as two separate config items so that we do not add the CIDR to provisioningIP. As you had mentioned earlier, this is the more intuitive way of getting the config.

@metal3ci

Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/1271/

hardys commented Oct 25, 2019

/retitle WIP: Configure Metal3 from the installer

openshift-ci-robot changed the title from "Configure Metal3 from the installer." to "WIP: Configure Metal3 from the installer" Oct 25, 2019
openshift-ci-robot added the do-not-merge/work-in-progress label Oct 25, 2019
Contributor

openshift/api#480 currently assumes that the DHCP range is provided as a comma-separated IP address pair, where the first IP address represents the start of the range and the second represents the last address in the range. The enhancement request (openshift/enhancements#90) has been raised with the same format in mind.
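
A sketch of parsing that comma-separated form, for illustration. `parseDHCPRange` is a hypothetical name, and tolerating whitespace around the comma is an assumption rather than something the API PR specifies.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// parseDHCPRange splits a "start,end" string into two IP addresses.
// The function name and the whitespace trimming are assumptions.
func parseDHCPRange(s string) (start, end net.IP, err error) {
	parts := strings.Split(s, ",")
	if len(parts) != 2 {
		return nil, nil, fmt.Errorf("expected \"start,end\", got %q", s)
	}
	start = net.ParseIP(strings.TrimSpace(parts[0]))
	end = net.ParseIP(strings.TrimSpace(parts[1]))
	if start == nil || end == nil {
		return nil, nil, fmt.Errorf("invalid IP address in DHCP range %q", s)
	}
	return start, end, nil
}

func main() {
	start, end, err := parseDHCPRange("172.22.0.10,172.22.0.100")
	fmt.Println(start, end, err)
}
```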

Ack we can join that here - there was some open discussion on the API PR regarding the comma-separated format vs using a CIDR to express the DHCP range, I'll assume we're going with the comma-separated approach for now.

@sadasu another approach to resolve the API ambiguity would be to also adopt explicit start/end parameters, I guess, but I'll update this to align with the API PR so we can proceed with testing.

Extend the baremetal platform fields to include provisioning information
for the pod started by the Machine API Operator.  Now includes defaults
and the ability to set defaults based on just the network CIDR.

Note this depends on openshift/api#480

Co-Authored-By: Steven Hardy <[email protected]>
@openshift-ci-robot
Contributor

@imain: The following tests failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-aws-disruptive | e62edfe8db377ee374c84e8260ed4ba43f424b3f | link | /test e2e-aws-disruptive |
| ci/prow/gofmt | 211cfd1 | link | /test gofmt |
| ci/prow/verify-vendor | 211cfd1 | link | /test verify-vendor |
| ci/prow/govet | 211cfd1 | link | /test govet |
| ci/prow/unit | 211cfd1 | link | /test unit |
| ci/prow/e2e-aws-scaleup-rhel7 | 211cfd1 | link | /test e2e-aws-scaleup-rhel7 |
| ci/prow/e2e-aws-upgrade | 211cfd1 | link | /test e2e-aws-upgrade |
| ci/prow/e2e-libvirt | 211cfd1 | link | /test e2e-libvirt |
| ci/prow/e2e-aws | 211cfd1 | link | /test e2e-aws |
| ci/prow/images | 211cfd1 | link | /test images |
| ci/prow/e2e-openstack | 211cfd1 | link | /test e2e-openstack |


@metal3ci

Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/1277/

	}
case baremetal.Name:
	config.Status.PlatformStatus.Type = configv1.BareMetalPlatformType
	// The MAO expects the ProvisioiningDHCPRange in comma-separated format
Contributor

Suggested change
- // The MAO expects the ProvisioiningDHCPRange in comma-separated format
+ // The MAO expects the ProvisioningDHCPRange in comma-separated format

HardwareProfile = "default"
APIVIP = ""
IngressVIP = ""
ProvisioningInterface = "ens3"
Contributor

Why should there be an interface name default?

hardys commented Dec 2, 2019

Note this is blocked on openshift/enhancements#119, since openshift/enhancements#90 and the related API change this was based on were rejected.

@openshift-ci-robot
Contributor

@imain: PR needs rebase.


hardys commented Dec 16, 2019

@imain do you have a new version of this you can share?

As mentioned in #2758, I wonder if it'd be easier to approach this as either two PRs, or at least two commits: one which adds the install-config interfaces and templating for the bootstrap-hosted ironic, and another which writes out the CR based on openshift/api#540.

I think that would simplify reviews/testing, and also enable us to fix #2091 which would be useful for some ongoing testing.

Contributor Author

imain commented Dec 17, 2019

Should have shortly. Sorry it's taking so long.

@fabianofranz
Member

Is there anything blocking this other than openshift/machine-api-operator#460?

Labels

do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
needs-rebase: Indicates a PR cannot be merged because it has merge conflicts with HEAD.
platform/baremetal: IPI bare metal hosts platform
size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.

8 participants