Commit 5dd2ca5

Add baremetal provisioning configuration to a new CR
The enhancement request details the configuration items that will be part of the new CR and the motivation for adding it.
1 parent ee0368a commit 5dd2ca5

1 file changed

+223
-0
lines changed

@@ -0,0 +1,223 @@
1+
---
2+
title: Baremetal IPI Config Enhancement
3+
authors:
4+
- "@sadasu"
5+
reviewers:
6+
- "@deads2k"
7+
- "@hardys"
8+
- "@enxebre"
9+
- "@squeed"
10+
- "@abhinavdahiya"
11+
- "@dhellmann"
12+
13+
approvers:
14+
- "@abhinavdahiya"
15+
- "@enxebre"
16+
- "@deads2k"
17+
18+
creation-date: 2019-11-14
19+
last-updated: 2019-11-26
20+
status: not implemented
21+
see-also:
22+
-
23+
replaces:
24+
- https://github.com/openshift/enhancements/pull/90
25+
superseded-by:
26+
-
27+
---
28+
29+
# Config required for Baremetal IPI deployments
30+
31+
## Release Signoff Checklist
32+
33+
- [x] Enhancement is `implementable`
34+
- [x] Design details are appropriately documented from clear requirements
35+
- [ ] Test plan is defined
36+
- [ ] Graduation criteria for dev preview, tech preview, GA
37+
- [ ] User-facing documentation is created in [openshift/docs]
38+
39+
## Open Questions
40+
41+
1. Which name is preferred for the new CR: "Metal3ProvisioningController" or
42+
"Metal3Controller"?
43+
44+
[Closed] Based on review comments it appears that "Metal3ProvisioningController"
45+
is the preferred name since it makes the contents of the config very clear.
46+
47+
2. There is a possibility that the MCO would also have to refer to this new CR
48+
created in operator scope. Would it be OK for the MCO to access the CR with this
49+
(limited) scope?
50+
51+
[Closed] MCO will be able to read config from this new CR if needed.
52+
53+
## Summary
54+
55+
The configuration representing different provisioning network parameters used
56+
for provisioning baremetal servers needs to be made available to the Machine
57+
API Operator (MAO) in a Config Resource. We had submitted an earlier proposal
58+
[1] to augment the BareMetalInfrastructureStatus CR with these config items.
59+
Based on the feedback we received, we are revising our proposal to create a new CR
60+
for these configuration items.
61+
62+
This new CR needs to be accessed by just the Installer and the Machine API
63+
Operator and hence does not need to be created with global scope. The installer
64+
would be responsible for instantiating the CR with user input values. The MAO is
65+
responsible for deploying the metal3 cluster. And the configuration values from
66+
this CR would be read by the MAO and used to generate some other configuration
67+
parameters. All of these values (user provided and MAO derived) need to be
68+
passed in as env vars to the containers that are part of the metal3 cluster
69+
before starting the containers. In most cases, the containers will not start
70+
without these init configs available as env vars.
71+
72+
For background on the work being done for BareMetal IPI installs, which provides
73+
the context for the enhancements proposed here, please refer to [2].
74+
75+
## Motivation
76+
77+
The Baremetal IPI deployments are different from the other platform types currently
78+
being supported by OpenShift in that there is no underlying cloud platform
79+
exposing an API as in public clouds, e.g. AWS, until the Baremetal Operator (BMO)
80+
along with Ironic services are run exposing an inventory of available hosts as
81+
custom resources.
82+
83+
The "metal3" pod deployed by the Machine API operator (MAO) contains the BareMetal
84+
Operator (BMO) and a provisioning service (Ironic) that are together responsible for
85+
PXE booting baremetal servers and enrolling them as BareMetal Hosts. The MAO is
86+
responsible for deploying the BMO and the Ironic containers.
87+
88+
This enhancement request proposes to add a new CR for configs required only for
89+
the provisioning service to PXE boot a baremetal server and make it available as a
90+
BareMetal host. The configurations in this baremetal provisioning CR are separate
91+
from the configuration in the BareMetalHost CR. This new CR will allow for setting
92+
some sane defaults for these configurations with the ability to override them if
93+
required. After the baremetal server is provisioned, the configurations in the
94+
BareMetalHost CR contain information to connect to the controller and also to
95+
fulfill a Machine.
96+
97+
### Goals
98+
99+
The goal of this enhancement request is to provide details about the configuration
100+
being added to a new CR used only in "metal3" deployments. This new CR would be
101+
within the operator scope and will only contain configuration required by the
102+
provisioning service to boot baremetal servers.
103+
104+
### Non-Goals
105+
106+
The provisioning network or the provisioning IP are not expected to change after
107+
deployment. But, it is possible for the DHCP range to be expanded after nodes
108+
have already been deployed. This proposal is not considering the update path for
109+
these configuration items.
110+
111+
### Proposal
112+
113+
Before BareMetal Hosts can be matched to Machines, they need to be connected to their
114+
provisioning network for them to be PXE booted and given an IP address. To make
115+
this happen, the provisioning service needs to know which NIC, provisioning network,
116+
IP address and image URLs to use to download and boot images on these servers.
117+
118+
A new CR called "Metal3ProvisioningController" would be created within the operator scope.
119+
120+
This new CR would consist of the following:
121+
122+
1. ProvisioningInterface : This is the interface name on the underlying baremetal
123+
server which would be physically connected to the provisioning network. This
124+
configuration is needed only for the underlying provisioning service (Ironic)
125+
and could have values like "eth1" or "ens3".
126+
127+
2. ProvisioningIP : This is the IP address used to bring up a NIC on the
128+
baremetal server for provisioning. This is also a value that is useful just to the
129+
provisioning service. This value should not be in the DHCP range and should not
130+
be used in the provisioning network for any other purpose (should be a free IP
131+
address.) It is expected to be provided in the format : 10.1.0.3.
132+
133+
3. ProvisioningNetworkCIDR : This is the provisioning network with its CIDR. The
134+
ProvisioningIP and the ProvisioningDHCPRange are expected to be within this network.
135+
The value for this config is expected to be provided in the format : 10.1.0.0/24.
136+
137+
4. ProvisioningDHCP : This configuration needs to convey two values: if the DHCP
138+
service needs to be managed within the cluster and if so what is the range of IP
139+
addresses that can be used. Towards that end, this configuration would be a struct
140+
with 2 members:
141+
1. ManagementType - The ManagementType is a string that can have any of
142+
the following states "Managed", "Unmanaged", "Force", "Removed". In this
143+
case, we are interested only in 2 states, "Managed" and "Removed". If the
144+
ManagementType is "Removed" it means that the metal3 cluster would not be
145+
responsible for managing DHCP addresses and an external DHCP server is
146+
expected to be available and reachable by the cluster. If the ManagementType
147+
is "Managed", then the DHCP range indicates the pool of IP addresses that
148+
can be assigned to the baremetal hosts. The ManagementType cannot be changed
149+
after installation.
150+
151+
2. DHCPRange - The DHCPRange when set, is a string which consists of a pair of
152+
comma-separated IP addresses representing the start and end of the IP address
153+
range. If unset, then the default IP address range (.10 to .100) would be
154+
used. The value of the DHCP range can be changed even after installation.
155+
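The constraints listed above (ProvisioningIP inside the network but outside the DHCP range, DHCP range inside the network) could be checked roughly as follows. This is a minimal sketch, not the installer's or the MAO's actual validation: the function names are hypothetical, and the default ".10 to .100" range is assumed here to mean offsets 10 and 100 from the network address.

```python
import ipaddress


def default_dhcp_range(network_cidr):
    """Return the assumed default range (.10 to .100) as a 'start,end' string."""
    base = ipaddress.ip_network(network_cidr).network_address
    return f"{base + 10},{base + 100}"


def validate_provisioning(provisioning_ip, network_cidr, dhcp_range=None):
    """Return a list of problems; an empty list means the values are consistent."""
    net = ipaddress.ip_network(network_cidr)
    ip = ipaddress.ip_address(provisioning_ip)
    # DHCPRange is a pair of comma-separated addresses; fall back to the default.
    start_s, end_s = (dhcp_range or default_dhcp_range(network_cidr)).split(",")
    start = ipaddress.ip_address(start_s.strip())
    end = ipaddress.ip_address(end_s.strip())

    problems = []
    if ip not in net:
        problems.append("ProvisioningIP is outside ProvisioningNetworkCIDR")
    if start not in net or end not in net:
        problems.append("DHCPRange is outside ProvisioningNetworkCIDR")
    if start > end:
        problems.append("DHCPRange start is after its end")
    if start <= ip <= end:
        problems.append("ProvisioningIP falls inside DHCPRange")
    return problems
```

With the example values used in this document, `validate_provisioning("10.1.0.3", "10.1.0.0/24")` reports no problems, since 10.1.0.3 lies below the assumed default range.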
156+
### User Stories [optional]
157+
158+
1. As a Deployment Operator, I want Baremetal IPI deployments to be customizable to
159+
hardware and network requirements.
160+
161+
2. As an OpenShift Administrator, I want Baremetal IPI deployments to take place without
162+
manual workarounds like creating a ConfigMap for the config (which is the current approach
163+
being used in 4.2 and 4.3.)
164+
165+
## Design Details
166+
167+
This new baremetal CR would be created in the "openshift-machine-api" namespace and as
168+
mentioned earlier would be in operator scope. Only one instance of this CR would be
169+
created by the installer and hence it is a singleton CR.
170+
171+
Important details of the CR:
172+
173+
- Resource name - provisioningconfig.baremetal.operator.openshift.io
174+
- Instance name - main/default
175+
- Namespace - openshift-machine-api
176+
- Version - apiextensions.k8s.io/v1
177+
178+
The new config items would be set by the installer and will be used by the MAO to
179+
generate more config items that are derivable from these basic parameters. Put
180+
together, these config parameters are passed in as environment variables to the various
181+
containers that make up a metal3 baremetal IPI deployment.
182+
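As an illustration of how user-provided values and derived values might be flattened into environment variables, consider the sketch below. The environment variable names and the single derived value (the provisioning IP with the network's prefix length attached) are assumptions made for this example only; this document does not specify which values the MAO derives or what they are called.

```python
import ipaddress


def build_env(spec):
    """Flatten a provisioning spec (a dict) into string-valued env vars."""
    net = ipaddress.ip_network(spec["ProvisioningNetworkCIDR"])
    env = {
        # User-provided values are passed through unchanged.
        "PROVISIONING_INTERFACE": spec["ProvisioningInterface"],
        "PROVISIONING_IP": spec["ProvisioningIP"],
        # Hypothetical derived value: the same address with the prefix length.
        "PROVISIONING_IP_CIDR": f'{spec["ProvisioningIP"]}/{net.prefixlen}',
    }
    dhcp = spec.get("ProvisioningDHCP", {})
    # Only export a range when DHCP is managed inside the cluster.
    if dhcp.get("ManagementType") == "Managed" and dhcp.get("DHCPRange"):
        env["DHCP_RANGE"] = dhcp["DHCPRange"]
    return env
```

When ManagementType is "Removed", no range is exported in this sketch, matching the expectation that an external DHCP server hands out addresses.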
183+
This baremetal provisioning CR contains configuration data for the provisioning services,
184+
which are not values that should be configured by the end user via BareMetalHost objects.
185+
186+
The configs described in this enhancement doc would be part of the Spec field of the CR.
187+
Only the ProvisioningDHCP.DHCPRange field can change after installation, so this will be
188+
marked as editable. All other config items will be marked as not editable.
189+
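The editable versus non-editable split could be enforced by comparing the old and new Spec on update. The following is a sketch under the assumption that the Spec is available as a nested dict keyed by the field names used in this document; the function name and the dotted-path convention are hypothetical.

```python
def rejected_changes(old_spec, new_spec, editable=("ProvisioningDHCP.DHCPRange",)):
    """Return the sorted dotted paths that changed but are not editable."""

    def flatten(d, prefix=""):
        # Turn nested dicts into {"A.B": value} pairs.
        out = {}
        for key, value in d.items():
            path = f"{prefix}{key}"
            if isinstance(value, dict):
                out.update(flatten(value, path + "."))
            else:
                out[path] = value
        return out

    old, new = flatten(old_spec), flatten(new_spec)
    return sorted(
        path
        for path in old.keys() | new.keys()
        if old.get(path) != new.get(path) and path not in editable
    )
```

An update would be accepted only when this function returns an empty list, so widening the DHCP range passes while changing ProvisioningIP is reported.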
190+
### Test Plan
191+
192+
The test plan should involve making sure the openshift/installer generates
193+
all configuration items within the BaremetalPlatformStatus when the platform
194+
type is Baremetal.
195+
196+
MAO reads this configuration and uses it to derive additional configuration
197+
required to bring up a metal3 cluster. E2e testing should make sure that MAO
198+
is able to bring up a metal3 cluster using config from this new Metal3ProvisioningController
199+
CR which has operator scope.
200+
201+
Once metal3 is up, the next level of testing should involve bringing up worker nodes.
202+
Also, testing needs to make sure we are still able to bring up worker nodes when there
203+
is an external DHCP server and we do not bring up DHCP services within the cluster.
204+
205+
Test plan should also include tests to dynamically increase the DHCP range after a
206+
metal3 cluster has been up and a few workers have come up successfully.
207+
208+
### Upgrade / Downgrade Strategy
209+
210+
Baremetal Platform type will be available for customers to use for the first
211+
time in OpenShift 4.3. And, when it is installed, it will always start as a
212+
fresh baremetal installation at least in 4.3. There is no use case where a 4.2
213+
installation would be upgraded to a 4.3 installation with Baremetal Platform
214+
support enabled.
215+
216+
To ensure a hitless upgrade from 4.3 to 4.4, the implementation in 4.4 would try to
217+
read the configuration from the new CR and the ConfigMap. If MAO is unable to find the
218+
provisioning configuration in the new CR, it will fall back to reading it from the ConfigMap.
219+
And, this decision will be made per config item and not based just on the presence of the
220+
new CR.
221+
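The per-item fallback described above can be sketched as follows. It assumes, for illustration only, that both sources are exposed as flat dicts that use the same key for each config item; the real ConfigMap keys may differ, and the function name is hypothetical.

```python
def resolve_config(cr_spec, configmap_data, keys):
    """Pick each item from the CR if set, otherwise from the ConfigMap."""
    resolved = {}
    for key in keys:
        value = cr_spec.get(key) if cr_spec else None
        if value in (None, ""):
            # Fall back per item, even when the CR itself exists.
            value = configmap_data.get(key) if configmap_data else None
        if value not in (None, ""):
            resolved[key] = value
    return resolved
```

Because the decision is made per key, a CR that sets only some items still picks up the remaining ones from the ConfigMap, and a missing CR (`None`) falls back entirely.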
222+
[1] - https://github.com/openshift/enhancements/pull/90
223+
[2] - https://github.com/openshift/enhancements/pull/102
