This repository was archived by the owner on Oct 22, 2024. It is now read-only.

Commit 3245084

operator: revise deployment status API and operator reconcile loop
Revised Deployment.Status to accommodate the deployment state conditions and driver state. Currently, a Deployment has 3 conditions named CertsVerified, CertsReady, and DriverDeployed. It also records a summary of the controller and node driver state, i.e., the number of nodes on which the driver is running.

To record the real-time status of the driver, the current reconcile loop had to be rewritten. The existing reconcile loop focused on Deployment CR changes and redeployed *only* the sub-objects that required redeployment. Instead, the new reconcile logic *refreshes* all the objects and the CR status to keep the state consistent. The refresh uses merge patches on the objects to avoid unnecessary updates.

There are two reconcile entry points:
- CR reconcile loop: refreshes all the sub-objects and the CR status
- sub-object event handler: redeploys only the deleted/changed resource and updates the CR status if required

This also includes other code cleanups encountered along the way.

FIXES: #611
1 parent 3d2367e commit 3245084

6 files changed

+1553
-690
lines changed

docs/install.md

+74-26
@@ -270,36 +270,73 @@ pmem-csi.intel.com 50s
270270

271271
$ kubectl describe deployment.pmem-csi.intel.com/pmem-csi.intel.com
272272
Name: pmem-csi.intel.com
273-
Namespace: default
273+
Namespace:
274274
Labels: <none>
275275
Annotations: <none>
276276
API Version: pmem-csi.intel.com/v1alpha1
277277
Kind: Deployment
278278
Metadata:
279-
Creation Timestamp: 2020-01-23T13:40:32Z
279+
Creation Timestamp: 2020-10-07T07:31:58Z
280280
Generation: 1
281-
Resource Version: 3596387
282-
Self Link: /apis/pmem-csi.intel.com/v1alpha1/deployments/pmem-csi.intel.com
283-
UID: 454b5961-5aa2-41c3-b774-29fe932ae236
281+
Managed Fields:
282+
API Version: pmem-csi.intel.com/v1alpha1
283+
Fields Type: FieldsV1
284+
fieldsV1:
285+
f:spec:
286+
.:
287+
f:deviceMode:
288+
f:nodeSelector:
289+
.:
290+
f:storage:
291+
Manager: kubectl-create
292+
Operation: Update
293+
Time: 2020-10-07T07:31:58Z
294+
API Version: pmem-csi.intel.com/v1alpha1
295+
Fields Type: FieldsV1
296+
fieldsV1:
297+
f:status:
298+
.:
299+
f:conditions:
300+
f:driverComponents:
301+
f:lastUpdated:
302+
f:phase:
303+
Manager: pmem-csi-operator
304+
Operation: Update
305+
Time: 2020-10-07T07:32:22Z
306+
Resource Version: 1235740
307+
Self Link: /apis/pmem-csi.intel.com/v1alpha1/deployments/pmem-csi.intel.com
308+
UID: d8635490-53fa-4eec-970d-cd4c76f53b23
284309
Spec:
285-
Controller Resources:
286-
Requests:
287-
Cpu: 200m
288-
Memory: 100Mi
289310
Device Mode: lvm
290-
Image: localhost/pmem-csi-driver:canary
291-
Node Resources:
292-
Requests:
293-
Cpu: 200m
294-
Memory: 100Mi
311+
Node Selector:
312+
Storage: pmem
295313
Status:
296-
Phase: Running
314+
Conditions:
315+
Last Update Time: 2020-10-07T07:32:00Z
316+
Reason: Driver certificates are available.
317+
Status: True
318+
Type: CertsReady
319+
Last Update Time: 2020-10-07T07:32:02Z
320+
Reason: Driver deployed successfully.
321+
Status: True
322+
Type: DriverDeployed
323+
Driver Components:
324+
Component: Controller
325+
Last Updated: 2020-10-08T07:45:13Z
326+
Reason: 1 instance(s) of controller driver is running successfully
327+
Status: Ready
328+
Component: Node
329+
Last Updated: 2020-10-08T07:45:11Z
330+
Reason: All 3 node driver pod(s) running successfully
331+
Status: Ready
332+
Last Updated: 2020-10-07T07:32:21Z
333+
Phase: Running
334+
Reason: All driver components are deployed successfully
297335
Events:
298-
Type Reason Age From Message
299-
---- ------ ---- ---- -------
300-
Normal NewDeployment 34s pmem-csi-operator Processing new driver deployment
301-
Normal Running 2s (x10 over 26s) pmem-csi-operator Driver deployment successful
302-
336+
Type Reason Age From Message
337+
---- ------ ---- ---- -------
338+
Normal NewDeployment 58s pmem-csi-operator Processing new driver deployment
339+
Normal Running 39s pmem-csi-operator Driver deployment successful
303340

304341
$ kubectl get po
305342
NAME READY STATUS RESTARTS AGE
@@ -1176,21 +1213,32 @@ active volumes.
11761213

11771214
#### DeploymentStatus
11781215

1179-
A PMEM-CSI Deployment's `status` field is a `DeploymentStatus` object, which has
1180-
a `phase` field. The phase of a Deployment is high-level summary of where the
1181-
Deployment is in it's lifecycle.
1216+
A PMEM-CSI Deployment's `status` field is a `DeploymentStatus` object, which
1217+
carries the detailed state of the driver deployment. It comprises deployment
1218+
conditions, driver component status, and a `phase` field. The phase of a
1219+
Deployment is a high-level summary of where the Deployment is in its lifecycle.
11821220

11831221
The possible `phase` values and their meaning are as below:
11841222

11851223
| Value | Meaning |
11861224
|---|---|
11871225
| empty string | A new deployment. |
1188-
| Initializing | All the direct sub-resources of the `Deployment` are created, but some indirect ones (like pods controlled by a daemon set) may still be missing. |
11891226
| Running | The operator has determined that the driver is usable<sup>1</sup>. |
1190-
| Failed | For some reason the state of the `Deployment` failed and cannot be progressed<sup>2</sup>. |
1227+
| Failed | For some reason the state of the `Deployment` failed and cannot be progressed. |
11911228

11921229
<sup>1</sup> This check has not been implemented yet. Instead, the deployment goes straight to `Running` after creating sub-resources.
1193-
<sup>2</sup> Failure reason is supposed to be carried by one of additional `DeploymentStatus` field, but not implemented yet.
1230+
1231+
#### Deployment Conditions
1232+
1233+
PMEM-CSI `DeploymentStatus` has an array of `conditions` through which the
1234+
PMEM-CSI Deployment has or has not passed. Below are the possible condition
1235+
types and their meanings:
1236+
1237+
| Condition type | Meaning |
1238+
|---|---|
1239+
| CertsReady | Driver certificates/secrets are available. |
1240+
| CertsVerified | Verified that the provided certificates are valid. |
1241+
| DriverDeployed | All the components required for the PMEM-CSI deployment have been deployed. |
11941242

11951243
#### Deployment Events
11961244
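The condition table above can be read programmatically by looking up a condition type in the `conditions` array. The following is a minimal sketch, not the operator's code: `ConditionStatus`, `DeploymentCondition`, and the helper `conditionStatus` are simplified stand-ins introduced here for illustration, and treating a missing condition as `Unknown` is an assumption.

```go
package main

import "fmt"

// ConditionStatus and DeploymentCondition are simplified stand-ins for the
// API types; they are assumptions made for this illustration.
type ConditionStatus string

const (
	ConditionTrue    ConditionStatus = "True"
	ConditionFalse   ConditionStatus = "False"
	ConditionUnknown ConditionStatus = "Unknown"
)

type DeploymentCondition struct {
	Type   string
	Status ConditionStatus
	Reason string
}

// conditionStatus is a hypothetical helper: it returns the status recorded
// for the given condition type, or Unknown if that type is not present.
func conditionStatus(conds []DeploymentCondition, condType string) ConditionStatus {
	for _, c := range conds {
		if c.Type == condType {
			return c.Status
		}
	}
	return ConditionUnknown
}

func main() {
	// Values taken from the `kubectl describe` output shown in the docs.
	conds := []DeploymentCondition{
		{Type: "CertsReady", Status: ConditionTrue, Reason: "Driver certificates are available."},
		{Type: "DriverDeployed", Status: ConditionTrue, Reason: "Driver deployed successfully."},
	}
	for _, t := range []string{"CertsReady", "CertsVerified", "DriverDeployed"} {
		fmt.Printf("%s=%s\n", t, conditionStatus(conds, t))
	}
}
```

With the sample data, `CertsVerified` is reported as `Unknown` because that condition is not in the array.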

pkg/apis/pmemcsi/v1alpha1/deployment_types.go

+191-3
@@ -55,6 +55,7 @@ const (
5555
// Related issue : https://github.com/kubernetes-sigs/controller-tools/issues/478
5656
// Fails setting min/max for integers: https://github.com/helm/helm/issues/5806
5757

58+
// +k8s:deepcopy-gen=true
5859
// DeploymentSpec defines the desired state of Deployment
5960
type DeploymentSpec struct {
6061
// Important: Run "make operator-generate-k8s" to regenerate code after modifying this file
@@ -109,13 +110,77 @@ type DeploymentSpec struct {
109110
KubeletDir string `json:"kubeletDir,omitempty"`
110111
}
111112

113+
// DeploymentConditionType type for representing a deployment status condition
114+
type DeploymentConditionType string
115+
116+
const (
117+
// CertsVerified means the provided deployment secrets are verified and valid for usage
118+
CertsVerified DeploymentConditionType = "CertsVerified"
119+
// CertsReady means secrets/certificates required for running the PMEM-CSI driver
120+
// are ready and the deployment could progress further
121+
CertsReady DeploymentConditionType = "CertsReady"
122+
// DriverDeployed means that all the sub-resources required for the deployment CR
123+
// got created
124+
DriverDeployed DeploymentConditionType = "DriverDeployed"
125+
)
126+
127+
// +k8s:deepcopy-gen=true
128+
type DeploymentCondition struct {
129+
// Type of condition.
130+
Type DeploymentConditionType `json:"type"`
131+
// Status of the condition, one of True, False, Unknown.
132+
Status corev1.ConditionStatus `json:"status"`
133+
// Reason is human-readable text that explains why the condition is in this state
134+
// +optional
135+
Reason string `json:"reason,omitempty"`
136+
// Last time the condition was probed.
137+
// +optional
138+
LastUpdateTime metav1.Time `json:"lastUpdateTime,omitempty"`
139+
}
140+
141+
type DriverType int
142+
143+
const (
144+
ControllerDriver DriverType = iota
145+
NodeDriver
146+
)
147+
148+
func (t DriverType) String() string {
149+
switch t {
150+
case ControllerDriver:
151+
return "Controller"
152+
case NodeDriver:
153+
return "Node"
154+
}
155+
return ""
156+
}
157+
158+
// +k8s:deepcopy-gen=true
159+
type DriverStatus struct {
160+
// DriverComponent represents the type of the driver: controller or node
161+
DriverComponent string `json:"component"`
162+
// Status represents the driver status : Ready, NotReady
163+
Status string `json:"status"`
164+
// Reason represents the human readable text that explains why the
165+
// driver is in this state.
166+
Reason string `json:"reason"`
167+
// LastUpdated time of the driver status
168+
LastUpdated metav1.Time `json:"lastUpdated,omitempty"`
169+
}
170+
171+
// +k8s:deepcopy-gen=true
172+
112173
// DeploymentStatus defines the observed state of Deployment
113174
type DeploymentStatus struct {
114175
// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
115176
// Important: Run "make operator-generate-k8s" to regenerate code after modifying this file
116177

117178
// Phase indicates the state of the deployment
118-
Phase DeploymentPhase `json:"phase,omitempty"`
179+
Phase DeploymentPhase `json:"phase,omitempty"`
180+
Reason string `json:"reason,omitempty"`
181+
// Conditions
182+
Conditions []DeploymentCondition `json:"conditions,omitempty"`
183+
Components []DriverStatus `json:"driverComponents,omitempty"`
119184
// LastUpdated time of the deployment status
120185
LastUpdated metav1.Time `json:"lastUpdated,omitempty"`
121186
}
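The JSON tags on the new status fields determine what appears under `status` in the CR. A minimal sketch of the resulting shape, using simplified stand-in structs (timestamps and conditions are left out, and `Phase` is a plain string here, which are simplifications and not the real types):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DriverStatus and DeploymentStatus are trimmed-down stand-ins that keep
// only the JSON tags from the diff; the field types are simplified.
type DriverStatus struct {
	DriverComponent string `json:"component"`
	Status          string `json:"status"`
	Reason          string `json:"reason"`
}

type DeploymentStatus struct {
	Phase      string         `json:"phase,omitempty"`
	Reason     string         `json:"reason,omitempty"`
	Components []DriverStatus `json:"driverComponents,omitempty"`
}

// encodeStatus marshals the status and returns it as a string.
func encodeStatus(s DeploymentStatus) (string, error) {
	b, err := json.Marshal(s)
	return string(b), err
}

func main() {
	out, err := encodeStatus(DeploymentStatus{
		Phase: "Running",
		Components: []DriverStatus{
			{DriverComponent: "Controller", Status: "Ready", Reason: "running"},
		},
	})
	if err != nil {
		panic(err)
	}
	// The empty top-level Reason is dropped because of omitempty.
	fmt.Println(out)
	// {"phase":"Running","driverComponents":[{"component":"Controller","status":"Ready","reason":"running"}]}
}
```

Note that the Go field is named `Components` while the serialized key is `driverComponents`, which matches the `Driver Components` section in the `kubectl describe` output.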
@@ -206,8 +271,6 @@ type DeploymentPhase string
206271
const (
207272
// DeploymentPhaseNew indicates a new deployment
208273
DeploymentPhaseNew DeploymentPhase = ""
209-
// DeploymentPhaseInitializing indicates deployment initialization is in progress
210-
DeploymentPhaseInitializing DeploymentPhase = "Initializing"
211274
// DeploymentPhaseRunning indicates that the deployment was successful
212275
DeploymentPhaseRunning DeploymentPhase = "Running"
213276
// DeploymentPhaseFailed indicates that the deployment was failed
@@ -259,6 +322,35 @@ func (c DeploymentChange) String() string {
259322
}[c]
260323
}
261324

325+
func (d *Deployment) SetCondition(t DeploymentConditionType, state corev1.ConditionStatus, reason string) {
326+
for i := range d.Status.Conditions {
c := &d.Status.Conditions[i] // index the slice: the range value would only be a copy
327+
if c.Type == t {
328+
c.Status = state
329+
c.Reason = reason
330+
c.LastUpdateTime = metav1.Now()
331+
return
332+
}
333+
}
334+
d.Status.Conditions = append(d.Status.Conditions, DeploymentCondition{
335+
Type: t,
336+
Status: state,
337+
Reason: reason,
338+
LastUpdateTime: metav1.Now(),
339+
})
340+
}
341+
342+
func (d *Deployment) SetDriverStatus(t DriverType, status, reason string) {
343+
if d.Status.Components == nil {
344+
d.Status.Components = make([]DriverStatus, 2)
345+
}
346+
d.Status.Components[t] = DriverStatus{
347+
DriverComponent: t.String(),
348+
Status: status,
349+
Reason: reason,
350+
LastUpdated: metav1.Now(),
351+
}
352+
}
353+
262354
// EnsureDefaults make sure that the deployment object has all defaults set properly
263355
func (d *Deployment) EnsureDefaults(operatorImage string) error {
264356
if d.Spec.Image == "" {
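In Go, the value variable of a `range` loop over a slice of structs is a copy, so assignments to it never reach the stored element. The sketch below shows the update-or-append pattern that a setter like `SetCondition` aims for, written against simplified stand-in types (`Condition`, `Status`, and `setCondition` are illustrative names, not the API types):

```go
package main

import "fmt"

// Condition and Status are simplified stand-ins used only for this sketch.
type Condition struct {
	Type   string
	Status string
	Reason string
}

type Status struct {
	Conditions []Condition
}

// setCondition updates the condition of the given type in place, or appends
// a new one if none exists. Indexing the slice (rather than assigning to the
// range value) ensures the stored element is modified, not a temporary copy.
func (s *Status) setCondition(condType, status, reason string) {
	for i := range s.Conditions {
		if s.Conditions[i].Type == condType {
			s.Conditions[i].Status = status
			s.Conditions[i].Reason = reason
			return
		}
	}
	s.Conditions = append(s.Conditions, Condition{Type: condType, Status: status, Reason: reason})
}

func main() {
	var s Status
	s.setCondition("CertsReady", "False", "waiting for secrets")
	s.setCondition("CertsReady", "True", "Driver certificates are available.")
	// One entry, carrying the latest status.
	fmt.Println(len(s.Conditions), s.Conditions[0].Status) // 1 True
}
```

Setting the same condition type twice leaves a single entry with the most recent status, which is the behavior a status consumer would expect.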
@@ -408,6 +500,78 @@ func (d *Deployment) GetHyphenedName() string {
408500
return strings.ReplaceAll(d.GetName(), ".", "-")
409501
}
410502

503+
// RegistrySecretName returns the name of the registry
504+
// Secret object used by the deployment
505+
func (d *Deployment) RegistrySecretName() string {
506+
return d.GetHyphenedName() + "-registry-secrets"
507+
}
508+
509+
// NodeSecretName returns the name of the node-controller
510+
// Secret object used by the deployment
511+
func (d *Deployment) NodeSecretName() string {
512+
return d.GetHyphenedName() + "-node-secrets"
513+
}
514+
515+
// CSIDriverName returns the name of the CSIDriver
516+
// object name for the deployment
517+
func (d *Deployment) CSIDriverName() string {
518+
return d.GetName()
519+
}
520+
521+
// ControllerServiceName returns the name of the controller
522+
// Service object used by the deployment
523+
func (d *Deployment) ControllerServiceName() string {
524+
return d.GetHyphenedName() + "-controller"
525+
}
526+
527+
// MetricsServiceName returns the name of the controller metrics
528+
// Service object used by the deployment
529+
func (d *Deployment) MetricsServiceName() string {
530+
return d.GetHyphenedName() + "-metrics"
531+
}
532+
533+
// ServiceAccountName returns the name of the ServiceAccount
534+
// object used by the deployment
535+
func (d *Deployment) ServiceAccountName() string {
536+
return d.GetHyphenedName() + "-controller"
537+
}
538+
539+
// ProvisionerRoleName returns the name of the provisioner's
540+
// RBAC Role object name used by the deployment
541+
func (d *Deployment) ProvisionerRoleName() string {
542+
return d.GetHyphenedName() + "-external-provisioner-cfg"
543+
}
544+
545+
// ProvisionerRoleBindingName returns the name of the provisioner's
546+
// RoleBinding object name used by the deployment
547+
func (d *Deployment) ProvisionerRoleBindingName() string {
548+
return d.GetHyphenedName() + "-csi-provisioner-role-cfg"
549+
}
550+
551+
// ProvisionerClusterRoleName returns the name of the
552+
// provisioner's ClusterRole object name used by the deployment
553+
func (d *Deployment) ProvisionerClusterRoleName() string {
554+
return d.GetHyphenedName() + "-external-provisioner-runner"
555+
}
556+
557+
// ProvisionerClusterRoleBindingName returns the name of the
558+
// provisioner ClusterRoleBinding object name used by the deployment
559+
func (d *Deployment) ProvisionerClusterRoleBindingName() string {
560+
return d.GetHyphenedName() + "-csi-provisioner-role"
561+
}
562+
563+
// NodeDriverName returns the name of the driver
564+
// DaemonSet object name used by the deployment
565+
func (d *Deployment) NodeDriverName() string {
566+
return d.GetHyphenedName() + "-node"
567+
}
568+
569+
// ControllerDriverName returns the name of the controller
570+
// StatefulSet object name used by the deployment
571+
func (d *Deployment) ControllerDriverName() string {
572+
return d.GetHyphenedName() + "-controller"
573+
}
574+
411575
// GetOwnerReference returns an owner reference that other objects can use
412576
// to add this deployment to their owner reference list.
413577
func (d *Deployment) GetOwnerReference() metav1.OwnerReference {
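The naming helpers in this hunk all derive sub-object names from the deployment name by replacing dots with hyphens and appending a fixed suffix. A minimal standalone sketch of that scheme, where `hyphenedName` and `subObjectNames` are illustrative functions and only a subset of the suffixes from the diff is shown:

```go
package main

import (
	"fmt"
	"strings"
)

// hyphenedName mirrors the replacement done by GetHyphenedName in the diff.
func hyphenedName(name string) string {
	return strings.ReplaceAll(name, ".", "-")
}

// subObjectNames is a hypothetical helper that collects a few of the derived
// names; the map keys are labels chosen for this sketch.
func subObjectNames(deployment string) map[string]string {
	base := hyphenedName(deployment)
	return map[string]string{
		"CSIDriver":       deployment, // the CSIDriver keeps the original name
		"StatefulSet":     base + "-controller",
		"DaemonSet":       base + "-node",
		"registry Secret": base + "-registry-secrets",
		"metrics Service": base + "-metrics",
	}
}

func main() {
	names := subObjectNames("pmem-csi.intel.com")
	for _, kind := range []string{"CSIDriver", "StatefulSet", "DaemonSet", "registry Secret", "metrics Service"} {
		fmt.Printf("%s: %s\n", kind, names[kind])
	}
}
```

For the deployment `pmem-csi.intel.com`, the controller StatefulSet is therefore `pmem-csi-intel-com-controller`, while the CSIDriver object keeps the dotted name.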
@@ -423,6 +587,30 @@ func (d *Deployment) GetOwnerReference() metav1.OwnerReference {
423587
}
424588
}
425589

590+
// HaveCertificatesConfigured checks if the configured deployment
591+
// certificate fields are valid. Returns true if valid else appropriate
592+
// error.
593+
func (d *Deployment) HaveCertificatesConfigured() (bool, error) {
594+
// Encoded private keys and certificates
595+
caCert := d.Spec.CACert
596+
registryPrKey := d.Spec.RegistryPrivateKey
597+
ncPrKey := d.Spec.NodeControllerPrivateKey
598+
registryCert := d.Spec.RegistryCert
599+
ncCert := d.Spec.NodeControllerCert
600+
601+
// sanity check
602+
if caCert == nil {
603+
if registryCert != nil || ncCert != nil {
604+
return false, fmt.Errorf("incomplete deployment configuration: missing root CA certificate by which the provided certificates are signed")
605+
}
606+
return false, nil
607+
} else if registryCert == nil || registryPrKey == nil || ncCert == nil || ncPrKey == nil {
608+
return false, fmt.Errorf("incomplete deployment configuration: certificates and corresponding private keys must be provided")
609+
}
610+
611+
return true, nil
612+
}
613+
426614
func GetDeploymentCRDSchema() *apiextensions.JSONSchemaProps {
427615
One := float64(1)
428616
Hundred := float64(100)
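The sanity check in `HaveCertificatesConfigured` has three outcomes: nothing configured (`false`, no error), everything configured (`true`, no error), and a partial configuration (error). The sketch below reproduces that decision logic on a simplified stand-in struct; `certSpec` and its field names are assumptions for illustration, and the error texts are shortened.

```go
package main

import (
	"errors"
	"fmt"
)

// certSpec is a simplified stand-in for the certificate fields of the
// deployment spec; the field names are chosen for this sketch.
type certSpec struct {
	CACert             []byte
	RegistryCert       []byte
	RegistryKey        []byte
	NodeControllerCert []byte
	NodeControllerKey  []byte
}

// haveCertificatesConfigured mirrors the branching of the method in the diff.
func haveCertificatesConfigured(s certSpec) (bool, error) {
	if s.CACert == nil {
		// Certificates without the CA that signed them cannot be used.
		if s.RegistryCert != nil || s.NodeControllerCert != nil {
			return false, errors.New("missing root CA certificate")
		}
		return false, nil
	}
	if s.RegistryCert == nil || s.RegistryKey == nil ||
		s.NodeControllerCert == nil || s.NodeControllerKey == nil {
		return false, errors.New("certificates and private keys must be provided")
	}
	return true, nil
}

func main() {
	pem := []byte("placeholder") // not real PEM data
	cases := []struct {
		name string
		spec certSpec
	}{
		{"none", certSpec{}},
		{"all", certSpec{pem, pem, pem, pem, pem}},
		{"cert without CA", certSpec{RegistryCert: pem}},
		{"CA only", certSpec{CACert: pem}},
	}
	for _, c := range cases {
		ok, err := haveCertificatesConfigured(c.spec)
		fmt.Printf("%s: configured=%v err=%v\n", c.name, ok, err)
	}
}
```

As in the diff, a spec that contains only private keys and no certificates or CA is treated as "nothing configured" rather than as an error.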
