Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 24 additions & 12 deletions pkg/operator/status.go
Original file line number Diff line number Diff line change
Expand Up @@ -28,15 +28,22 @@ func (optr *Operator) syncAvailableStatus() error {
}

optrVersion, _ := optr.vStore.Get("operator")
progressing := cov1helpers.IsStatusConditionTrue(co.Status.Conditions, configv1.OperatorProgressing)
failing := cov1helpers.IsStatusConditionTrue(co.Status.Conditions, configv1.OperatorFailing)
message := fmt.Sprintf("Cluster has deployed %s", optrVersion)

available := configv1.ConditionTrue

if failing && !progressing {
available = configv1.ConditionFalse
message = fmt.Sprintf("Cluster not available for %s", optrVersion)
}

// set available
cov1helpers.SetStatusCondition(&co.Status.Conditions, configv1.ClusterOperatorStatusCondition{
Type: configv1.OperatorAvailable, Status: configv1.ConditionTrue,
Message: fmt.Sprintf("Cluster is available at %s", optrVersion),
Type: configv1.OperatorAvailable, Status: available,
Message: message,
})
// clear progressing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this drop?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setting Available doesn't necessarily clear progressing (taken from docs https://github.com/openshift/installer/blob/master/docs/user/overview.md)

Failing is true with a detailed message Unable to apply 4.0.1: could not update 0000_70_network_deployment.yaml because the resource type NetworkConfig has not been installed on the server.
Available is true with message Cluster has deployed 4.0.0
Progressing is true with message Unable to apply 4.0.1: a required object is missing

It also looks like something other operators are doing (setting Avaiable can still be true while Failing and Progressing are also true)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other scenarios I've seen in other operators set the Conditions indipendently from the other conditions as well (which makes sense I believe)

Copy link
Contributor

@abhinavdahiya abhinavdahiya Feb 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setting Available doesn't necessarily clear progressing (taken from docs https://github.com/openshift/installer/blob/master/docs/user/overview.md)

Failing is true with a detailed message Unable to apply 4.0.1: could not update 0000_70_network_deployment.yaml because the resource type NetworkConfig has not been installed on the server.
Available is true with message Cluster has deployed 4.0.0
Progressing is true with message Unable to apply 4.0.1: a required object is missing

It also looks like something other operators are doing (setting Avaiable can still be true while Failing and Progressing are also true)

you are quoting an example when operator is progressing or failing...
when the operator is available
from https://github.com/openshift/cluster-version-operator/blob/master/docs/dev/clusteroperator.md#conditions

    Failing is false with no message
    Available is true with message Cluster has deployed 4.0.1
    Progressing is false with message Cluster version is 4.0.1

you cannot be failing, progressing and available for the same version.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should check the other Conditions I guess right

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Copy link
Member Author

@runcom runcom Feb 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anyway, the status you quoted can already converge to without my latest additions (though they're indeed needed). Progressing and Failing are set before available anyway so when we're at Avaiable, we know where we are (I think)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cov1helpers.SetStatusCondition(&co.Status.Conditions, configv1.ClusterOperatorStatusCondition{Type: configv1.OperatorProgressing, Status: configv1.ConditionFalse})
// clear failure
cov1helpers.SetStatusCondition(&co.Status.Conditions, configv1.ClusterOperatorStatusCondition{Type: configv1.OperatorFailing, Status: configv1.ConditionFalse})

co.Status.Versions = optr.vStore.GetAll()
optr.setMachineConfigPoolStatuses(&co.Status)
Expand All @@ -55,15 +62,20 @@ func (optr *Operator) syncProgressingStatus() error {
}

optrVersion, _ := optr.vStore.Get("operator")
var message string
progressing := configv1.ConditionFalse
message := fmt.Sprintf("Cluster version is %s", optrVersion)

if optr.vStore.Equal(co.Status.Versions) {
// syncing the state to existing version.
message = fmt.Sprintf("Running resync for %s", optrVersion)
if optr.inClusterBringup {
progressing = configv1.ConditionTrue
}
} else {
message = fmt.Sprintf("Progressing towards %s", optrVersion)
message = fmt.Sprintf("Working towards %s", optrVersion)
progressing = configv1.ConditionTrue
}

cov1helpers.SetStatusCondition(&co.Status.Conditions, configv1.ClusterOperatorStatusCondition{
Type: configv1.OperatorProgressing, Status: configv1.ConditionTrue,
Type: configv1.OperatorProgressing, Status: progressing,
Message: message,
})

Expand Down Expand Up @@ -91,7 +103,7 @@ func (optr *Operator) syncFailingStatus(ierr error) error {
// syncing the state to exiting version.
message = fmt.Sprintf("Failed to resync %s because: %v", optrVersion, ierr.Error())
} else {
message = fmt.Sprintf("Failed when progressing towards %s because: %v", optrVersion, ierr.Error())
message = fmt.Sprintf("Unable to apply %s: %v", optrVersion, ierr.Error())
}
// set failing condition
cov1helpers.SetStatusCondition(&co.Status.Conditions, configv1.ClusterOperatorStatusCondition{
Expand Down
4 changes: 2 additions & 2 deletions pkg/operator/sync.go
Original file line number Diff line number Diff line change
Expand Up @@ -47,15 +47,15 @@ func (optr *Operator) syncAll(rconfig renderConfig) error {
if optr.inClusterBringup {
glog.Infof("[init mode] synced %s in %v", sf.name, time.Since(startTime))
}
optr.syncProgressingStatus()
}

agg := utilerrors.NewAggregate(errs)
if agg != nil {
errs = append(errs, optr.syncFailingStatus(agg))
agg = utilerrors.NewAggregate(errs)
return fmt.Errorf("error syncing: %v", agg.Error())
} else if optr.inClusterBringup {
}
if optr.inClusterBringup {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? The intention was to stay in "initialization" stage until we'd done a successful sync.

(Not saying it's wrong, I just want to understand why you're making the change)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just golang linting, the previous if branch has a return at the end, the else branch is therefore superfluous

glog.Infof("Initialization complete")
optr.inClusterBringup = false
}
Expand Down