Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve deployment status report #142

Merged
merged 1 commit into from
Jun 11, 2024
Merged

improve deployment status report #142

merged 1 commit into from
Jun 11, 2024

Conversation

eguzki
Copy link
Contributor

@eguzki eguzki commented May 28, 2024

What

Fixes #140

After initial deployment, some changes in the limitador CR trigger pods rolling out to capture latest desired configuration. During that rolling out process, the deployment object's status condition Available remains true.

Deployment status object during steady state

availableReplicas: 1
conditions:
  - lastTransitionTime: "2024-05-28T11:14:18Z"
    lastUpdateTime: "2024-05-28T11:14:18Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2024-05-28T11:14:07Z"
    lastUpdateTime: "2024-05-28T11:23:06Z"
    message: ReplicaSet "limitador-limitador-bd768d89d" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
observedGeneration: 7
readyReplicas: 1
replicas: 1
updatedReplicas: 1

Deployment status object during pods being rolled out

availableReplicas: 1
conditions:
  - lastTransitionTime: "2024-05-28T11:14:18Z"
    lastUpdateTime: "2024-05-28T11:14:18Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2024-05-28T11:14:07Z"
    lastUpdateTime: "2024-05-28T11:23:51Z"
    message: ReplicaSet "limitador-limitador-696b84d58f" is progressing.
    reason: ReplicaSetUpdated
    status: "True"
    type: Progressing
observedGeneration: 8
readyReplicas: 1
replicas: 2
unavailableReplicas: 1
updatedReplicas: 1

Few things change during rolling out. First and foremost, unavailableReplicas field is not 0. Additionally, readyReplicas is not equal to replicas.

This PR enhances controller status reporting capturing aforementioned new conditions to report limitador's state as Ready.

Verification steps

dev setup

make local-setup

Deploy limitador with defaults.

k apply -f - <<EOF
---
apiVersion: limitador.kuadrant.io/v1alpha1
kind: Limitador
metadata:
  name: limitador
spec: {}
EOF

Wait for limitador to be ready

kubectl wait --timeout=300s --for=condition=Ready limitador limitador

Check status

kubectl get limitador limitador -o jsonpath='{.status}' | yq e -P

It should report "Ready"

conditions:
  - lastTransitionTime: "2024-05-28T12:53:05Z"
    message: Limitador is ready
    reason: Ready
    status: "True"
    type: Ready
observedGeneration: 1
service:
  host: limitador-limitador.default.svc.cluster.local
  ports:
    grpc: 8081
    http: 8080

Let's update verbosity level to trigger pods rolling out. "Quickly", we need to check Limitador CR's status to verify that status reports not ready during rolling out. Eventually, status should report "Ready".

kubectl patch limitador limitador --type=merge --patch '{"spec": {"verbosity": 2}}'

Check status

kubectl get limitador limitador -o jsonpath='{.status}' | yq e -P

During rolling out,

conditions:
  - lastTransitionTime: "2024-05-28T12:58:38Z"
    message: Deployment has unavailable replicas
    reason: LimitadorNotAvailable
    status: "False"
    type: Ready
observedGeneration: 2
service:
  host: limitador-limitador.default.svc.cluster.local
  ports:
    grpc: 8081
    http: 8080

you should get Deployment has unavailable replicas. And eventually, back to Ready state.

kubectl wait --timeout=300s --for=condition=Ready limitador limitador

@eguzki eguzki requested a review from a team May 28, 2024 12:49
@codecov-commenter
Copy link

codecov-commenter commented May 28, 2024

Codecov Report

Attention: Patch coverage is 63.63636% with 4 lines in your changes are missing coverage. Please review.

Project coverage is 84.70%. Comparing base (8df92db) to head (1bc2191).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #142      +/-   ##
==========================================
- Coverage   86.06%   84.70%   -1.36%     
==========================================
  Files          19       19              
  Lines         990      994       +4     
==========================================
- Hits          852      842      -10     
- Misses         91       98       +7     
- Partials       47       54       +7     
Flag Coverage Δ
integration 78.67% <63.63%> (-1.94%) ⬇️
unit 66.51% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
api/v1alpha1 (u) 100.00% <ø> (ø)
pkg/helpers (u) 83.87% <ø> (ø)
pkg/log (u) 94.73% <ø> (ø)
pkg/reconcilers (u) 74.67% <ø> (ø)
pkg/limitador (u) 98.11% <ø> (ø)
controllers (i) 74.33% <63.63%> (-4.39%) ⬇️
pkg/upgrades 88.88% <ø> (ø)
Files Coverage Δ
controllers/limitador_status.go 81.70% <63.63%> (-4.20%) ⬇️

... and 1 file with indirect coverage changes

@eguzki eguzki marked this pull request as ready for review May 28, 2024 13:01
@eguzki eguzki added the kind/bug Something isn't working label May 28, 2024
Comment on lines +144 to +147
if deployment.Status.ReadyReplicas != deployment.Status.Replicas {
return ptr.To("Deployment has replicas not ready yet"), nil
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts about an edge case where an user sets limitador.spec.replicas to 0, so deployment.status.replicas is 0. The current logic will mark the CR status as Ready: true, would this be the expected status for this edge case? 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point. I think we should follow the same approach as the replication controller for the Deployment object.

I set limtiador CR with replica: 0

spec:
  replicas: 0

The operator updates the deployment object to:

spec:
  replicas: 0
  ....

And the status of the deployment is:

status:
  conditions:
  - lastTransitionTime: "2024-06-10T08:26:11Z"
    lastUpdateTime: "2024-06-10T08:26:11Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2024-06-10T08:25:51Z"
    lastUpdateTime: "2024-06-10T08:26:11Z"
    message: ReplicaSet "limitador-limitador-76c6d7bd46" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 2

Note that as status.availableReplicas, status.readyReplicas, status.replicas are all 0, they are not shown in the response (considered "empty").

Yet, the status condition Available is True. Thus, we can account that as "available".

Thus, IMO, limitador CR status should also be considered "Ready". I agree, though, that "Ready" does not reflect the actual state. It would be more meaningful, "Reconciled" or "Stable" or something that means, limitador is what it should be, what the user wanted to be. When the replicas are 0, I can leave "Ready" to "True", and the message would be "Limitador is ready, but without running instances". (or something like that).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, sounds good 👍

Copy link
Contributor

@KevFan KevFan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified, looks good to me 👍

@eguzki eguzki merged commit 9187002 into main Jun 11, 2024
14 checks passed
@eguzki eguzki deleted the improve-status-report branch June 11, 2024 09:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Limitador CR shows Ready State while Deployment is still rolling out changes
3 participants