Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Promote STS minReadySeconds to beta #2824

Conversation

ravisantoshgudimetla
Copy link
Contributor

  • One-line PR description:
    Promote STS minReadySeconds to beta
  • Other comments:

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 14, 2021
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/apps Categorizes an issue or PR as relevant to SIG Apps. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 14, 2021
@ravisantoshgudimetla
Copy link
Contributor Author

/sig apps

@soltysh

Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 19, 2021
@soltysh
Copy link
Contributor

soltysh commented Aug 19, 2021

@@ -1,3 +1,30 @@
kep-number: 2599
alpha:
approver: "@ehashman"
owning-sig: sig-apps
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these changes are not necessary, not for PRR metadata file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TY. Updated :)

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 19, 2021
@ravisantoshgudimetla ravisantoshgudimetla force-pushed the add-minReadySeconds-beta branch 3 times, most recently from b611fb8 to 4d51bc1 Compare August 19, 2021 16:48
Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 19, 2021
@@ -438,6 +446,8 @@ Ideally, this should be a metric. Operations against the Kubernetes API (e.g.,
checking if there are objects with field X set) may be a last resort. Avoid
logs or events for this purpose.
-->
By checking the StatefulSets's `.status.AvailableReplicas` field. If that field is populated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this set in cases when minreadyseconds==0?

they could count how many statefulsets are doing that directly, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh you mean, I should set the value to be > 0?

@@ -463,6 +473,7 @@ high level (needs more precise definitions) those may be things like:
job creation time) for cron job <= 10%
- 99,9% of /health requests per day finish with 200 code
-->
All the `Available` pods created should be more than the time specified in `.spec.minReadySeconds` 99.99% of the time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how can a cluster-admin know this is happening? Or more generally, if the statefulset controller starts failing, is there a metric that tracks errors from the statefulset controller that could be alerted upon if it starts failing very often?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the above metric.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear how to track this from the description of the above metric. Please be more specific.

Furthermore, I don't think this SLO is appropriate. There's nothing in the KEP that suggests this is engineered for (or that we can accurately measure) 99.99% correctness.

I think it's reasonable to think about expected latencies and correctness here, but 99.99% is a very stringent target and this SLO as written doesn't seem universal.

You may want to take a look at the updated question template: https://github.com/kubernetes/enhancements/blame/master/keps/NNNN-kep-template/README.md#L512-L513

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the SLO to 99% instead of 4 9's. I also added a sentence in the previous question on how the metric can be used to determine correctness of the functionality.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 1, 2021
@gracenng
Copy link
Member

gracenng commented Sep 2, 2021

Hi there 👋 1.23 enhancement shadow here.

PRR has not been approved for this enhancement stage and the implementable portion has not been checked in README.md. The current status of the enhancement is at risk. Enhancement freeze is Sept 9th.

Thanks

Copy link
Member

@ehashman ehashman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @deads2k for taking a first pass at this while I was out of the office. I have some questions/feedback, the PRR questionnaire as filled out is lacking some detail.


###### What specific metrics should inform a rollback?

<!--
What signals should users be paying attention to when the feature is young
that might indicate a serious problem?
-->
`minReadySeconds` in StatefulSet doesn't get respected and all the `Ready` pods would be shown as `Available`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a metric; how does one measure this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By looking at the logs. I tried explaining it below or should we always use a metric? I thought looking at the logs is a signal that cluster-admin or developer can use to check if the feature is working or not.

the number of StatefulSets without `AvailableReplicas` growing overtime which can be used by
cluster-admin to track th failures. We also a metric called `kube_statefulset_status_replicas_available`
which we added recently to track the number of available replicas. The cluster-admin could use
this metric to track the problems.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would they use this metric? What are acceptable and unacceptable values for it?

@@ -438,19 +450,20 @@ Ideally, this should be a metric. Operations against the Kubernetes API (e.g.,
checking if there are objects with field X set) may be a last resort. Avoid
logs or events for this purpose.
-->
By checking the `kube_statefulset_status_replicas_available` metric.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if the metric exists, the feature is in use? Please specify what the user needs to check.

- Details:
- Components exposing the metric: StatefulSet controller via kube_state_metrics

The `kube_statefulset_status_replicas_available` gives the number of replicas available.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give a little more detail on the metric labels? Will a timeseries exist for every workload that this is enabled for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We use labels to identify the individual workload and timeseries would exist for it.

@@ -463,6 +473,7 @@ high level (needs more precise definitions) those may be things like:
job creation time) for cron job <= 10%
- 99,9% of /health requests per day finish with 200 code
-->
All the `Available` pods created should be more than the time specified in `.spec.minReadySeconds` 99.99% of the time.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear how to track this from the description of the above metric. Please be more specific.

Furthermore, I don't think this SLO is appropriate. There's nothing in the KEP that suggests this is engineered for (or that we can accurately measure) 99.99% correctness.

I think it's reasonable to think about expected latencies and correctness here, but 99.99% is a very stringent target and this SLO as written doesn't seem universal.

You may want to take a look at the updated question template: https://github.com/kubernetes/enhancements/blame/master/keps/NNNN-kep-template/README.md#L512-L513

@@ -589,6 +604,10 @@ details). For now, we leave it here.

###### How does this feature react if the API server and/or etcd is unavailable?

This feature will not work if the API server or etcd is unavailable as the controller-manager won't be even able get events or updates for StatefulSets.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this feature have a dependency on events? They are lossy, that doesn't seem right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None. I meant the update/create events not the events that individual components create.

@@ -589,6 +604,10 @@ details). For now, we leave it here.

###### How does this feature react if the API server and/or etcd is unavailable?

This feature will not work if the API server or etcd is unavailable as the controller-manager won't be even able get events or updates for StatefulSets.
If the API server and/or etcd is unavailable during the mid-rollout, the featuregate may be enabled but it won't have any effect on the StatefulSet as
the controller-manager cannot communicate with the API server
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also doesn't sound right; if the field is present on the object that the controller manager is working on, why would this stop working?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It needs to do an update on the status subresource of the STS object which is not possible without communicating with kube-apiserver

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest simple statement like:

The controller won't be able to progress, all currently queued resources are re-queued. This feature does not change current behavior of the controller in this regard.

Copy link
Contributor Author

@ravisantoshgudimetla ravisantoshgudimetla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the review @ehashman. I addressed your comments and added few more sentences which can clarify things. PTAL


###### What specific metrics should inform a rollback?

<!--
What signals should users be paying attention to when the feature is young
that might indicate a serious problem?
-->
`minReadySeconds` in StatefulSet doesn't get respected and all the `Ready` pods would be shown as `Available`.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By looking at the logs. I tried explaining it below or should we always use a metric? I thought looking at the logs is a signal that cluster-admin or developer can use to check if the feature is working or not.

- Details:
- Components exposing the metric: StatefulSet controller via kube_state_metrics

The `kube_statefulset_status_replicas_available` gives the number of replicas available.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We use labels to identify the individual workload and timeseries would exist for it.

@@ -589,6 +604,10 @@ details). For now, we leave it here.

###### How does this feature react if the API server and/or etcd is unavailable?

This feature will not work if the API server or etcd is unavailable as the controller-manager won't be even able get events or updates for StatefulSets.
If the API server and/or etcd is unavailable during the mid-rollout, the featuregate may be enabled but it won't have any effect on the StatefulSet as
the controller-manager cannot communicate with the API server
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It needs to do an update on the status subresource of the STS object which is not possible without communicating with kube-apiserver

@@ -589,6 +604,10 @@ details). For now, we leave it here.

###### How does this feature react if the API server and/or etcd is unavailable?

This feature will not work if the API server or etcd is unavailable as the controller-manager won't be even able get events or updates for StatefulSets.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None. I meant the update/create events not the events that individual components create.

@@ -463,6 +473,7 @@ high level (needs more precise definitions) those may be things like:
job creation time) for cron job <= 10%
- 99,9% of /health requests per day finish with 200 code
-->
All the `Available` pods created should be more than the time specified in `.spec.minReadySeconds` 99.99% of the time.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the SLO to 99% instead of 4 9's. I also added a sentence in the previous question on how the metric can be used to determine correctness of the functionality.

the number of StatefulSets without `AvailableReplicas` growing overtime which can be used by
cluster-admin to track th failures. We also have a metric called `kube_statefulset_status_replicas_available`
which we added recently to track the number of available replicas. The cluster-admin could use
this metric to track the problems. If the value is immediately equal to the value of `Ready` replicas or if it is `0`, it can be considered as a feature failure.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ravisantoshgudimetla let's focus just on the metrics, it'll be simpler and the question goal is to help cluster-admin identify problems with the feature (when enabled) and not necessarily regular users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed it.

- Components exposing the metric:
- [ ] Other (treat as last resort)
- Details:
- Components exposing the metric: StatefulSet controller via kube_state_metrics
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kube-controller-manager is the component

- [ ] Metrics
- Metric name:
- [x] Metrics
- Metric name: `kube_statefulset_status_replicas_available`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe availability ratio will be a better option for this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can get the same ratio number using ready and available metrics. I don't have a strong opinion one or the other. Having said that, the current metrics are coming from kube_state_metrics and if we decide to go with this metric, we have to add the same metric for all the controllers. We can also revisit this during implementation.

@@ -589,6 +604,10 @@ details). For now, we leave it here.

###### How does this feature react if the API server and/or etcd is unavailable?

This feature will not work if the API server or etcd is unavailable as the controller-manager won't be even able get events or updates for StatefulSets.
If the API server and/or etcd is unavailable during the mid-rollout, the featuregate may be enabled but it won't have any effect on the StatefulSet as
the controller-manager cannot communicate with the API server
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest simple statement like:

The controller won't be able to progress, all currently queued resources are re-queued. This feature does not change current behavior of the controller in this regard.

Copy link
Contributor Author

@ravisantoshgudimetla ravisantoshgudimetla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the feedback @soltysh. I included the changes you suggested. PTAL

the number of StatefulSets without `AvailableReplicas` growing overtime which can be used by
cluster-admin to track th failures. We also have a metric called `kube_statefulset_status_replicas_available`
which we added recently to track the number of available replicas. The cluster-admin could use
this metric to track the problems. If the value is immediately equal to the value of `Ready` replicas or if it is `0`, it can be considered as a feature failure.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed it.

- [ ] Metrics
- Metric name:
- [x] Metrics
- Metric name: `kube_statefulset_status_replicas_available`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can get the same ratio number using ready and available metrics. I don't have a strong opinion one or the other. Having said that, the current metrics are coming from kube_state_metrics and if we decide to go with this metric, we have to add the same metric for all the controllers. We can also revisit this during implementation.

Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 9, 2021
Copy link
Member

@ehashman ehashman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve
for PRR

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ehashman, ravisantoshgudimetla, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 9, 2021
@k8s-ci-robot k8s-ci-robot merged commit 851990a into kubernetes:master Sep 9, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Sep 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/apps Categorizes an issue or PR as relevant to SIG Apps. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants