@@ -32,13 +39,17 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
 - [x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
 - [x] (R) KEP approvers have approved the KEP status as `implementable`
 - [x] (R) Design details are appropriately documented
-- [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
-- [x] (R) Graduation criteria is in place
+- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
+  - [ ] e2e Tests for all Beta API Operations (endpoints)
+  - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
+  - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
+- [ ] (R) Graduation criteria is in place
+  - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
 - [ ] (R) Production readiness review completed
-- [ ] Production readiness review approved
+- [ ] (R) Production readiness review approved
 - [ ] "Implementation History" section is up-to-date for milestone
 - [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
-- [ ] Supporting documentation, e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
+- [ ] Supporting documentation, e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes


 ## Summary
@@ -122,7 +133,6 @@ API changes to Service:
 Unit tests:
 - unit tests for the ipvs and iptables rules
 - unit tests for the validation
-- unit tests for a new util in pkg/proxy

 E2E tests:
 - The default behavior for `ipMode` does not break any existing e2e tests
@@ -149,3 +159,284 @@ On downgrade, the feature gate will simply be disabled, and as long as `kube-pro
 ### Version Skew Strategy

 Version skew from the control plane to `kube-proxy` should be trivial since `kube-proxy` will simply ignore the `ipMode` field.
+
+## Production Readiness Review Questionnaire
+
+### Feature Enablement and Rollback
+
+###### How can this feature be enabled / disabled in a live cluster?
+
+- [x] Feature gate (also fill in values in `kep.yaml`)
+  - Feature gate name: LoadBalancerIPMode
+  - Components depending on the feature gate: kube-proxy, kube-apiserver
+
+###### Does enabling the feature change any default behavior?
+
+No.
+
+###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
+
+Yes.
+
+###### What happens if we reenable the feature if it was previously rolled back?
+
+kube-proxy will remove the forwarding rules for Services whose `ipMode` is set to "Proxy".
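
To illustrate the answer above, here is a minimal Python sketch of the decision, not kube-proxy's actual Go code. The `Ingress` class and the `ips_needing_rules` and `rules_to_remove` helpers are hypothetical names introduced only for this example. The sketch assumes that an unset `ipMode` behaves like the default "VIP" mode and that `ipMode` is consulted only while the `LoadBalancerIPMode` gate is enabled.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Ingress:
    """Hypothetical stand-in for one load balancer ingress entry of a Service."""
    ip: str
    ip_mode: Optional[str] = None  # None models an unset ipMode (assumed to act like "VIP")


def ips_needing_rules(ingress: List[Ingress], gate_enabled: bool) -> Set[str]:
    """Return the load balancer IPs that should have forwarding rules."""
    if not gate_enabled:
        # Gate disabled: ipMode is ignored, so every IP keeps its rules.
        return {entry.ip for entry in ingress}
    # Gate enabled: IPs in "Proxy" mode get no forwarding rules.
    return {entry.ip for entry in ingress if entry.ip_mode != "Proxy"}


def rules_to_remove(installed: Set[str], ingress: List[Ingress], gate_enabled: bool) -> Set[str]:
    """Return the installed IPs whose forwarding rules are no longer wanted."""
    return installed - ips_needing_rules(ingress, gate_enabled)


# Example addresses come from the documentation range 192.0.2.0/24.
ingress = [Ingress("192.0.2.10", "Proxy"), Ingress("192.0.2.11")]
installed = ips_needing_rules(ingress, gate_enabled=False)  # state after a rollback
print(sorted(rules_to_remove(installed, ingress, gate_enabled=True)))  # prints ['192.0.2.10']
```

In this model, re-enabling the gate after a rollback removes only the rules for the "Proxy" address, while the address with an unset `ipMode` keeps its rules.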
+
+###### Are there any tests for feature enablement/disablement?
+
+Yes. Unit tests and an integration test cover enabling and disabling this feature.
+
+### Rollout, Upgrade and Rollback Planning
+
+<!--
+This section must be completed when targeting beta to a release.
+-->
+
+###### How can a rollout or rollback fail? Can it impact already running workloads?
+
+<!--
+Try to be as paranoid as possible - e.g., what if some components will restart
+mid-rollout?
+
+Be sure to consider highly-available clusters, where, for example,
+feature flags will be enabled on some API servers and not others during the
+rollout. Similarly, consider large clusters and how enablement/disablement
+will roll out across nodes.
+-->
+
+###### What specific metrics should inform a rollback?
+
+<!--
+What signals should users be paying attention to when the feature is young
+that might indicate a serious problem?
+-->
+
+###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
+
+<!--
+Describe manual testing that was done and the outcomes.
+Longer term, we may want to require automated upgrade/rollback tests, but we
+are missing a bunch of machinery and tooling and can't do that now.
+-->
+
+###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
+
+<!--
+Even if applying deprecation policies, they may still surprise some users.
+-->
+
+### Monitoring Requirements
+
+<!--
+This section must be completed when targeting beta to a release.
+
+For GA, this section is required: approvers should be able to confirm the
+previous answers based on experience in the field.
+-->
+
+###### How can an operator determine if the feature is in use by workloads?
+
+<!--
+Ideally, this should be a metric. Operations against the Kubernetes API (e.g.,
+checking if there are objects with field X set) may be a last resort. Avoid
+logs or events for this purpose.
+-->
+
+###### How can someone using this feature know that it is working for their instance?
+
+<!--
+For instance, if this is a pod-related feature, it should be possible to determine if the feature is functioning properly
+for each individual pod.
+Pick one or more of these and delete the rest.
+Please describe all items visible to end users below with sufficient detail so that they can verify correct enablement
+and operation of this feature.
+Recall that end users cannot usually observe component logs or access metrics.
+-->
+
+- [ ] Events
+  - Event Reason:
+- [ ] API .status
+  - Condition name:
+  - Other field:
+- [ ] Other (treat as last resort)
+  - Details:
+
+###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
+
+<!--
+This is your opportunity to define what "normal" quality of service looks like
+for a feature.
+
+It's impossible to provide comprehensive guidance, but at the very
+high level (needs more precise definitions) those may be things like:
+  - per-day percentage of API calls finishing with 5XX errors <= 1%
+  - 99th percentile over day of absolute value from (job creation time minus expected
+    job creation time) for cron job <= 10%
+  - 99.9% of /health requests per day finish with 200 code
+
+These goals will help you determine what you need to measure (SLIs) in the next
+question.
+-->
+
+###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
+
+<!--
+Pick one or more of these and delete the rest.
+-->
+
+- [ ] Metrics
+  - Metric name:
+  - [Optional] Aggregation method:
+  - Components exposing the metric:
+- [ ] Other (treat as last resort)
+  - Details:
+
+###### Are there any missing metrics that would be useful to have to improve observability of this feature?
+
+<!--
+Describe the metrics themselves and the reasons why they weren't added (e.g., cost,
+implementation difficulties, etc.).
+-->
+
+### Dependencies
+
+<!--
+This section must be completed when targeting beta to a release.
+-->
+
+###### Does this feature depend on any specific services running in the cluster?
+
+<!--
+Think about both cluster-level services (e.g. metrics-server) as well
+as node-level agents (e.g. specific version of CRI). Focus on external or
+optional services that are needed. For example, if this feature depends on
+a cloud provider API, or upon an external software-defined storage or network
+control plane.
+
+For each of these, fill in the following, thinking about running existing user workloads
+and creating new ones, as well as about cluster-level services (e.g. DNS):
+  - [Dependency name]
+    - Usage description:
+      - Impact of its outage on the feature:
+      - Impact of its degraded performance or high error rates on the feature:
+-->
+
+### Scalability
+
+<!--
+For alpha, this section is encouraged: reviewers should consider these questions
+and attempt to answer them.
+
+For beta, this section is required: reviewers must answer these questions.
+
+For GA, this section is required: approvers should be able to confirm the
+previous answers based on experience in the field.
+-->
+
+###### Will enabling / using this feature result in any new API calls?