@@ -149,7 +149,7 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
149149## Summary
150150
151151When a cluster has multiple apiservers at mixed versions (such as during an
152- upgrade or downgrate ), not every apiserver can serve every resource at every
152+ upgrade or downgrade), not every apiserver can serve every resource at every
153153version.
154154
155155To fix this, we will add a filter to the handler chain in the aggregator which
@@ -189,19 +189,20 @@ incorrectly or objects being garbage collected mistakenly.
189189
190190## Proposal
191191
192- API change:
193- * To the apiservices API, add an "alternates" clause, a list of
194- apiservers which believe they can serve the group-version.
192+ We will use the existing `StorageVersion` API to determine which groups, versions,
193+ and resources each apiserver can serve.
194+
195195
196196API server change:
197- * A controller adds the apiserver to the list of alternates for its built-in
198- group-versions.
199- * The same controller removes expired apiservers from the list. (Enabled by the
200- apiserver identity work.)
201197* A new handler is added to the stack:
202- - If the request is for a group/version the apiserver doesn't have locally (we
203- can use the StorageVersion API), it will proxy the request to one of the
204- alternates instead.
198+
199+ - If the request is for a group/version/resource the apiserver doesn't have
200+ locally (we can use the StorageVersion API), it will proxy the request to
201+ one of the apiservers listed in the StorageVersion object. If that apiserver
202+ fails to respond or is not available, we will return a 503 (there is a small
203+ possibility of a race between the controller registering the apiserver
204+ with the resources it can serve and the apiserver receiving a request for a
205+ resource that is not yet available on it).
205206
206207### User Stories (Optional)
207208
@@ -257,8 +258,6 @@ TODO: security / cert stuff.
257258
258259## Design Details
259260
260- TODO: specific API change(s)
261-
262261TODO: explanation of how the handler will determine a request is for a resource
263262that should be proxied.
264263
@@ -269,6 +268,24 @@ TODO: explanation of how the security handshake between apiservers works.
269268* generate self-signed cert on startup, put pubkey in apiserver identity lease
270269 object?
271270
271+ ### Unresolved (how we will make discovery consistent)
272+
273+ One option is to route discovery requests from old apiservers to the newest apiserver,
274+ so that all discovery responses reflect the newest one. We specifically rule out
275+ merging discovery docs, because merging discovery:
276+
277+ * is complicated
278+ * represents an intermediate state which may not even make sense
279+ * solves problems (i.e. preventing orphaned objects) that can already be solved
280+ by the dynamic feature flag KEP, so solving them here would be redundant and
281+ unnecessarily complex.
282+
283+ By routing all discovery requests to the newest apiserver, we can ensure that namespace and gc
284+ controllers do what they would be doing if the upgrade happened instantaneously.
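
As a rough sketch of the "route to the newest apiserver" option, the selection step might look like the following. The `apiserver` type, its `Major`/`Minor` fields, and the `discoveryTarget` helper are hypothetical; how versions would actually be learned (for example from the apiserver identity lease) is not specified here and is assumed to be available.

```go
package main

import "fmt"

// apiserver is a hypothetical record of an apiserver and its version.
type apiserver struct {
	ID           string
	Major, Minor int
}

// newer reports whether a has a strictly higher version than b.
func newer(a, b apiserver) bool {
	if a.Major != b.Major {
		return a.Major > b.Major
	}
	return a.Minor > b.Minor
}

// discoveryTarget returns the ID of the apiserver that should answer a
// discovery request received by self. It returns self.ID when no peer is
// strictly newer, so the newest apiserver always serves its own discovery.
func discoveryTarget(self apiserver, peers []apiserver) string {
	best := self
	for _, p := range peers {
		if newer(p, best) {
			best = p
		}
	}
	return best.ID
}

func main() {
	old := apiserver{ID: "a", Major: 1, Minor: 27}
	peers := []apiserver{{ID: "b", Major: 1, Minor: 28}, {ID: "c", Major: 1, Minor: 27}}
	fmt.Println(discoveryTarget(old, peers)) // prints "b"
}
```

Ties are resolved in favor of the receiving apiserver in this sketch, which avoids proxying between apiservers at the same version.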
285+
286+ Alternatively, we can use the storage version objects to reconstruct a merged discovery
287+ document and serve that in all apiservers.
288+
272289
273290### Test Plan
274291