Create per-object sequence number and report last value seen in status of each object #7328
Is reporting sufficient, or do you need to infer order between versions? I think to do what you want, you need to be able to sort resource versions to know whether you have seen a more recent value, which we have been hesitant to allow since they are supposed to be opaque.
Good point. It may be time to introduce a per-object sequence number.
@bgrant0607 @derekwaynecarr if we add a per-object sequence, will we use

@pmorie We need to tease apart the different uses:
For this to be useful I think we need order.
Resource version is generated by etcd, so we can't rely on different members/shards of the cluster giving you an ever-increasing resource version (and non-etcd datastores won't have one at all). My understanding of sequence numbers is that, as long as the database is consistent, the number is part of the data, so it should always be whatever we wrote, across all members.
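A minimal sketch of that distinction, with assumed names (none of these are real Kubernetes types): because the sequence number is part of the written value rather than an index the store generates, any consistent datastore replicates it along with the rest of the object.

```go
// Sketch only: the sequence number travels with the write, so it is the
// same on every replica, unlike a store-generated resource version.
package main

import (
	"fmt"
	"reflect"
)

type Object struct {
	Generation int64             // stored as part of the object's data
	Spec       map[string]string // desired state
}

// prepareForUpdate bumps the sequence number only when the desired
// state actually changes.
func prepareForUpdate(old, updated *Object) {
	updated.Generation = old.Generation
	if !reflect.DeepEqual(old.Spec, updated.Spec) {
		updated.Generation = old.Generation + 1
	}
}

func main() {
	old := &Object{Generation: 1, Spec: map[string]string{"replicas": "3"}}
	updated := &Object{Spec: map[string]string{"replicas": "5"}}
	prepareForUpdate(old, updated)
	fmt.Println(updated.Generation) // 2
}
```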
Also worth noting the proposed v3 etcd API, which distinguishes "index" from per-object "version": https://github.com/coreos/etcd/blob/master/Documentation/rfc/v3api.proto#L172
Responding to some questions from #9739 in this more general issue.

With respect to the name of the sequence field: etcd v3 uses "version", and internally we typically use "version" or "generation" (in various APIs -- Borg and Chubby use the latter). I'm somewhat partial to "generation", since it's what I'm used to and since "version" is already fairly overloaded.

Re. metadata vs. spec: The sequence number implemented here and discussed in #7328 should not be incremented when updating the status. Furthermore, as discussed in #2726 and #8625, sometime soon we're going to need to move status to a separate key in etcd. metadata should remain with spec, since at least namespace/name, labels, and deletion-related fields affect the desired state, and annotations typically reflect additional information about the desired state; the sequence number should be incremented upon changes to those fields as well.

In general, such a sequence number needs to exist for each sub-part of the object that we'd like to update independently. For instance, Chubby has separate generation numbers for the payload, ACLs, and lock: http://static.googleusercontent.com/media/research.google.com/en/us/archive/chubby-osdi06.pdf. I could imagine that controllers and the client cache might want a sequence number on status, in addition to the one covering spec and metadata, but that's much less critical and could be added later by prefixing the new field's name with a qualifier (e.g., "statusGeneration"), so I'm inclined to ignore that for now.

Putting the field in metadata makes the most sense to me because:
The name of the corresponding field in status should clarify that it's the most recent generation that has been observed by the responsible controller, such as observedGeneration or enactingGeneration.

Re. multiple entities updating status: There should be a single component that is primarily responsible for reifying the desired state. Only that component should update the observedGeneration. In cases like the node controller, where another component fills in part of the status when the primary component is unresponsive, that backup component should leave the observedGeneration unchanged.

If there really were independently updated sub-parts of status reflecting the degree to which the desired state had been acknowledged, those sub-parts should each have their own observedGeneration fields. This applies to components with internal concurrent processes as well. An update to observedGeneration should imply that the responsible component is no longer working towards previously specified desired states.
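A sketch of the single-writer convention described above; the type and field names are illustrative assumptions, not real Kubernetes API types.

```go
// Sketch: only the primary controller writes ObservedGeneration, and
// only after it has acted on the corresponding spec.
package main

import "fmt"

type FooStatus struct {
	ObservedGeneration int64 // written only by the primary controller
	Message            string
}

type Foo struct {
	Generation int64 // bumped on spec/metadata changes
	Spec       string
	Status     FooStatus
}

// reconcile is what the single responsible controller runs: enact the
// desired state, then record which generation it enacted. A backup
// component filling in other status fields (e.g. Message) would leave
// ObservedGeneration untouched.
func reconcile(obj *Foo) {
	// ... drive the world toward obj.Spec ...
	obj.Status.ObservedGeneration = obj.Generation
}

func main() {
	obj := Foo{Generation: 2, Spec: "replicas=3"}
	reconcile(&obj)
	fmt.Println(obj.Status.ObservedGeneration) // 2
}
```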
What about TPR? I have a controller which needs this mechanism so that external clients can tell if it has seen the spec update and the current status reflects the updated spec.
This is out for CRD in 1.7. I recommend a specific issue to add it, to make it easier to schedule.

@Kargakis specific types should have specific issues. It's special for each one at the moment, so I think it's a type-owner activity, not general machinery. Removing milestone.
The reality is that it's the same for all controller-type objects (or at least all handled cases do it in the same way), but we are left handling it on a case-by-case basis. One issue we should solve with ObservedGeneration that warrants case-by-case handling is #25170 (though I still suspect the solution to it will need to be applied holistically, because every core controller will likely end up reusing it).
It seems like each one ends up as a snowflake: "bump my generation when spec changes unless it's one of these fields". It ends up looking like a strategy, and that ends up back where we are today.
For special spec fields, we need special Observed* status fields. ObservedGeneration is meant to cover the whole Spec AFAIK.
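A hypothetical sketch of that split: observedGeneration covers the spec as a whole, while a spec field deliberately excluded from generation bumps gets its own Observed* counterpart. All names here are illustrative assumptions, not real API fields.

```go
// Sketch: a whole-spec generation check plus a per-field observed check
// for a field that (hypothetically) does not bump the generation.
package main

import "fmt"

type Spec struct {
	Replicas int  // covered by generation
	Paused   bool // imagine this is special-cased out of generation bumps
}

type Status struct {
	ObservedGeneration int64
	ObservedPaused     bool // special spec field, special observed field
}

// upToDate holds only when both the general and the special observed
// values have caught up with the spec.
func upToDate(gen int64, spec Spec, st Status) bool {
	return st.ObservedGeneration >= gen && st.ObservedPaused == spec.Paused
}

func main() {
	spec := Spec{Replicas: 2, Paused: true}
	st := Status{ObservedGeneration: 3, ObservedPaused: false}
	fmt.Println(upToDate(3, spec, st)) // false: Paused not yet observed
}
```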
Issues go stale after 90d of inactivity. Prevent issues from auto-closing with an `/lifecycle frozen` comment. If this issue is safe to close now, please do so with `/close`. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale

I would still like generation/observedGeneration to be implemented for all relevant resources.
Custom resources support generation. Do we still want it for CRDs as well?
/cc @sttts |
It's hard to write race-free clients without knowing whether controllers have observed mutations. When controllers post status, they should report the most recent resourceVersion they have seen. Just returning it in responses (#1184) is not sufficient.
An example problem:
https://github.com/GoogleCloudPlatform/kubernetes/pull/7321/files#r29097446
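A hedged sketch of the client-side wait this would enable (all names assumed): after posting an update, the client polls until the controller reports having observed at least the sequence value that update produced.

```go
// Sketch: block until the controller's reported observed value catches
// up with the sequence number our write produced.
package main

import (
	"fmt"
	"time"
)

type Foo struct {
	Generation         int64
	ObservedGeneration int64 // reported back by the controller in status
}

// get stands in for an API read; here it simulates the controller
// catching up after a couple of polls.
func get(poll int) Foo {
	f := Foo{Generation: 7}
	if poll >= 2 {
		f.ObservedGeneration = 7
	}
	return f
}

func waitObserved(target int64) {
	for poll := 0; ; poll++ {
		if get(poll).ObservedGeneration >= target {
			return
		}
		time.Sleep(10 * time.Millisecond)
	}
}

func main() {
	waitObserved(7) // target: the generation our update produced
	fmt.Println("controller has observed our update")
}
```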