Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create per-object sequence number and report last value seen in status of each object #7328

Open
bgrant0607 opened this issue Apr 25, 2015 · 18 comments
Assignees
Labels
area/api Indicates an issue on api area. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps.

Comments

@bgrant0607
Copy link
Member

It's hard to write race-free clients without knowing whether controllers have observed mutations. Controllers should report the most recent resourceVersion seen in status when they post status. Just returning it in responses (#1184) is not sufficient.

An example problem:
https://github.com/GoogleCloudPlatform/kubernetes/pull/7321/files#r29097446

@bgrant0607 bgrant0607 added area/api Indicates an issue on api area. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. team/master labels Apr 25, 2015
@derekwaynecarr
Copy link
Member

Is reporting sufficient or do you need to infer order between versions? I think to do what you want you need to be able to sort resource version to know if you have seen a more recent value which we have been hesitant to do since they are supposed to be opaque.

Sent from my iPhone

On Apr 25, 2015, at 1:04 AM, Brian Grant [email protected] wrote:

It's hard to write race-free clients without knowing whether controllers have observed mutations. Controllers should report the most recent resourceVersion seen in status when they post status. Just returning it in responses (#1184) is not sufficient.

An example problem:
https://github.com/GoogleCloudPlatform/kubernetes/pull/7321/files#r29097446


Reply to this email directly or view it on GitHub.

@bgrant0607
Copy link
Member Author

Good point. It may be time to introduce a per-object sequence number.

@pmorie
Copy link
Member

pmorie commented Apr 25, 2015

@bgrant0607 @derekwaynecarr if we add a per-object sequence, will we use
that or resourceVersion to get a prior value for an object?
On Sat, Apr 25, 2015 at 12:47 PM Brian Grant [email protected]
wrote:

Good point. It may be time to introduce a per-object sequence number.


Reply to this email directly or view it on GitHub
#7328 (comment)
.

@bgrant0607 bgrant0607 changed the title Report last resourceVersion seen in status of each object Report last resource version seen in status of each object Apr 25, 2015
@bgrant0607
Copy link
Member Author

@pmorie We need to tease apart the different uses:

  • concurrency control (read-modify-write transactions)
  • RAW and WAW consistency
  • watch

@bgrant0607 bgrant0607 added the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label May 14, 2015
@bgrant0607 bgrant0607 changed the title Report last resource version seen in status of each object Create per-object sequence number and report last value seen in status of each object Jun 3, 2015
@bprashanth
Copy link
Contributor

Is reporting sufficient or do you need to infer order between versions?

For this to be useful I think we need order.

I think to do what you want you need to be able to sort resource version to know if you have seen a more recent value which we have been hesitant to do since they are supposed to be opaque.

Resource version is generated by etcd so we can't rely on different members/shards of the cluster giving you an ever increasing resource version (and non-etcd datastores won't have it). My understanding of sequence numbers is as long as the database is consistent, it's part of the data, so it should always be whatever we wrote across all members.

@bgrant0607
Copy link
Member Author

Also worth noting the proposed v3 etcd API, which distinguishes "index" from per-object "version": https://github.com/coreos/etcd/blob/master/Documentation/rfc/v3api.proto#L172

@bgrant0607
Copy link
Member Author

Responding to some questions from #9739 in this more general issue.

With respect to the name of the sequence field, etcd v3 uses "version", and internally we typically use "version" or "generation" (in various APIs -- Borg and Chubby use the latter). I'm somewhat partial to "generation", since it's what I'm used to and since "version" is already fairly overloaded.

Re. metadata vs. spec:

The sequence number implemented here and discussed in #7328 should not be incremented when updating the status. Furthermore, as discussed in #2726 and #8625, sometime soon we're going to need to move status to a separate key in etcd. metadata should remain with spec, since at least namespace/name, labels, and deletion-related fields affect the desired state and annotations typically reflect additional information about the desired state, and the sequence number should be incremented upon changes to those fields, as well.

In general, such a sequence number needs to exist for each sub-part of the object that we'd like to update independently. For instance, Chubby has separate generation numbers for the payload, ACLs, and lock: http://static.googleusercontent.com/media/research.google.com/en/us/archive/chubby-osdi06.pdf. I could imagine that controllers and the client cache might want a sequence number on status, in addition to the one covering spec and metadata, but that's much less critical and could be added later by prefixing the new field's name with a qualifier (e.g., "statusGeneration"), so I'm inclined to ignore that for now.

Putting the field in metadata makes the most sense to me because:

  1. it is a generic pattern
  2. it's metadata about the desired state rather than part of the desired state
  3. the user won't ever specify it (which, btw, means that it must be optional, so it should be tagged with omitempty)

The name of the corresponding field in status should clarify that it's the most recent generation that has been observed by the responsible controller, such as observedGeneration or enactingGeneration.

Re. multiple entities updating status: There should be a single component that is primarily responsible for reifying the desired state. Only that component should update the observedGeneration. In cases, like the node controller, where another component fills in part of the status when the primary component is unresponsive, that backup component should leave the observedGeneration unchanged. If there were really independently updated sub-parts of status reflecting the degree to which the desired state had been acknowledged, those sub-parts should each have their own observedGeneration fields. This applies to components with internal concurrent processes, as well. An update to observedGeneration should imply that the responsible component should no longer be working towards previously specified desired states.

@bgrant0607 bgrant0607 added this to the v1.2-candidate milestone Sep 12, 2015
@bgrant0607 bgrant0607 added team/api priority/backlog Higher priority than priority/awaiting-more-evidence. and removed sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. labels Jan 29, 2016
@bgrant0607 bgrant0607 removed this from the next-candidate milestone Jul 31, 2016
@bgrant0607 bgrant0607 added the sig/apps Categorizes an issue or PR as relevant to SIG Apps. label Jul 31, 2016
@ash2k
Copy link
Member

ash2k commented Mar 20, 2017

What about TPR? I have a controller which needs this mechanism so that external clients can tell if it has seen the spec update and the current status reflects the updated spec.

@deads2k
Copy link
Contributor

deads2k commented Jun 2, 2017

What about TPR? I have a controller which needs this mechanism so that external clients can tell if it has seen the spec update and the current status reflects the updated spec.

This is out for CRD in 1.7. I recommend a specific issue to add it to make it easier to schedule.

@Kargakis specific types should have specific issues. It's special for each one at the moment, so I think its a type owner activity, not general machinery.

Removing milestone.

@deads2k deads2k modified the milestones: next-candidate, v1.7 Jun 2, 2017
@0xmichalis
Copy link
Contributor

@Kargakis specific types should have specific issues. It's special for each one at the moment, so I think its a type owner activity, not general machinery.

The reality is that it's the same for all controller-type objects (or at least all handled cases are doing it in the same way) but we are left with handling it on a case-by-case basis. One issue we should solve with ObservedGeneration that warrants case-by-case handling is #25170 (but still I suspect the solution to it will need to be applied holistically because every core controller will likely end up reusing it).

@deads2k
Copy link
Contributor

deads2k commented Jun 2, 2017

One issue we should solve with ObservedGeneration that warrants case-by-case handling is #25170 (but still I suspect the solution to it will need to be applied holistically because every core controller will likely end up reusing it).

It seems like each one ends up as a snowflake, "bump my generation when spec changes unless its one of these field". It ends up looking like a strategy and that ends up back where we are today.

@0xmichalis
Copy link
Contributor

For special spec fields, we need special Observed* status fields. ObservedGeneration is meant to cover the whole Spec AFAIK.

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 26, 2017
@bgrant0607
Copy link
Member Author

/remove-lifecycle stale
/lifecycle frozen

I would still like generation/observedGeneration to be implemented for all relevant resources

@nikhita
Copy link
Member

nikhita commented May 27, 2018

I would still like generation/observedGeneration to be implemented for all relevant resources

Custom resources support observedGeneration. And added validation for it in #64379.

Do we still want it for CRDs as well?

@nikhita
Copy link
Member

nikhita commented May 27, 2018

/cc @sttts

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/api Indicates an issue on api area. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps.
Projects
Status: Needs Triage
Development

No branches or pull requests