RFD 0153 Resource Guidelines#34103
Conversation
0bea8f1 to
a8aecf4
Compare
c6cf29e to
c156608
Compare
codingllama
left a comment
There was a problem hiding this comment.
This is excellent, I really like seeing this RFD come to life.
Lots of comments, as protos and API design are usually topics I care about, but overall it's pretty solid.
Thanks!
| ### API | ||
|
|
||
| All APIs should follow the following conventions that are largely based on | ||
| the [Google API style guide](https://cloud.google.com/apis/design/standard_methods). |
There was a problem hiding this comment.
An important distinction of Google API style is that it is heavily resource-based / REST-like. This means that instead of having "interesting" standalone RPCs, the API focuses in having interesting resources with relatively "boring" CRUD operations on them. That is not to say that resources can't have interesting RPCs attached to them, but more to highlight that the "average" resource exposes mainly CRUD-like operations. The conceptual definitions of the resources are the basis of the API.
Another nice characteristic of Google style is that whatever goes in always comes out. Because the RPCs are focused in the resources, if you have appropriate CRUD coverage then whatever is written can be read again. This seems silly, but I've seen many APIs out there where information gets "swallowed" in one way or another.
| returning them if a `with_sercrets` flag is provided cause a variety of problems and introduce opportunity for human | ||
| error to accidentally include secrets when they should not have been. It would then be the responsibility of the caller | ||
| to get both the base resource and the corresponding secret resource if required. | ||
|
|
There was a problem hiding this comment.
Should we talk about the various ways resources are represented? As in API, "tctl resource", operator, etc?
Is this resource definition to be applied to all cases?
Are there guidelines for JSON / YAML conversion, or could we just use protojson and be done with it?
There was a problem hiding this comment.
Bump? I think proto->json conversion and whether we use protos as-is for tctl and the like are still somewhat open.
For example, device trust defines "presentation" protos for tctl, but I'm not sure I would do that again today.
There was a problem hiding this comment.
I think we should avoid using presentation types just for the sake of using presentation types. There is already a lot of boilerplate required to be written when adding a new resource I'd hate to add another layer to the onion. That said there are a time and place for them. If a resource includes an enum but the desire is for tctl to display a friendly string representation instead of a number then I'd advise to use a presentation type. For all else I think the traditional pattern employed is fine to continue with.
There was a problem hiding this comment.
If a resource includes an enum but the desire is for tctl to display a friendly string representation instead of a number then I'd advise to use a presentation type
That was my main motivation, but today I would even question that. The reality is that we have plenty of types that expose numerical enums already, so the work felt like a drop in the ocean (but nevertheless took time and effort).
WDYT? Is this an area we should delve into? (Maybe not now but in a future update of the RFD?)
There was a problem hiding this comment.
I think we could probably punt on this for now. There are still several aspects of a resources lifespan that aren't 100% covered here. My fear is that if we try to include every possible bit of information about resources here we'll never get this merged.
| - [ ] Optional: Add support for resource to Teleport Operator | ||
| - [ ] Optional: Add support for resource to Teleport Terraform Provider |
There was a problem hiding this comment.
Could we add a guideline for when this needs to happen?
There was a problem hiding this comment.
I think in general we should always be adding new resources to them. However, time and barrier to entry have often left them to be an after thought. I'm happy to remove the optional here to enforce that it should always be done.
There was a problem hiding this comment.
Though after talking with Hugo it seems that our current code generation has a lot of assumptions built around resources using gogo proto. So anyone that follows the advice here is likely to have a rather hard time checking off either of these boxes until #31919 is addressed.
|
The PR changelog entry failed validation: Changelog entry not found in the PR body. Please add a "no-changelog" label to the PR, or changelog lines starting with |
|
The PR changelog entry failed validation: Changelog entry not found in the PR body. Please add a "no-changelog" label to the PR, or changelog lines starting with |
strideynet
left a comment
There was a problem hiding this comment.
Looks good to me, I think that it is:
- Covering enough of what I'd expect to fall in scope of Resource guidelines, and not wandering too far beyond them.
- Giving practical advice without being overly specific/driving too large of a divergence from the current way of working
- Largely in agreement with how I personally feel how things should be done
- A significant improvement over the current informal practices
ravicious
left a comment
There was a problem hiding this comment.
I managed to get to the Update RPC in the API section, will continue reading this tomorrow.
I've never had to add a backend resource and having these guidelines is great, even for non-backend work such as adding RPCs in Connect that mirror backend APIs.
Recently I was considering adding a new endpoint to the backend (#34531, but we'll probably end up not adding that endpoint) and I had to ask Brian to give me an overview on how to do this because there's pretty much no docs about it. With this RFD in place, I believe it'd be much easier to establish which parts of the codebase need to be updated.
ravicious
left a comment
There was a problem hiding this comment.
As I said already, from the perspective of someone who've never added a backend resources, this RFD is clear and helpful.
I left comments under some minor typos or grammatical errors.
0576ce7 to
73019c0
Compare
codingllama
left a comment
There was a problem hiding this comment.
LGTM, #34103 (comment) is the only open question I have.
| } | ||
| ``` | ||
|
|
||
| #### List |
There was a problem hiding this comment.
I think we should add a note regarding the expected ordering.
Resource's name field ascending, perhaps.
Even if we add a disclaimer saying that clients must not depend on the ordering.
There was a problem hiding this comment.
If there's pagination then there must be some order. I think the order guarantees should be specific by each RPC, but it's a good idea to encourage APIs to make explicit decisions (as in, say whether there is a guaranteed order or not).
There was a problem hiding this comment.
I think the machinery within the List mechanism is largely up to each RPC like @codingllama mentioned. However, if the order is unspecified then resources are would be order by backend key.
73019c0 to
a2c9280
Compare
codingllama
left a comment
There was a problem hiding this comment.
LGTM.
There are still a couple of open Qs and nices-to-have, but happy to see this land.
| // The kind of resource represented. | ||
| string kind = 1; | ||
| // Differentiates variations of the same kind. All resources should | ||
| // contain one, even if it is never populated. | ||
| string sub_kind = 2; | ||
| // The version of the resource being represented. | ||
| string version = 3; | ||
| // Common metadata that all resources share. | ||
| teleport.header.v1.Metadata metadata = 4; |
There was a problem hiding this comment.
This all seems to fall under metadata, would it be a lot of trouble to create a teleport.header.v2.Metadata with all of the common fields, and then use some reflection where necessary to figure out whether we're looking at new vs legacy resources?
There was a problem hiding this comment.
If we did this we would have to create a custom marshaler so that we didn't alter the shape of resources when they are converted to yaml.
We would want our resources to continue to look like this:
kind: windows_desktop
metadata:
name: fake
expires: "2026-11-08T16:20:24.431481Z"
labels:
cluster: remote.example.com
spec:
addr: 127.0.0.1:5022
domain: test
host_id: fake_desktop
non_ad: true
version: v3and not be converted to this:
metadata:
kind: windows_desktop
version: v3
name: fake
expires: "2026-11-08T16:20:24.431481Z"
labels:
cluster: remote.example.com
spec:
addr: 127.0.0.1:5022
domain: test
host_id: fake_desktop
non_ad: true
| } | ||
| ``` | ||
|
|
||
| #### Update |
There was a problem hiding this comment.
There's some discussion of update/upsert over in #35441 (comment), I will suggest we move that here.
There was a problem hiding this comment.
You almost certainly always want to use UpdateFoo backed by optimistic locking over UpsertFoo. I think the most common instances where UpsertFoo makes sense are tctl create -f and within the terraform provider and operator, when they are reconciling state we want them to override any drift created by a human manually altering a resource owned by the IaC tools.
There was a problem hiding this comment.
What about my point about the uselessness of patch-style operations under this paradigm (which pertains to your FieldMask comment below)?
As I understand it, in order to do an UpdateFoo with optimistic locking, you will need to have first done a GetFoo to get the object's revision for the call to Update. So then wouldn't that make the FieldMask useless, since you'll have the entire object on-hand anyhow? You may as well just call UpdateFoo with the full object, including the newly modified fields, right?
There was a problem hiding this comment.
I suppose that's possible, but it may also be possible that you might only have the subset of fields that are going to be updated via the field mask and the revision. For example, the web ui may have a page that shows resource Foo. The Proxy api server may do a ListFoos to retrieve the Foo resources from the backend, but before sending them to the UI perform a transformation into a UIFoo that only has a subset of the resource. If a user tried to update Foo A the UI will only send back the fields it knows about. Should the Proxy api server then perform another Get for Foo A and try to reconcile which fields were changed, or should the Proxy just call Update using a field mask and let the backend resolve things?
There was a problem hiding this comment.
Got it, yeah that use case makes sense.
That brings up a question for me, though: do we need to implement a patch-style fieldmask operation in each supported Backend individually. Or is there some existing abstraction that takes care of that for us? I don't know that I've ever actually added a new resource to Teleport, so I'm not quite sure what that process entails generally, nor for a resource that supports FieldMask specifically.
There was a problem hiding this comment.
Rafal asked a similar question. There is nothing automated to date and it's recommended to roll your own patching. While there may be tools or reflecting we could employ to automate it I don't know that they are worth it.
There was a problem hiding this comment.
Thanks, that's similar although I think I'm also asking a further question -- as I understand it we support various backends like sqlite, etcd, dynamodb, etc. In adding a new resource, would we also need to understand how patching works in each of those and extend them?
Regardless of the answer, since this is likely to be used as a reference doc it might be worthwhile including a link to somewhere in the codebase where field masks were done well.
There was a problem hiding this comment.
I think that patching at the backend level isn't possible due to our backend schemas. Patching likely has to be performed up one level by doing a Get, then applying any patched fields, followed by an Update. To my knowledge the only place we actually have this implemented today is in the device trust service: https://github.com/gravitational/teleport.e/blob/master/lib/devicetrust/devicetrustv1/service.go#L268-L280.
There was a problem hiding this comment.
The new BotService (#35441) also makes use of field_masks for patching.
77fd47f to
f053131
Compare
…se-anon-key * origin/master: (344 commits) Undelete CreateHostUserMode_HOST_USER_MODE_DROP (gravitational#36273) allow cwd to be changed in difftest (gravitational#35946) Auth device list component (gravitational#36235) make unified resources responsive (gravitational#35961) Support running Teleport in a "hot reload" mode (gravitational#35040) Prevent deleting enum values, allow deleting enum reservations in types.proto (gravitational#36248) Remove support for legacy (Amazon Linux 2) AMIs (gravitational#36153) Bump version(s) used for teleport-lab and teleport-quickstart (gravitational#36167) Allow Reconciler update handler to examine old value during update (gravitational#36171) Validate the user still exists during account reset (gravitational#35676) ButtonTextWithAddIcon shared component (gravitational#36103) Refactor hostname resolution for SSH connections via the WebUI (gravitational#35773) add structuredClone to jest JSDOMEnvironment (gravitational#36213) fix flaky `lib/auth` cache-enabled tests (gravitational#36216) Report resource usage counts by handling heartbeat events (gravitational#35968) Reviewer bot should use the stable version of Go (gravitational#36242) RFD 0153 Resource Guidelines (gravitational#34103) Use cmp and cmpots properly in operator tests (gravitational#36215) Relax Kubernetes CRD discovery when building cache (gravitational#36214) Add Access List messages to TAG protobuf (gravitational#36176) ...
Rendered