Edwarddowling/ec2 labels#21079
Conversation
label grabbing from describe instance not metadata api
|
I think we should hold off on implementation until the RFD is approved and merged |
r0mant
left a comment
There was a problem hiding this comment.
Haven't done full review yet, just one point.
r0mant
left a comment
There was a problem hiding this comment.
@espadolini @fspmarshall Mind taking a look at this again since you looked at it before?
| UpsertDiscoveredServer(context.Context, types.DiscoveredServer) (*types.KeepAlive, error) | ||
|
|
||
| // GetDiscoveredServer gets a DiscoveredServer. | ||
| GetDiscoveredServer(context.Context, string, string) (*types.DiscoveredServerV1, error) |
There was a problem hiding this comment.
I don't think GetDiscoveredServer should be a part of the Announcer interface.
There was a problem hiding this comment.
Would access point be a better spot for them? Or was there somewhere more appropriate you were thinking of?
| if err := a.action(apidefaults.Namespace, types.KindDiscoveredServer, types.VerbRead); err != nil { | ||
| return nil, trace.Wrap(err) | ||
| } | ||
| return a.authServer.GetDiscoveredServer(ctx, instanceID, accountID) |
There was a problem hiding this comment.
@espadolini @fspmarshall This goes through cache first, correct? Since authServer embeds Cache.
There was a problem hiding this comment.
Yes, but it's preferable to explicitly invoke the method against the cache for something like this (e.g. a.authServer.Cache.GetDiscoveredServer(...)), to better communicate the fact that hitting the cache isn't just preferred, it is required in order to prevent an outage.
| // Use labels from discovered resources instead of ones reported by discovered ec2 instances | ||
| s, err := a.filterEC2Labels(ctx, s) | ||
| if err != nil { | ||
| return nil, trace.Wrap(err) | ||
| } |
There was a problem hiding this comment.
I don't know if this logic belongs in the ACL layer tbh. Wouldn't GRPC server be a better place for it, we already update parts of the received Server resource there.
There was a problem hiding this comment.
Other way around, needs to be on the level of auth.Server to ensure it always gets called no matter where the UpsertNode call is coming from (see #21079 (comment)).
Also fix some unsafe casting and move some values to constants
espadolini
left a comment
There was a problem hiding this comment.
The node needs to be aware of its labels to make RBAC decisions, and having the node heartbeat resource (and thus the auth) disagree with what the node thinks its labels are is going to be confusing at best, and a security issue at worst.
| } | ||
| node.SetAddr(utils.ReplaceLocalhost(node.GetAddr(), p.Addr.String())) | ||
|
|
||
| // Use labels from discovered resources instead of ones reported by discovered ec2 instances | ||
| filteredNode, err := auth.ServerWithRoles.FilterEC2Labels(ctx, node) | ||
| if err != nil { | ||
| return nil, trace.Wrap(err) | ||
| } | ||
| node, ok = filteredNode.(*types.ServerV2) | ||
| if !ok { | ||
| return nil, trace.BadParameter("unexpected type %T", filteredNode) | ||
| } | ||
|
|
||
| keepAlive, err := auth.ServerWithRoles.UpsertNode(ctx, node) | ||
| if err != nil { | ||
| return nil, trace.Wrap(err) |
There was a problem hiding this comment.
This will have no effect in most cases. UpsertNode is only called by the node itself if it doesn't have a healthy control stream. When healthy, its heartbeats are injected by the auth server directly:
teleport/lib/inventory/controller.go
Line 462 in 70833b6
If you want to always intercept a node heartbeat, this should be happening at the level of auth.Server, since that will affect all calls to UpsertNode, not just calls that get routed through the GRPC server.
|
@espadolini Good point. Should we force apply RBAC for auto-discovered nodes on the proxy then, or have the node sync the "actual" labels somehow? What do you think would be a good approach? |
@r0mant Forcing-applying RBAC on the proxy isn't really an option for direct-dial nodes. Short of completely deprecating the concept of node-side RBAC, I'd be very hesitant to rely on proxy-side RBAC actually being enforced consistently... Honestly, having a node's RBAC state be partially defined by an external authority is a totally new concept in teleport... I'm betting any solution we come up with is going to need pretty rigorous verification to be certain it works in a consistent manner. Even if we have the node sync labels, nodes allow incoming connections even if they don't have a healthy connection to auth... at the very least, I'd say we'd want to disallow the use of ec2 labels in |
|
@fspmarshall @espadolini What if we did the following:
What do you think? If this works, we will update the RFD to clarify this approach. |
We just did a bunch of work to ensure that connections to nodes don't require auth (RFD). It would be a shame if we walked this back just to support this feature. |
IIUC as currently implemented, ec2 labels are "discovered" for any instance that matches the discovery selector, meaning that it might be discovering labels for instances that were added by another means. I think we'd need to be careful to make sure individual nodes know statically wether or not they are supposed to be taking on this behavior, and ensure that labels never get applied to nodes that aren't statically configured to take on this behavior. Likely, this means that we'd want the node's join token to force it into a special mode where it accepts externally applied labels, and have the label discovery applied only to nodes that exhibit this setting. Even then, since these externally applied labels are dynamically discovered rather than static, it feels brittle. Any disruption of label discovery is effectively an outage for all discovery nodes since we cannot evaluate RBAC sanely for them. IMO we'd be much better off accepting that discovered labels are populated lazily and treating them as ineligible for matching in deny rules... the failure case is much more graceful in that case. |
|
@fspmarshall @rosstimothy @espadolini Thanks guys. Let me put this PR to draft for now, we're probably better off moving this discussion back to the RFD. |
|
Side note: we've talked for a long time about the idea of statically associating labels with join tokens s.t. the labels are guaranteed to exist for the lifetime of the instance's credentials. That's much easier to enforce since it's not dynamically propagated... could we add a feature that let users statically assign labels based on the instance labels observed at time of initial discovery instead? That'd make all of this much easier I think. |
|
We're gonna be implementing a different approach described in #22033. |
Part of #18329
Teleport nodes running on EC2 instances can automatically import labels using metadata endpoint but some users disable the metadata endpoint but still would like to import tags.
This PR introduces a discovered server resource to store the labels retrieved from the describeInstances call we use when discovering the ec2 instances.