Conversation
88dfaa0 to
26d49ea
Compare
| In this callback is where it will attempt to dial the target server with the | ||
| requested username, and complete the per-session MFA ceremony if necessary. | ||
| By doing this in the callback we can return an authentication error to the user | ||
| if they fail to authenticate to the target server. |
There was a problem hiding this comment.
This is more of an implementation detail that doesn't necessarily need to be mentioned in the RFD, but this actually supports dropping into keyboard-interactive if the remote server is configured for publickey,keyboard-interactive in a much better way than our current forwarding (some customers have asked about better support for PAM authentication modules, and x/crypto/ssh doesn't quite have APIs flexible enough for that if the connecting client is trying to use publickey first).
| Next to the usual "Connect" button on the SSH server card, she sees a new VNet | ||
| button. |
There was a problem hiding this comment.
Yeah I was actually thinking of something like that, i think it's a good idea. would be nice to have some type of button on the server card for discoverability
There was a problem hiding this comment.
Perhaps we should run this by the design team? cc @roraback
| 1. VNet will apply any matching proxy templates found in the user's | ||
| `TELEPORT_HOME`. |
There was a problem hiding this comment.
This might be a bit of a pain for some users, as I'm sure that people usually configure tsh through ~/.tsh/config/config.yaml which Teleport Connect does not use. But it is something we might tackle in Q2, so it's not that big of a problem.
There was a problem hiding this comment.
Yeah I don't think proxy templates are going to be that common to use for this, but I know there are some specific customers that use them heavily, and I wanted to make it at least possible to use them from connect, but idk using TELEPORT_HOME seemed better than the default tsh home
There was a problem hiding this comment.
If our end goal is to unify the Connect and tsh config location this is probably fine for the short term. If someone really wants to make proxy templates work they can do so by at least for the short term copying the file.
| 1. Because Alice has not started VNet before, VNet starts for the first time. | ||
| 1. The VNet panel in the top left opens and displays: | ||
| > 🟢 Proxying TCP connections to teleport.example.com<br> | ||
| > 🔴 VNet SSH is not configured<br> | ||
| > Open Status Page | ||
| 1. A toast pops up in the bottom right and displays: | ||
| "VNet SSH is not configured, click here (link) to enable VNet SSH" | ||
|
|
||
| Clicking either the "Open Status Page" button in the VNet panel or the link | ||
| in the toast both open a new VNet Status tab in Connect. | ||
| This tab will include the existing VNet diagnostic report, and a new section at | ||
| the top on the status of VNet SSH. | ||
|
|
||
| The SSH section in the VNet Status page will display: | ||
|
|
||
| > VNet SSH lets any SSH client connect to Teleport SSH servers.<br> | ||
| > When you enable VNet SSH, Connect will automatically add configuration options | ||
| > to `~/.ssh/config`. | ||
| > | ||
| > Enable VNet SSH (button) |
There was a problem hiding this comment.
@ravicious I'm just winging this let me know if you have thoughts 🙏
There was a problem hiding this comment.
Why not have it enabled by default and let the user disable individual components if not needed? Do we even need to let them disable individual components at all?
If disabling individual components is something we want to have available, I think it might be nicer to expand the VNet panel with two sections rather than adding extra stuff to the diag report tab. I think it'd feel nicer to be able to control everything from that panel.
There was a problem hiding this comment.
The idea was to avoid editing the user's ~/.ssh/config until they explicitly opt-in, I don't want to just do that by default. Do you think we have room to warn them about that before turning on VNet SSH in the VNet panel?
There was a problem hiding this comment.
We should be able to add a confirmation dialog for that.
There was a problem hiding this comment.
+1 for not touching ~/.ssh/config without explicit permission.
A nice effect of using Include is that it's also pretty easy to detect if the line is still there, and to offer to add it again. It should also not be a requirement to do so to enable the server IMO, ssh -F ~/path/to/vnet_ssh_config will work just as well.
There was a problem hiding this comment.
The idea is everything in VNet will actually be enabled by default, ssh -F ~/path/to/vnet_ssh_config should work as Edoarda mentioned, Connect will just give you the option to automatically add the Include to ~/.ssh/config
There was a problem hiding this comment.
One could think of presence of the Include line in ~/.ssh/config as a sort of diagnostic, with an option to remedy that problem with a button on the diagnostic page. For the confirmation dialog you mentioned @ravicious I'm not sure when we'd open that
There was a problem hiding this comment.
Ah sorry, my messages were not clear at all, I just noticed that. 🤦
If we have to make the SSH feature opt-in, then instead of putting it in the diag report page, we could have it just in the VNet panel. I imagine the toggle could look somewhat like this:
I didn't want to call it "VNet SSH" because that would be the third time this panel says "VNet". If VNet TCP should also be configurable, then we could replace that status indicator with a toggle maybe.
Now, I imagine that when Alice clicks on "Connect with VNet" next to an SSH node, we could show a notification as you mentioned. The notification would say "VNet SSH needs to be explicitly enabled" or something along those lines. Instead of a link, it'd have a button that says "Enable VNet SSH". When Alice clicks on that button or when she clicks on the VNet SSH toggle, we could then show the confirmation dialog. The dialog would explain that enabling VNet SSH is going to modify ~/.ssh/config. After clicking "Proceed", VNet SSH would be enabled.
One minor problem is the very first start of VNet. For TCP apps, when you click on "Connect" for the first time next to a TCP app, we don't turn on VNet immediately. Instead, we first show the info tab about VNet. From there, you can learn what VNet is and then turn it on. After the first launch, clicking on "Connect" starts VNet immediately without showing the info tab.
For SSH, I think it'd be a bit too much to go from "Connect with VNet" -> the info tab -> turn on VNet -> show notification about VNet SSH -> enable VNet SSH -> finally connect through SSH. But IMHO for SSH we can skip the info tab completely. It makes sense to show it for TCP apps I guess where the flow is a little bit more complex and it might not be obvious what to do next. But for SSH nodes, once you click "Connect with VNet" and see a notification that says "Connect with any SSH client to <address> (copied to clipboard)", I think it's pretty clear what the user should do next. Besides, the user might click "Connect" next to a TCP app without knowing what VNet is. For SSH, it's unlikely that someone is going to open the three dot menu and then click "Connect with VNet" without having an idea what VNet is.
| 1. Because Alice has not started VNet before, VNet starts for the first time. | ||
| 1. The VNet panel in the top left opens and displays: | ||
| > 🟢 Proxying TCP connections to teleport.example.com<br> | ||
| > 🔴 VNet SSH is not configured<br> | ||
| > Open Status Page | ||
| 1. A toast pops up in the bottom right and displays: | ||
| "VNet SSH is not configured, click here (link) to enable VNet SSH" | ||
|
|
||
| Clicking either the "Open Status Page" button in the VNet panel or the link | ||
| in the toast both open a new VNet Status tab in Connect. | ||
| This tab will include the existing VNet diagnostic report, and a new section at | ||
| the top on the status of VNet SSH. | ||
|
|
||
| The SSH section in the VNet Status page will display: | ||
|
|
||
| > VNet SSH lets any SSH client connect to Teleport SSH servers.<br> | ||
| > When you enable VNet SSH, Connect will automatically add configuration options | ||
| > to `~/.ssh/config`. | ||
| > | ||
| > Enable VNet SSH (button) |
There was a problem hiding this comment.
Why not have it enabled by default and let the user disable individual components if not needed? Do we even need to let them disable individual components at all?
If disabling individual components is something we want to have available, I think it might be nicer to expand the VNet panel with two sections rather than adding extra stuff to the diag report tab. I think it'd feel nicer to be able to control everything from that panel.
|
|
||
| Clicking either the "Open Status Page" button in the VNet panel or the link | ||
| in the toast both open a new VNet Status tab in Connect. | ||
| This tab will include the existing VNet diagnostic report, and a new section at |
There was a problem hiding this comment.
Are there any diagnostics we can add for VNet SSH that could be included in the report?
There was a problem hiding this comment.
I think "whether or not the Include directive is in ~/.ssh/config" is kind of a diagnostic, not sure what else we could really do though
| If per-session MFA is required, Connect will prompt the user to tap their | ||
| hardware key. | ||
|
|
||
| #### User story - VSCode remote development |
There was a problem hiding this comment.
Now that there's a marketing tab for VNet in Connect, it'd be good to have something about SSH in there too. This use case in particular could be nice – it's shows what is possible rather than just describing how it works (which might be better suited for docs). We'd just need to have this use case described somewhere so that we can direct the user there if they're interested in it. A section in the docs would be enough. A blog post would be good too, but that requires much more effort.
| 1. VNet will apply any matching proxy templates found in the user's | ||
| `TELEPORT_HOME`. | ||
| 1. VNet will query the matching cluster to see if any SSH servers have a | ||
| matching SSH server using the `ResolveSSHTarget` rpc. | ||
| 1. If a match is found a free IP will be assigned to that SSH server and it | ||
| will be returned in an authoritative DNS answer. | ||
| 1. If no matching server is found a `NXDOMAIN` response will be returned with | ||
| no cache TTL. |
There was a problem hiding this comment.
How quickly can this react to configuration changes, both in the proxy templates and in the cluster? It'd be nice if we were in control of how long a hostname to hostID mapping will last, but I don't think it's possible if we want to respond NXDOMAIN to unknown hostnames (and the alternative is to reject connections if there's no longer a pairing, which is confusing to the user although it's probably the same thing that happens if a host goes down?)
There was a problem hiding this comment.
In the latest revision, we never query for host IDs (unless a proxy template requires it) we always dial by hostname so we can react ~instantly
| She gets an error message from `ssh` on the CLI reading | ||
| `ERROR: access denied to root connecting to devbox`. |
There was a problem hiding this comment.
Can we return such a detailed error message to openssh during ssh auth with x/crypto/ssh?
(please don't say "we act like the authentication succeeded and the session was opened and then we write on stderr in the session")
There was a problem hiding this comment.
After some silly checks, for openssh ssh we can either display a banner, which is likely not going to be surfaced well in GUI clients but should be always displayed by ssh in a terminal, or we can fork x/crypto to allow using a custom disconnect message which is probably going to be more visible in all clients but is formatted kind of weird:
Received disconnect from [<ipv6 addr>] port 22:<integer>: <error message>
Disconnected from <ipv6 addr> port 22
There was a problem hiding this comment.
I know @zmb3 would very much like it if we could return more actionable error messages when the SSH handshake fails. Any errors stemming from device trust issues exposed to users are very opaque and unhelpful message.
There was a problem hiding this comment.
yeah unfortunately this was just an example error message i stole from tsh ssh and forgot to double check, the real error message displayed by openssh clients is more likely to be root@devbox: Permission denied (publickey). unless we hack an error message into the banner or fork x/crypto/ssh (the ideas you mentioned in the other thread)
There was a problem hiding this comment.
these are really just 2 different ways of displaying the same error though, the tsh ssh error doesn't have any additional information, we just format it "nicer" on any ssh auth error
Lines 339 to 350 in e0030e5
we can't really do much better than this without some api changes i guess
| 1. The existing TCP app lookup will run first, if the address matches a TCP app | ||
| then the IP will be permanently assigned to that app and regular TCP app | ||
| forwarding will take over. This matches the current VNet behavior where TCP | ||
| app matches are permanent. |
There was a problem hiding this comment.
Don't we also know the destination port of the TCP connection? Can't we take that into account for our decision?
There was a problem hiding this comment.
We could take it into account for sure, although it would be a breaking change for anyone currently connecting to a TCP app on port 22
| 1. VNet will set `ssh.ServerConfig.PublicKeyCallback` to a function that: | ||
| 1. Validates that the user public key matches `id_vnet.pub` mentioned in [SSH client configuration](#ssh-client-configuration) | ||
| 1. Attempts an SSH connection to the target with and without session MFA. If | ||
| both fail, an authentication error is returned. This is done in | ||
| `PublicKeyCallback` so that authentication failures to the target are returned | ||
| as authentication failures to the client. Optimistically we can send | ||
| more informative errors to Connect. |
There was a problem hiding this comment.
This doesn't actually work that well; the PublicKeyCallback can be called an arbitrary amount of times and can end up being called with a key that the client is not actually in control of, and you're only going to know that after finishing the authentication. 😢 golang/go#70795 would fix it but it's sort of stuck while the x/crypto/ssh maintainers figure out the v2 api.
Thankfully it's not a very big deal to optimistically open the connection (once, obviously) to the remote server and just close it if the client doesn't prove itself to be in control of the id_vnet private key.
There was a problem hiding this comment.
That being said, we can also fork x/crypto/ssh since this is something that runs in our binary rather than being in the api module (maybe to also add a way to return a custom authn error message?)
There was a problem hiding this comment.
Are there any alternatives to leveraging the PublicKeyCallback like this? Is this purely to provide more useful error messaging to the end user?
There was a problem hiding this comment.
No, this is about the fact that PublicKeyCallback gets called even when the client presents just a public key without any proof of possession.
It's not a particularly big deal to open the connection immediately tho, this is something that could kinda be exploited by an attacker on the local machine to open connections to SSH servers just to have them closed immediately - and at no big cost to anything in the cluster even, since the connection will just get stuck at version negotiation until the local client proves it's in possession of the vnet private key, so the only cost is a bit of network and some logging, not even any cpu usage due to encryption.
There was a problem hiding this comment.
And it's not like the public key is going to be broadcast to the world, it's not treated as sensitive data but it's also something that you'd need a bit of shenanigans to obtain in the first place.
There was a problem hiding this comment.
@rosstimothy yeah we don't have to do this in PublicKeyCallback it was just an idea so that we could return authentication errors to openssh clients when the user actually fails to auth to the teleport node
we could dial the target node in BannerCallback instead to display the error as a banner, it feels sketchy but i guess it's no less sketchy than doing it in PublicKeyCallback where the private key has not been verified yet
although at this point i'm thinking we just shouldn't shove teleport errors into the SSH protocol, and shouldn't dial the node before the client authenticates with the private key, instead we can send whatever error message we want to Connect in a "notification" the same way we currently notify users when they connect to an invalid port of a multi-port TCP app
There was a problem hiding this comment.
We don't need BannerCallback, the Cool™ way is to get a ServerPreAuthConn through PreAuthConnCallback and then we can send an error message in a banner at any time before the authentication completes.
I think that the best UX would be to display the error in both a banner (so it shows up in the terminal that the user is interacting with) and in a connect notification (to not swallow errors if the connection is happening somewhere other than in a terminal).
There was a problem hiding this comment.
I agree in principal that showing better and more informative SSH errors is something we should do. However, it's not a problem limited to VNet SSH. If we can tackle error handling while working on VNet SSH that's great, but I don't think the internals of how that happens need to be ironed out in this RFD.
|
Hey team! I just finished my review of the RFD. Taking into consideration the fact that the new functionality leverages the existing SSH tunneling implementation, the changes shouldn't introduce any new threat surface. The RDF looks good to me! |

No description provided.