Skip to content

Teleport Connect headless approval#29097

Merged
Joerger merged 2 commits intomasterfrom
joerger/headless-approval-teleport-connect
Aug 1, 2023
Merged

Teleport Connect headless approval#29097
Joerger merged 2 commits intomasterfrom
joerger/headless-approval-teleport-connect

Conversation

@Joerger
Copy link
Copy Markdown
Contributor

@Joerger Joerger commented Jul 14, 2023

This PR adds headless approval/denial logic and UI to Teleport Connect. This logic is linked into the headless watcher system so that the UI automatically pops up when a headless request is detected.

Updates #27137

@Joerger Joerger requested review from greedy52 and ravicious July 14, 2023 01:33
@Joerger Joerger marked this pull request as draft July 14, 2023 01:33
@public-teleport-github-review-bot
Copy link
Copy Markdown

@Joerger - this PR will require admin approval to merge due to its size. Consider breaking it up into a series smaller changes.

@github-actions github-actions Bot added audit-log Issues related to Teleports Audit Log bpf Used to bugs with bpf and enhanced session recording. database-access Database access related issues and PRs discovery documentation helm kubernetes-access machine-id rfd Request for Discussion size/xl tsh tsh - Teleport's command line tool for logging into nodes running Teleport. ui labels Jul 14, 2023
@Joerger Joerger force-pushed the joerger/teleport-connect-headless-watcher branch from 5b896d5 to 18b1d52 Compare July 14, 2023 01:34
@Joerger Joerger force-pushed the joerger/headless-approval-teleport-connect branch from 30fd5e9 to 2039354 Compare July 14, 2023 01:34
@Joerger Joerger removed request for alexfornuto and r0mant July 14, 2023 01:34
@Joerger Joerger force-pushed the joerger/headless-approval-teleport-connect branch from 3848423 to 0031fe4 Compare July 25, 2023 00:24
Comment thread web/packages/teleterm/src/services/headlessAuthn.ts Outdated
Comment thread web/packages/teleterm/src/services/tshd/types.ts Outdated
Comment thread web/packages/teleterm/src/ui/HeadlessAuthn/HeadlessPrompt/HeadlessPrompt.tsx Outdated
Comment thread web/packages/teleterm/src/ui/HeadlessAuthn/HeadlessPrompt/HeadlessPrompt.tsx Outdated
Comment thread web/packages/teleterm/src/ui/ModalsHost/ModalsHost.tsx Outdated
Comment thread web/packages/teleterm/src/ui/HeadlessAuthn/HeadlessPrompt/HeadlessPrompt.tsx Outdated
Comment thread web/packages/teleterm/src/ui/HeadlessAuthn/HeadlessPrompt/HeadlessPrompt.tsx Outdated
@gzdunek
Copy link
Copy Markdown
Contributor

gzdunek commented Jul 25, 2023

I'm thinking of the UX of "cancelling".
Shouldn't we reject the headless request when the "X" is clicked and the dialog is in non-MFA mode? Currently nothing happens, and the headless request waits until a timeout. In other words, "X" would work the same as the "Reject" button.

Also, when we abort the UpdateHeadlessAuthenticationState request, perhaps the tshd code could tell the cluster that the request should be rejected? Because, similarly to the first case, the request waits until the timeout.

LMK what do you think of it.

@Joerger
Copy link
Copy Markdown
Contributor Author

Joerger commented Jul 26, 2023

Currently nothing happens, and the headless request waits until a timeout.

This was my intention, but i understand it could be weird UX. Unlike with tsh / WebUI, the user cannot cancel and then retry, so rejecting outright could make sense.

But there's also nothing wrong with leaving it to timeout in case the user clicked reject by mistake and still wants to approve it with tsh headless approve or the WebUI.

If you think we should change it, maybe we should just remove the "X" button and maintain the "Reject" button throughout the flow instead.n

@gzdunek WDYT?

@Joerger Joerger force-pushed the joerger/headless-approval-teleport-connect branch from b165933 to 93d8c25 Compare July 26, 2023 00:49
@Joerger Joerger requested review from gzdunek and ravicious July 26, 2023 00:52
@ravicious
Copy link
Copy Markdown
Member

Pushed a small commit which makes the new modal look more similar to other modals that we already have in the app.

@gzdunek
Copy link
Copy Markdown
Contributor

gzdunek commented Jul 26, 2023

But there's also nothing wrong with leaving it to timeout in case the user clicked reject by mistake and still wants to approve it with tsh headless approve or the WebUI.

Now that you've written about approving pending requests in tsh/webui, I'm less convinced to change the current UX.

Since there are other ways to approve the request, I think we can leave it as is.

Comment thread lib/teleterm/daemon/daemon.go Outdated
Comment thread lib/teleterm/daemon/daemon.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I execute a headless login and then approve it outside of Connect (or in another instance of Connect), Connect will continue to show the modal. I guess that's because the loop in tshd will be stuck on s.tshdEventsClient.SendPendingHeadlessAuthentication.

How hard would it be to abort that request when we detect that the auth prompt has been approved? I think it wouldn't need to be addressed in this PR, but it feels like something we should implement sooner rather than later.

In this PR, we should at least add a timeout to the request, I feel like a minute would be enough?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How hard would it be to abort that request when we detect that the auth prompt has been approved?

The more I think about it, the more doable it seems. We'd basically need to set up a new watcher using rpc WatchEvents to watch specifically for approve/deny events matching the headless ID. I'll give it an attempt.

I also think having a timeout (3m callback timeout to match the max lifetime of a headless request) is a must have, thanks for suggesting it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fb8b2efa8a4cb623b896f26feae3cc52dc9cecda

Comment thread web/packages/teleterm/src/ui/HeadlessAuthn/HeadlessPrompt/HeadlessPrompt.tsx Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little UX thing, when you send two requests, one after the other, and then reject one of them in the UI, it seems that you still see the same, previous dialog.
This is because another request immediately comes in and displays a new dialog box. However, this happens so fast that it looks like the old dialog has not been closed.

My proposal is to wait some time before resolving the current request, so the user can see that one dialog has been closed and a new one has been opened.

I'd do it this way:

      emitter
        .emit('sendPendingHeadlessAuthentication', { request, onCancelled })
        .then(() => wait(400))
        .then(
          () => {
            callback(null, new api.SendPendingHeadlessAuthenticationResponse());
          },
          error => {
            callback(error);
          }
        );

I think we have never done something like this, WDYT @ravicious?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I thought my comments would cover all the async edge cases. 🫠

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waiting is not a bad idea, but this would artificially slow down the action that requested the headless auth as well.

If anything, tshd should check how much time has passed since the last request was resolved and if it's less than x, wait a little bit before sending another request to the Electron app.

Ultimately though, I don't have a good grasp on how this feature is going to be used to warrant spending time addressing such use cases.

But the issue you've raised plays into two other things:

  • Two headless requests issued for two different clusters at the same time, which I already commented on Teleport Connect headless approval #29097 (comment)
    • Here we wouldn't need to wait as this would be done from two goroutines and the second one would simply not forward the request to Connect.
    • …but what if we didn't drop the requests and instead maintained a "queue" for those requests? This would boil down to using Lock instead of TryLock. We could have another field on daemon.Service with the time of when the lock was last released.
  • The copy in the modal for two different requests is almost identical.
    • For two requests from the same cluster, we could think of showing time of the request, but even that wouldn't help much with the problem you described.

Copy link
Copy Markdown
Contributor Author

@Joerger Joerger Jul 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, we can add a second delay between important modal pop ups coming from the daemon. As Rafał suggested, we can do this by doing a Lock with a wait period afterwards if needed.

Ultimately though, I don't have a good grasp on how this feature is going to be used to warrant spending time addressing such use cases.

I think we are getting close to edge cases with a 0% chance of occurring. Users create their own headless requests so they should not be confused by the popups. I don't know if there is a scenario where a user would need to create 2 headless requests for different clusters at the same time either.

I suppose my main concern is clarity as we don't want a user to approve their headless request, and then receive a malicious headless request, and clicking approve again thinking something went wrong with the first one.

Maybe we should make the Client IP show in larger font as it is the main thing to change if a malicious user made a request. Or we can show the request ID so they might double check it matches the request ID showing from tsh --headless output. For now, I'll bold the Client IP - 39dff7febe576a39ad68f4517b4110eda5ab4bbb, let me know if you think there is a non-ugly way to display the request UUID.

I don't think anything more complicated than that is warranted though.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO showing Client IP in larger font would look weird, making it bold is enough.

I experimented with showing the request UUID, I hope it's no that ugly :p
image

If this looks ok, I can commit it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gzdunek Looks good to me, thanks!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 7ff47f3

Comment thread web/packages/teleterm/src/services/headlessAuthn.ts Outdated
Comment thread lib/teleterm/daemon/daemon.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both the relogin logic in gateways and the headless watcher show an important modal in Connect and focus the app. An important modal is going to cover a regular modal shown in the app.

The Electron app doesn't guard against attempting to show two important modals at the same time. Calling modalsService.openImportantDialog twice will simply remove the first modal and replace it with the new one.

Because the relogin logic in gateways and the headless watcher don't coordinate with each other, it's possible for them to overwrite each other. I don't think that should be the case: in a situation where you requested a headless login somewhere and the app was brought to focus, if a DB GUI client attempts to relogin, it should not overwrite the headless authn modal.

The relogin logic is already written in a way that assumes it might not be able to lock a mutex and will instead fail immediately. I think we could do the same for headless authn. It should boil down to renaming reloginMu to importantModalMu, calling TryLock on it before calling tshdEventsClient.SendPendingHeadlessAuthentication and then logging and ignoring a failed attempt at obtaining a lock.

What do you think about this? Both relogin and headless authn occuring at the same time are rather unlikely, but it would set out a good pattern to follow for any other features using important modals in tshd.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, after reading Grzegorz's comment about two requests in a row, I realized that it's possible for two headless requests from two different clusters to override each other.

Copy link
Copy Markdown
Contributor Author

@Joerger Joerger Jul 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should add importantModalMu to fix the issue, but I don't actually think we should replace reloginMu.

IIUC, there is a second reason for reloginMu - to trigger relogin just once even if multiple gateways in the same cluster trigger relogin at the same time. In this case, TryLock makes sense, and we should keep reloginMu for the TryLock flow.

In general, we want to use a normal importantModalMu.Lock so that multiple important modals can be displayed back to back, without any being dropped.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eff189353ea587d662a89f81d2e0c7436b759061

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. I'm just thinking about the name: tshdEventsClientMu makes it seem like it should guard every call to tshdEventsClient. I feel like importantModalMu would be more clear in this case, but I see you dropped that name in favor of tshdEventsClient.

Copy link
Copy Markdown
Contributor Author

@Joerger Joerger Jul 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I named it tshdEventsClientMu to include the tshdEventsClient.SendNotification, but if you think this is only needed for important modals, i will remove it from SendNotification and rename it.

0fa2e3d964ed966fc3569c171a9fb2ea9c777c2d

Comment thread web/packages/teleterm/src/ui/HeadlessAuthn/HeadlessPrompt/HeadlessPrompt.tsx Outdated
@Joerger Joerger force-pushed the joerger/headless-approval-teleport-connect branch from f2162a9 to 39dff7f Compare July 27, 2023 02:31
Copy link
Copy Markdown
Contributor

@gzdunek gzdunek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments left, I think we are almost there.

I really appreciate that you added cancelling the request when the it is resolved externally - it makes the UX muuuch nicer!

Comment thread lib/teleterm/daemon/daemon_headless.go Outdated
Comment thread lib/teleterm/daemon/daemon.go Outdated
Comment thread lib/teleterm/daemon/daemon_headless.go Outdated
Comment thread web/packages/teleterm/src/ui/HeadlessAuthn/HeadlessPrompt/HeadlessPrompt.tsx Outdated
Copy link
Copy Markdown
Member

@ravicious ravicious left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested the scenario with two requests from the same cluster and from two different clusters, also accepting from the Web UI – it all appears to be working just fine.

One last thing if you have the time for it: it'd probably be good to have an integration test which both issues a relogin request and a headless auth request. Take a look at integration/teleterm_test.go and integration/proxy/teleterm_test.go.

It probably doesn't matter where we put the test, they both use the same test setup.

@ravicious
Copy link
Copy Markdown
Member

Did you have a chance to look into writing an integration test that I mentioned in my last review? #29097 (review)

@Joerger
Copy link
Copy Markdown
Contributor Author

Joerger commented Jul 31, 2023

Did you have a chance to look into writing an integration test that I mentioned in my last review? #29097 (review)

Sorry, I was sick at the end of last week. I'm not sure exactly what test you want me to add though. Are you concerned about testing the important modal semaphore logic, or something different?

@ravicious
Copy link
Copy Markdown
Member

Yeah, mostly the logic around semaphores, mutexes and async stuff. Relogin needs to cooperate with headless auth but at the moment we don't have any tests which verify that that's the case.

@Joerger
Copy link
Copy Markdown
Contributor Author

Joerger commented Aug 1, 2023

Yeah, mostly the logic around semaphores, mutexes and async stuff. Relogin needs to cooperate with headless auth but at the moment we don't have any tests which verify that that's the case.

I added TestImportantModalSemaphore to daemon_test.go. I think this test is better suited in the daemon package, as it's harder to create the integration tests with actual relogin, etc.

I'm going to go ahead and merge as is to get it in customer's hands sooner than later, but please let me know if you'd like more tests and I will create a follow up PR.

@ravicious
Copy link
Copy Markdown
Member

Hell yeah, let's go. Those features where the app automatically knows that something's happening feel like magic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

headless-sso size/md teleport-connect Issues related to Teleport Connect. ui

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants