Skip to content

Assist - Execution web endpoint#25955

Merged
jakule merged 10 commits intomasterfrom
jakule/ai-execute-port
May 11, 2023
Merged

Assist - Execution web endpoint#25955
jakule merged 10 commits intomasterfrom
jakule/ai-execute-port

Conversation

@jakule
Copy link
Copy Markdown
Contributor

@jakule jakule commented May 9, 2023

This PR implements the execution endpoint used by Teleport Assist. Command passed in the request is being executed on multiple nodes. Nodes are picked based on the query also passed in the request. The command is being executed as non-interactive SSH command, and the result is being recorded, which is a different behavior rather than calling tsh ssh command where the output is not saved.

@jakule jakule force-pushed the jakule/ai-execute-port branch 2 times, most recently from c4865ff to 7739c27 Compare May 9, 2023 22:13
@jakule jakule marked this pull request as ready for review May 9, 2023 23:22
@jakule jakule added the assist label May 9, 2023
@github-actions github-actions Bot requested review from nklaassen and rudream May 9, 2023 23:23
@jakule jakule requested review from r0mant and zmb3 May 9, 2023 23:23
@jakule jakule self-assigned this May 10, 2023
Copy link
Copy Markdown
Collaborator

@zmb3 zmb3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is a bit scary to me just because it's not as isolated as the previous PRs, and touches a bunch of critical code (starting sessions, streaming output, etc)

How much testing was done (outside of Assist) to ensure that this does not introduce regressions in other parts of the product?

Comment thread lib/srv/ctx.go Outdated
Comment thread lib/srv/sess.go Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this always missing or is this new because of the assist changes?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was never needed as non interactive session was never recorded. This is also related to #21267, which shows that it's not fully reliable.

Comment thread lib/web/apiserver.go Outdated
Comment thread lib/web/command.go Outdated
Comment thread lib/web/command.go Outdated
Comment thread lib/web/command_utils.go Outdated
Comment thread lib/web/command_utils.go Outdated
Comment thread lib/web/command_utils.go Outdated
Comment thread lib/web/terminal.go Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func NewWStream(ws WSConn) *WsStream {
func NewWStream(ws WSConn) *WSStream {

If we're going to use WS to mean "websocket" then lets consistently capitalize it.

Comment thread lib/web/terminal.go Outdated
Comment thread lib/web/apiserver.go Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use GetAllResources you can drop the type assertions

Suggested change
resources, err := apiclient.GetResourcesWithFilters(ctx, clt, proto.ListResourcesRequest{
resources, err := apiclient.GetAllResources[types.Server](ctx, clt, &proto.ListResourcesRequest{

Comment thread lib/web/command.go Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There looks to be some duplicate/overlapping code between this and the TerminalHandler. Can we refactor things so that it only needs to be implemented a single time in lib/web?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I extracted more shared code. Let me know it you're ok with the current state.

Copy link
Copy Markdown
Collaborator

@r0mant r0mant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't have too many comments besides what others have already pointed out. If there's a way to reuse more code between command executor and existing web terminal, let's do it.

Comment thread lib/client/api.go Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little strange that this is not exported while everything else is. Are callers outside of this package not supposed to be able to set this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread lib/web/apiserver.go Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return session.Session{}, err
return session.Session{}, trace.Wrap(err)

Comment thread lib/web/command.go Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can query be empty? It's not validated here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea behind it was that an empty query returns all nodes, so the client passing an empty query wants to execute a command on all nodes (that they have access to). I'm not aware of how you can create a query asking to do that, so that was kind of a workaround, but the more I think about it more I lean to block this edge case, as someone may, by mistake execute a command on all nodes.

Comment thread lib/web/command.go Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed?

Suggested change
h.log.WithError(err).Debug("Unable to get auth access point.")

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread lib/web/command.go Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? The caller should log the error?

Suggested change
h.log.WithError(err).Debug("Unable to fetch cluster networking config.")

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread lib/web/command.go Outdated
Comment thread lib/web/command.go Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not return the actual error in this case? Otherwise may be confusing why you get "no servers found" without looking in the logs.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.. kind of.

Comment thread lib/web/command.go Outdated
Comment thread lib/web/command.go Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to keep logging every error additionally? Caller should log it, no?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The caller does not log it. Our server handler assumes that the caller is an HTTP client, and then the error is returned. Here we've already upgraded the connection to WebSocket, so the client doesn't care about the HTTP responses.

I can remove the log messages, but then there is no trace in the logs of what went wrong. I think we have two options here:

  1. Leave as it is as it matches the implementation of TerminalHandler, for example, https://github.com/gravitational/teleport/blob/6db605fbb854d67aea96398b4e12abccdf61e4f8/lib/web/apiserver.go#L2619
  2. I can clean it up a bit by:
  • returning error as HTTP response - the client should be able to read those
  • after the WS connection is created, return any error as a WS message. The client should be able to handle those.
    As a side effect, we will have no indication in Teleport logs of what happens, which can be a good and bad thing.

WDYGT @r0mant @zmb3

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vote for #1 for now.

There are a few other related issues where we fail to respond with errors after the WS upgrade completes, I think we should fix them all at once and not make you do it now.

@jakule jakule force-pushed the jakule/ai-execute-port branch from 323fb10 to 08cac88 Compare May 11, 2023 04:05
@jakule jakule requested review from r0mant, rosstimothy and zmb3 May 11, 2023 04:05
Comment thread lib/web/command_utils.go
"github.com/gravitational/trace"
)

// WSConn is a gorilla/websocket minimal interface used by our web implementation.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain in the godoc why do we want to override Close method?

Fix linter
@jakule jakule added this pull request to the merge queue May 11, 2023
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 11, 2023
Comment thread lib/client/api.go
env[sshutils.SessionEnvVar] = tc.SessionID
}

for key, val := range tc.extraEnvs {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each of these additional environment variables we set on the session will result in an additional roundtrip between client and target before the session is available.

Comment thread constants.go Outdated
SSHSessionID = "SSH_SESSION_ID"

// EnableNonInteractiveSessionRecording can be used to record non-interactive SSH session.
EnableNonInteractiveSessionRecording = "SSH_RECORD_NON_INTERACTIVE"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we prefix this with TELEPORT_?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to SSH_TELEPORT_RECORD_NON_INTERACTIVE to match our other names.

@jakule jakule enabled auto-merge May 11, 2023 20:09
@jakule jakule added this pull request to the merge queue May 11, 2023
Merged via the queue into master with commit 9a89404 May 11, 2023
@jakule jakule deleted the jakule/ai-execute-port branch May 11, 2023 20:37
jakule added a commit that referenced this pull request May 11, 2023
* Assist - Execution web endpoint

* Add test
Clean up code a bit

* Add missing username

* Address review comments

* Make more implementation shared between Terminal and Command Web Handlers

* Address review comments

* Address review comments

* Fixes after rebase
Add comments

* Add comments
Fix linter

* Add TELEPORT to Teleport related environment variable.
jakule added a commit that referenced this pull request May 12, 2023
* Assist - Execution web endpoint (#25955)

* Assist - Execution web endpoint

* Add test
Clean up code a bit

* Add missing username

* Address review comments

* Make more implementation shared between Terminal and Command Web Handlers

* Address review comments

* Address review comments

* Fixes after rebase
Add comments

* Add comments
Fix linter

* Add TELEPORT to Teleport related environment variable.

* Fixes after merge
jakule added a commit that referenced this pull request Jun 2, 2023
* Assist - Execution web endpoint

* Add test
Clean up code a bit

* Add missing username

* Address review comments

* Make more implementation shared between Terminal and Command Web Handlers

* Address review comments

* Address review comments

* Fixes after rebase
Add comments

* Add comments
Fix linter

* Add TELEPORT to Teleport related environment variable.
jakule added a commit that referenced this pull request Jun 3, 2023
* Assist - API implementation (#25810)

* Assist - API implementation

* Add Assistant CRUD test

* Move assistant API to a separate file.

* Rename GRPC methods
Fix comments

* Address PR comments

* Update GRPC API - addressing code review comments

* Move assist API to a new GRPC service

* Add Assistant RBAC rules to this backport.

* Add missing license

* Update comment.

* Move username and ID out of AssistantMessage (#25964)

* Move username and ID out of AssistantMessage

#25810 added username and conversation ID to AssistantMessage. They aren't needed inside the message and can be moved out of it.
This PR also fixes ` grpc: error while marshaling: proto: Marshal called with nil` in `CreateAssistantMessage`.

Note: It's safe to change protobuf numbers as this code hasn't been released or backported yet.

* Fix test

* make grpc

* Assist - Configuration and usage (#25953)

* Assist - Configuration and usage

* Add network config test

* Add config test

* Run GCI

* Address review comments

* Assist - OpenAI library port (#25948)

* Assist - OpenAI library port

* Add tests

* Address code review comments

* Added comment

* Partial backport of #26058

* Move AI messages to a new file

* Prevent blocking on error

* Add comments.
Fix typo

* Assist - Execution web endpoint (#25955)

* Assist - Execution web endpoint

* Add test
Clean up code a bit

* Add missing username

* Address review comments

* Make more implementation shared between Terminal and Command Web Handlers

* Address review comments

* Address review comments

* Fixes after rebase
Add comments

* Add comments
Fix linter

* Add TELEPORT to Teleport related environment variable.

* Assist - Web endpoints (#26046)

* Assist - Web endpoints

* GCI

* proto rpc

* enforce endpoint checks

* disable in ui if disallowed by auth

* add comment

* unwrap stack

* Refactor code

---------

Co-authored-by: Joel Wejdenstål <jwejdenstal@icloud.com>

* [Assist] Fix random user selection (#26183)

Currently, after selecting a user for command execution in Teleport Assist the user can randomly change. This PR fixes this behavior.

* Add rate limiting to Assist (#26011)

* Add rate limiting to Assist

* Only rate limit Assist in Cloud

* Add a comment to assistantLimiter

* Fixes after rebase

* Add 'rate-limited' test case to assistant_test

* Handle CHAT_MESSAGE_UI in Assist web UI

* Add godoc

* CHAT_MESSAGE_UI -> CHAT_MESSAGE_ERROR

* Run assistant test cases in parallel

* Rename `ChatGPT` to `GPT-4` (#26272)

* Rename `ChatGPT` to `GPT4`

ChatGPT is a user friendly name, but is technically inaccurate.

* Apply suggestions from code review

Co-authored-by: Reed Loden <reed@goteleport.com>

* Lint fix

* Additional ChatGPT --> OpenAI GPT-4 fixes

---------

Co-authored-by: Reed Loden <reed@goteleport.com>

* Fix Assist rate-limiting in Cloud (#26342)

When Proxy is separate from Auth,
Proxy 'modules' will not contain meaningful data.
Instead, one must use ClusterFeatures fetched from the Auth server

* Assist UI improvements (#26365)

* Update assist warning wording, add link to ToS (#26396)

* Always render the portal for the assist title to go into (#26733)

* Stop using TokenPath for API key in Assist (#26671)

* [Assist] MFA support (#26719)

* Initial support for MFA in Assist

* UI webauth handler

* WebUI - WIP

* Run prettier

* Perform MFA ceremony only once.

* Cleanup JS

* Remove hacky WS logic

* Add cancel MFA logic

* [Assist] Prevent creating messages without conversation (#26797)

* Add parallel execution to assist. (#26563)

* Add parallel execution to assist.

* Extract execution logic to a new function.

* Add test

* Switch uber to std

* Address code review comments

* Fix display of Assist command executions with empty output (#27010)

* Fix display of Assist executions with empty output

* Lint

* Nit

* Improve display of commands that failed with code

* Address lint

* [Assist] Allow removing assist conversations (#26788)

* [Assist] Allow removing assist conversations

* Display landing page after the conversion is removed

* Improve styling and add a confirmation dialog

* Change the icon opacity to copy the main navigation

* Remove unused minus icon

* Add missing trace.wrap

---------

Co-authored-by: Ryan Clark <ryan.clark@goteleport.com>

* Backport fixes

* Regenerate go.sum to make the linter happy.

* Sync files after rebase

* After rebase fixes

---------

Co-authored-by: Joel Wejdenstål <jwejdenstal@icloud.com>
Co-authored-by: Justinas Stankevičius <justinas@users.noreply.github.com>
Co-authored-by: Mike Jensen <jentfoo@users.noreply.github.com>
Co-authored-by: Reed Loden <reed@goteleport.com>
Co-authored-by: Ryan Clark <ryan.clark@goteleport.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants