Add rate limiting to Assist by justinas · Pull Request #26011 · gravitational/teleport

justinas · 2023-05-10T17:34:19Z

Rate limit Assist in Cloud based on the amount of tokens used (cluster-wide). This is a naive implementation that only stores the counter in memory.

justinas · 2023-05-10T17:34:45Z

Latest discussion about the rate limit: https://gravitational.slack.com/archives/C0509EYASCW/p1683738890958669

justinas · 2023-05-10T17:35:09Z

Using this instead of oxy ratelimiter, as the latter is quite tightly coupled with individual http.Request-s.

Can you add this PR command as a code command too? The reasoning won't be easily visible to people after this PR is merged.

jakule

@justinas Can we add some tests to it?

jakule · 2023-05-12T15:27:32Z

Can you add this PR command as a code command too? The reasoning won't be easily visible to people after this PR is merged.

jakule · 2023-05-12T15:28:46Z

Do we have an issue to trace it? If yes, can you link it here?

jakule · 2023-05-12T15:38:36Z

This should not be an assistant message. We should create a new message type, so the UI can react to it instead of displaying an error as a regular message. We should also close the WS connection as a client should not write anything more.

I have made a new message type, however I kept the frontend behavior the same for now. WS connection now closes

justinas · 2023-05-15T12:31:58Z

@justinas Can we add some tests to it?

Added a test case where rate limit is hit.

jakule

I'm fine with the backend changes, but please wait for @ryanclark to approve the UI changes.

jakule · 2023-05-15T13:18:39Z

 ): Promise<MessagesAction> {
-  if (message.type === 'CHAT_MESSAGE_ASSISTANT') {
+  if (
+    message.type === 'CHAT_MESSAGE_ASSISTANT' ||


@ryanclark Can you take a look at it?

jakule · 2023-05-15T13:21:45Z

-						TotalTokens:      int64(usedTokens.Prompt + usedTokens.Competition),
+						TotalTokens:      int64(usedTokens.Prompt + usedTokens.Completion),
 						PromptTokens:     int64(usedTokens.Prompt),
-						CompletionTokens: int64(usedTokens.Competition),


Lol, sorry for that 😅

jakule · 2023-05-15T13:23:21Z

+		// Try to consume a small amount of tokens first.
+		const lookaheadTokens = 100
+		if !h.assistantLimiter.AllowN(time.Now(), lookaheadTokens) {
+			err := onMessageFn(assist.MessageKindUIMessage, []byte("You have reached the rate limit. Please try again later."), h.clock.Now().UTC())


Should we send some pot-hog event when this happens? Maybe this question doesn't make sense, but I've got no idea how we are using post-hog now.

Might be a good idea, I'll mark it down somewhere. Just don't want to complicate this PR further for now.

tigrato · 2023-05-15T13:21:40Z

 	// SSOLoginFailureMessage is a generic error message to avoid disclosing sensitive SSO failure messages.
 	SSOLoginFailureMessage = "Failed to login. Please check Teleport's log for more details."
+
+	assistantTokensPerHour = 140


tigrato · 2023-05-15T13:23:49Z

+	assistantTokensPerHour = 140
+	// assistantLimiterRate is the rate (in tokens per second)
+	// at which tokens for the assistant rate limiter are replenished
+	assistantLimiterRate = rate.Limit(assistantTokensPerHour / float64(time.Hour/time.Second))


nit:

Suggested change

assistantLimiterRate = rate.Limit(assistantTokensPerHour / float64(time.Hour/time.Second))

assistantLimiterRate = rate.Limit(assistantTokensPerHour / time.Hour.Seconds())

time.Hour.Seconds() sadly can not be used in a const definition

ryanclark

LGTM, we should probably style CHAT_MESSAGE_UI messages a bit differently than normal messages to show that it's an error or something, but that can be done in another PR

ryanclark · 2023-05-15T13:29:08Z

 	// MessageKindSystemMessage is the type of Assist message that contains the system message.
 	MessageKindSystemMessage MessageType = "CHAT_MESSAGE_SYSTEM"
+	// MessageKindUIMessage is the type of Assist message that is presented to user as information, but not stored persistently in the conversation. This can include backend error messages and the like.
+	MessageKindUIMessage MessageType = "CHAT_MESSAGE_UI"


Will this ever be anything other than an error?

Honestly dunno, for now this rate-limit is the only case 🤷

@justinas @ryanclark Can we just change the name to CHAT_MESSAGE_ERROR and keep using that type for all assist-related messages?

Yeah I think we should

public-teleport-github-review-bot · 2023-05-15T16:57:58Z

@justinas See the table below for backport results.

Branch	Result
branch/v12	Failed

* Add rate limiting to Assist * Only rate limit Assist in Cloud * Add a comment to assistantLimiter * Fixes after rebase * Add 'rate-limited' test case to assistant_test * Handle CHAT_MESSAGE_UI in Assist web UI * Add godoc * CHAT_MESSAGE_UI -> CHAT_MESSAGE_ERROR * Run assistant test cases in parallel

* Add rate limiting to Assist (#26011) * Add rate limiting to Assist * Only rate limit Assist in Cloud * Add a comment to assistantLimiter * Fixes after rebase * Add 'rate-limited' test case to assistant_test * Handle CHAT_MESSAGE_UI in Assist web UI * Add godoc * CHAT_MESSAGE_UI -> CHAT_MESSAGE_ERROR * Run assistant test cases in parallel * Fix incorrect merge --------- Co-authored-by: Justinas Stankevičius <justinas@users.noreply.github.com>

* Add rate limiting to Assist * Only rate limit Assist in Cloud * Add a comment to assistantLimiter * Fixes after rebase * Add 'rate-limited' test case to assistant_test * Handle CHAT_MESSAGE_UI in Assist web UI * Add godoc * CHAT_MESSAGE_UI -> CHAT_MESSAGE_ERROR * Run assistant test cases in parallel

* Assist - API implementation (#25810) * Assist - API implementation * Add Assistant CRUD test * Move assistant API to a separate file. * Rename GRPC methods Fix comments * Address PR comments * Update GRPC API - addressing code review comments * Move assist API to a new GRPC service * Add Assistant RBAC rules to this backport. * Add missing license * Update comment. * Move username and ID out of AssistantMessage (#25964) * Move username and ID out of AssistantMessage #25810 added username and conversation ID to AssistantMessage. They aren't needed inside the message and can be moved out of it. This PR also fixes ` grpc: error while marshaling: proto: Marshal called with nil` in `CreateAssistantMessage`. Note: It's safe to change protobuf numbers as this code hasn't been released or backported yet. * Fix test * make grpc * Assist - Configuration and usage (#25953) * Assist - Configuration and usage * Add network config test * Add config test * Run GCI * Address review comments * Assist - OpenAI library port (#25948) * Assist - OpenAI library port * Add tests * Address code review comments * Added comment * Partial backport of #26058 * Move AI messages to a new file * Prevent blocking on error * Add comments. Fix typo * Assist - Execution web endpoint (#25955) * Assist - Execution web endpoint * Add test Clean up code a bit * Add missing username * Address review comments * Make more implementation shared between Terminal and Command Web Handlers * Address review comments * Address review comments * Fixes after rebase Add comments * Add comments Fix linter * Add TELEPORT to Teleport related environment variable. * Assist - Web endpoints (#26046) * Assist - Web endpoints * GCI * proto rpc * enforce endpoint checks * disable in ui if disallowed by auth * add comment * unwrap stack * Refactor code --------- Co-authored-by: Joel Wejdenstål <jwejdenstal@icloud.com> * [Assist] Fix random user selection (#26183) Currently, after selecting a user for command execution in Teleport Assist the user can randomly change. This PR fixes this behavior. * Add rate limiting to Assist (#26011) * Add rate limiting to Assist * Only rate limit Assist in Cloud * Add a comment to assistantLimiter * Fixes after rebase * Add 'rate-limited' test case to assistant_test * Handle CHAT_MESSAGE_UI in Assist web UI * Add godoc * CHAT_MESSAGE_UI -> CHAT_MESSAGE_ERROR * Run assistant test cases in parallel * Rename `ChatGPT` to `GPT-4` (#26272) * Rename `ChatGPT` to `GPT4` ChatGPT is a user friendly name, but is technically inaccurate. * Apply suggestions from code review Co-authored-by: Reed Loden <reed@goteleport.com> * Lint fix * Additional ChatGPT --> OpenAI GPT-4 fixes --------- Co-authored-by: Reed Loden <reed@goteleport.com> * Fix Assist rate-limiting in Cloud (#26342) When Proxy is separate from Auth, Proxy 'modules' will not contain meaningful data. Instead, one must use ClusterFeatures fetched from the Auth server * Assist UI improvements (#26365) * Update assist warning wording, add link to ToS (#26396) * Always render the portal for the assist title to go into (#26733) * Stop using TokenPath for API key in Assist (#26671) * [Assist] MFA support (#26719) * Initial support for MFA in Assist * UI webauth handler * WebUI - WIP * Run prettier * Perform MFA ceremony only once. * Cleanup JS * Remove hacky WS logic * Add cancel MFA logic * [Assist] Prevent creating messages without conversation (#26797) * Add parallel execution to assist. (#26563) * Add parallel execution to assist. * Extract execution logic to a new function. * Add test * Switch uber to std * Address code review comments * Fix display of Assist command executions with empty output (#27010) * Fix display of Assist executions with empty output * Lint * Nit * Improve display of commands that failed with code * Address lint * [Assist] Allow removing assist conversations (#26788) * [Assist] Allow removing assist conversations * Display landing page after the conversion is removed * Improve styling and add a confirmation dialog * Change the icon opacity to copy the main navigation * Remove unused minus icon * Add missing trace.wrap --------- Co-authored-by: Ryan Clark <ryan.clark@goteleport.com> * Backport fixes * Regenerate go.sum to make the linter happy. * Sync files after rebase * After rebase fixes --------- Co-authored-by: Joel Wejdenstål <jwejdenstal@icloud.com> Co-authored-by: Justinas Stankevičius <justinas@users.noreply.github.com> Co-authored-by: Mike Jensen <jentfoo@users.noreply.github.com> Co-authored-by: Reed Loden <reed@goteleport.com> Co-authored-by: Ryan Clark <ryan.clark@goteleport.com>

justinas commented May 10, 2023

View reviewed changes

Comment thread lib/web/assistant.go Outdated

justinas marked this pull request as ready for review May 10, 2023 17:36

justinas requested review from hugoShaka and jakule May 10, 2023 17:36

github-actions Bot added the size/sm label May 10, 2023

github-actions Bot requested review from codingllama and tigrato May 10, 2023 17:39

justinas changed the base branch from dev-ai to master May 11, 2023 13:58

justinas changed the base branch from master to dev-ai May 11, 2023 14:00

justinas force-pushed the justinas/assist-rate-limiter branch from f6788a8 to ed8a4fd Compare May 12, 2023 14:20

justinas changed the base branch from dev-ai to jakule/ai-web-endpoints May 12, 2023 14:20

jakule reviewed May 12, 2023

View reviewed changes

Base automatically changed from jakule/ai-web-endpoints to master May 13, 2023 05:19

Justinas Stankevicius added 5 commits May 15, 2023 14:24

Add rate limiting to Assist

385793f

Only rate limit Assist in Cloud

bcd3b38

Add a comment to assistantLimiter

41b1c4c

Fixes after rebase

54a4ebf

Add 'rate-limited' test case to assistant_test

b598ee0

justinas force-pushed the justinas/assist-rate-limiter branch from ed8a4fd to b598ee0 Compare May 15, 2023 12:16

justinas requested a review from jakule May 15, 2023 12:32

Handle CHAT_MESSAGE_UI in Assist web UI

c52f349

justinas force-pushed the justinas/assist-rate-limiter branch from 0742181 to c52f349 Compare May 15, 2023 12:36

justinas requested a review from ryanclark May 15, 2023 12:38

jakule approved these changes May 15, 2023

View reviewed changes

tigrato approved these changes May 15, 2023

View reviewed changes

public-teleport-github-review-bot Bot removed request for codingllama and ryanclark May 15, 2023 13:26

public-teleport-github-review-bot Bot removed the request for review from hugoShaka May 15, 2023 13:26

ryanclark approved these changes May 15, 2023

View reviewed changes

Add godoc

984c1b8

justinas added the backport/branch/v12 label May 15, 2023

Merge branch 'master' into justinas/assist-rate-limiter

beb658c

justinas enabled auto-merge May 15, 2023 14:06

CHAT_MESSAGE_UI -> CHAT_MESSAGE_ERROR

9b5f09e

justinas force-pushed the justinas/assist-rate-limiter branch from e1ab55c to 9b5f09e Compare May 15, 2023 14:49

Run assistant test cases in parallel

5eb9fb5

justinas added this pull request to the merge queue May 15, 2023

Merged via the queue into master with commit 21c534b May 15, 2023

justinas deleted the justinas/assist-rate-limiter branch May 15, 2023 16:56

jakule mentioned this pull request May 15, 2023

[v12] Add rate limiting to Assist #26270

Merged

	assistantLimiterRate = rate.Limit(assistantTokensPerHour / float64(time.Hour/time.Second))
	assistantLimiterRate = rate.Limit(assistantTokensPerHour / time.Hour.Seconds())

Conversation

justinas commented May 10, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jakule left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

justinas commented May 15, 2023

Uh oh!

jakule left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ryanclark left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

public-teleport-github-review-bot Bot commented May 15, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

justinas commented May 10, 2023 •

edited

Loading