[Assist] Include embeddings in the prompt (#28116)
Conversation
Force-pushed from 0c0dff3 to def2bc4
	return nil, trace.Wrap(err)
}

content, err := ai.SerializeNode(node)
Non-blocking: I have no idea whether YAML serialization is the right tool here.
When prototyping, I instinctively used CSV serialization when feeding nodes to the prompt in order to minimize token usage. I have no evidence CSV is better; I just followed other projects' feedback that table-like structures were better understood. YAML is simpler to implement with the current approach, though (building a CSV table requires knowing all labels, which doesn't work well with row-by-row APIs like this one).
If we observe too-high token usage and/or inaccuracies, we might want to change this, and maybe how the endpoint works as well.
fyi from experience, CSV is usually king for GPT-4 understanding if usable
I changed the implementation to CSV and I see much worse performance. Previously the LLM was almost always right; now it's the opposite. I'll leave the YAML for now. We can try to tweak it later.
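The labels problem mentioned above can be made concrete with a small sketch. This is not the Teleport implementation; `Node` and `serializeCSV` are hypothetical names, assuming nodes carry a free-form label map. The point is that a CSV table needs a stable header built from the union of all label keys, so every node must be known before the first row can be written, whereas YAML-style serialization can emit each node independently:

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"sort"
)

// Node is a hypothetical stand-in for the resource being serialized.
type Node struct {
	Name   string
	Labels map[string]string
}

// serializeCSV illustrates the drawback discussed in the thread: the
// header is the union of all label keys, so the whole node set must be
// collected up front, which clashes with row-by-row APIs.
func serializeCSV(nodes []Node) (string, error) {
	keySet := map[string]struct{}{}
	for _, n := range nodes {
		for k := range n.Labels {
			keySet[k] = struct{}{}
		}
	}
	keys := make([]string, 0, len(keySet))
	for k := range keySet {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	if err := w.Write(append([]string{"name"}, keys...)); err != nil {
		return "", err
	}
	for _, n := range nodes {
		row := []string{n.Name}
		for _, k := range keys {
			// Nodes missing a label get an empty cell.
			row = append(row, n.Labels[k])
		}
		if err := w.Write(row); err != nil {
			return "", err
		}
	}
	w.Flush()
	return buf.String(), w.Error()
}

func main() {
	out, _ := serializeCSV([]Node{
		{Name: "db-1", Labels: map[string]string{"env": "prod"}},
		{Name: "web-1", Labels: map[string]string{"env": "dev", "team": "core"}},
	})
	fmt.Print(out)
}
```

Per-node YAML avoids the two-pass requirement entirely, which matches the observation that it fits the current row-by-row endpoint better.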
func (e *EmbeddingProcessor) Run(ctx context.Context, initialDelay, period time.Duration) error {
	initTimer := time.NewTimer(initialDelay)
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-initTimer.C:
			// Stop the timer after the initial delay.
			initTimer.Stop()
			e.process(ctx)
		case <-time.After(e.jitter(period)):
			e.process(ctx)
		}
	}
Non-blocking: it looks like a great use case for the interval util: https://github.com/gravitational/teleport/blob/master/lib/utils/interval/interval.go
I'm not sure how the interval util could be used here. In this case, I want to run the timer once, and the interval utils only provide tools to run an action periodically. I could even use time.Sleep here for simplicity, except that would break ctx support for the first run.
	Name() string
	Description() string
-	Run(ctx context.Context, input string) (string, error)
+	Run(ctx context.Context, input string) (*stepOutput, error)
I'm not sure I get all the implications of this change. What happens if the tool returns a stepOutput with action set? If the goal is to either return an observation or finish execution, maybe a dedicated structure would be clearer.
@xacrimon Do you have any opinions on that? The problem that I have currently is that when a user doesn't have access to any nodes, the empty output confuses the execution, and the LLM keeps asking about the same thing.
Preferably, can we respond with an English-language text hint telling it there are no nodes instead of an empty string? Responding with a stepOutput breaks the nice tool/exec loop encapsulation, so I'm not a big fan of this.
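The "dedicated structure" idea floated above could look something like the following. This is purely a hypothetical sketch (the names `toolResult`, `observe`, and `finish` are invented here, not from the PR): a tool either produces an observation for the next reasoning step or terminates execution, and by construction can never inject a new action into the loop:

```go
package main

import "fmt"

// toolResult is a hypothetical alternative to returning *stepOutput from
// tools: it can carry an observation or a final answer, but never an action.
type toolResult struct {
	observation string
	finished    bool
	finalAnswer string
}

// observe wraps an intermediate observation for the agent loop.
func observe(s string) toolResult { return toolResult{observation: s} }

// finish terminates execution with a final answer.
func finish(answer string) toolResult {
	return toolResult{finished: true, finalAnswer: answer}
}

func main() {
	// A tool that finds no nodes can return an explicit English hint
	// rather than an empty string, per the suggestion in the thread.
	r := observe("No nodes matched the query; the user may lack access to any nodes.")
	fmt.Println(r.finished, r.observation)
}
```

This keeps the tool/exec loop encapsulation the reviewer wants: the executor alone decides the next action, and an "empty result" becomes a self-describing observation instead of a confusing blank.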
Force-pushed from f1506e1 to d436c90
message GetAssistantEmbeddingsRequest {
	// username is a username of the user who request the embeddings.
	string username = 1;
	// contentQuery is the content that going to be used for similarity search.
"content" in the description seems ambigous/unclear?
// GetAssistantEmbeddingsRequest is a request to get embeddings.
message GetAssistantEmbeddingsRequest {
	// username is a username of the user who request the embeddings.
Suggested change:
-	// username is a username of the user who request the embeddings.
+	// username is a username of the user who requested the embeddings.
	// limit is the number of embeddings to return (also known as k).
	uint32 limit = 3;
	// kind is the kind of embeddings to return (ex, node).
	string kind = 4;
Should this be a oneof? Stringly-typed enumerations are rather fragile/unclear, and this doesn't describe all possible/valid states either.
Why oneof and not enum then? My idea was to keep it flexible, but after adding more types we will probably need to re-route them to different sources, so we will need to rely on that type anyway.
// GetAssistantEmbeddingsResponse is a response from the assistant service.
message GetAssistantEmbeddingsResponse {
	// embeddings is the list of embeddings.
	repeated EmbeddedDocument embeddings = 1;
Should we specify that these are ordered?
Force-pushed from d436c90 to f040fd6
GCI Minor fixes
Apply code review suggestions.
Force-pushed from f040fd6 to ba9d5ed
* [Assist] Include embeddings in the prompt
* Add comments GCI Minor fixes
* Move stuff
* Fix tests
* Fix tests
* Fixes after rebase Apply code review suggestions.
* Address review comments
* After rebase fix
* [Assist] Scaffold the chat-loop onto a multi-step thinking model (#27075)
  * agent scaffold conversion
  * command input validation
  * rename Agent.Think and replace debug logs with trace logs
  * doc
  * action docs
  * godocs
  * clarify
  * remove unused code
  * remove tests which relied on the old non-agent model interaction with the llm
  * fix broken e
* Add node name to the Assist execution result (#27635)
  Currently, only the node ID is returned in the command execution result in Assist. For better UX we want to display the node name, which is more human-friendly than a node ID, which is a UUID. Adding the value to the returned payload sounds cheaper than calling an API to get node names.
  * Add test
  * Extract commandExecResult struct
  * Fix test after rebase
* Fix command execution test flakiness (#27704)
  Fixes the
  ```
  --- FAIL: TestExecuteCommand (1.46s)
      testing.go:1206: TempDir RemoveAll cleanup: unlinkat /tmp/TestExecuteCommand3553793052/002/log/upload/streaming/default: directory not empty
  FAIL
  ```
  error.
* [Assist] Fix panic when writing to one WS from multiple threads (#27828)
  Fixes https://github.com/gravitational/teleport.e/issues/1650
  * Remove mutex on SetReadDeadline
  * Move SetPongHandler
  * Fix typos
* Fix command output showing when running on multiple nodes (#27936)
* ai: Add a node embedding watcher (#27204)
  * ai: add embeddings basic support
    - add Embeddings service and its local implementation
    - add Embedding type and proto message
    - add nodeEmbeddingCollector tracking nodes
    - add NodeEmbeddingWatcher watching for events and sending them to the collector
    - add the Embedder interface and its openai implementation
  * ai: adapt embeddings to the vector index
  * fixup! ai: adapt embeddings to the vector index
  * fixup! fixup! ai: adapt embeddings to the vector index
  * Update lib/service/service.go (Co-authored-by: Jakub Nyckowski <jakub.nyckowski@goteleport.com>)
  * address feedback pt.1
  * address feedback pt.2: store protobuf message in backend
  * address feedback pt.3: have GetEmbeddings return a stream
  * Update lib/services/embeddings.go (Co-authored-by: rosstimothy <39066650+rosstimothy@users.noreply.github.com>)
  * Apply suggestions from code review (Co-authored-by: Edoardo Spadolini <edoardo.spadolini@goteleport.com>)
  * address feedback pt.4: extract embedding logic out of Embeddings service
  * fixup! address feedback pt.4: extract embedding logic out of Embeddings service
  * address feedback pt.5: simpler error handling when embedding fails
  * fix tests pt.1
  * fix tests pt.2
  * fix tests pt.3
* [Assist] Replace embedding watcher (#27953)
  Change the way the embeddings are calculated. Instead of creating a watcher in Auth, we will process all nodes every hour and recompute embeddings if any embeddings are missing or any node has been updated.
  Co-authored-by: Jakub Nyckowski <jakub.nyckowski@goteleport.com>
  Co-authored-by: rosstimothy <39066650+rosstimothy@users.noreply.github.com>
  Co-authored-by: Edoardo Spadolini <edoardo.spadolini@goteleport.com>
* Restore `lib/ai` tests (#28077)
  The tests were removed as part of #27075. This PR updates the tests to use the new logic.
  * Fix tests
  * Restore lib/web tests
  * GCI
  * Move test handler to a common place
  * Fix used token test
  * Add comment
* Remove duplicate imports (#27886)
* [Assist] Remove the empty assist message (#28125)
  Assist shows an empty message at the beginning of each conversation when reading it from the DB. This PR fixes that behavior and adds a test to prevent this from happening in the future.
* Address code review comments
* Address code review comments
* Skip embedding processor on Cloud Non-Team plan (#28197)
* ai: compute opportunistic summary of command execution (#28033)
  * ai: compute opportunistic summary of command execution
  * ai: add streaming summary back after rebase on new front-end
  * Lint and fix tests pt.1
  * reference nodes by name and add tests
  * Lint, fix tests and address feedback
  * Attempt to tame the stream close monster
  * fixup! Attempt to tame the stream close monster
* [Assist] Do not close the WS after command execution (#28246)
  * Revert "fixup! Attempt to tame the stream close monster" (reverts commit 8537aa2)
  * Revert "Attempt to tame the stream close monster" (reverts commit e0c861d)
  * Do not close the WS after command execution
  * Fix tests and lint
  * fixup! Fix tests and lint
  * undo put web test command into constant
  Co-authored-by: Jakub Nyckowski <jakub.nyckowski@goteleport.com>
* [Assist] Include embeddings in the prompt (#28116)
  * [Assist] Include embeddings in the prompt
  * Add comments GCI Minor fixes
  * Move stuff
  * Fix tests
  * Fix tests
  * Fixes after rebase Apply code review suggestions.
  * Address review comments
  * After rebase fix
* Improve error handling and embedding prompts; fix typos (#28403)
  * "Improve error handling and embedding prompts; fix typos"
    This commit encompasses several changes. First, an error-handling routine has been added in AssistContext.tsx to properly close a WebSocket connection and finish all results. The intent is to ensure that execution fails gracefully when a session doesn't end normally. In tool.go, user instructions have been made more explicit to ensure users check access to nodes before generating any commands. It warns them that not checking access will cause an error. Also, some minor typos were corrected in agent.go and messages.go for better readability.
  * "Refactor 'hosts' to 'nodes' in AI Tool Descriptions"
    This commit refactors the language from 'host' terminology to 'node' terminology in the AI tool's generated responses, as the LLM seems to be confused when generating queries with embeddings.
  * Update expected test values in chat_test.go
    The expected values in three different tests in chat_test.go have been updated. This change was required because the underlying algorithm has been adjusted, and these modifications will keep the tests aligned with the current algorithm's behavior.
  * Add missing imports
* Introduce user preferences (#28291)
  * Add user preferences feature
  * Add missing license header
  * Fix the order of arguments to require.Equal
  * Update lib/web/userpreferences.go (Co-authored-by: Michelle Bergquist <11967646+michellescripts@users.noreply.github.com>)
  * Add a `GetUserPreferencesResponse` message
  * Remove unused logger
  * Use .Put instead of .Create/.Update
  * Add missing godoc
  * trace.Wrap the happy path
* Shut down embedding processor on graceful exit (#28356)
* Refactor websocket termination and stream handling (#28452)
  * Refactor websocket termination and stream handling
    Refactored websocket stream shutdown and error handling. Replaced `Close()` with `SendCloseMessage()` for better control over the websocket connection termination process. Added checks for the validity of channels to prevent reading from closed channels. The commit also includes minor typo fixes.
  * Remove unused completedC
  * Remove unnecessary select blocks in terminal.go
    The select blocks used in terminal.go for reading data from channels were unnecessary, as we were just pulling from a single channel. Removed the select block and directly attempted to read from the channel. These changes increase code readability and integrity by removing unnecessary select blocks. In command_test.go, an explanatory comment was added for clarity.
  * Remove commented code
  * Replace trace.NewAggregate with trace.Wrap, as aggregation is not needed
* Add the UI for Assist's settings (#28413)
  * Add the UI for Assist's settings
  * Add typing
  * Fix test by wrapping render in LayoutContextProvider
  * Run prettier
* Assist: fix summary logic (#28487)
  * Update command.go
  * simplify export signature
* assist: add classification code (#28221)
* [Assist] Provide interactive updates during agent execution (#27893)
  * send progress update messages during agent thoughts
  * handle new output format
  * define json tags for serialized fields
  * use streaming api
  * fan streaming from model loop
  * fix streaming
  * stream progress updates
  * Update lib/assist/assist.go (Co-authored-by: Jakub Nyckowski <jakub.nyckowski@goteleport.com>)
  * remove useless mute
  * nits
  * Update lib/ai/model/agent.go (Co-authored-by: Jakub Nyckowski <jakub.nyckowski@goteleport.com>)
  * fix merge
  * fix misc
  * more misc fixes
  * what
  * what2
  * weird eof errors?
  * Fix tests UI integration
  * Fix other tests
  * Linter fixes
  * Comment out token counting for assist streams to avoid race condition
  * Fix more tests
* Remove console.log in AssistContext (#28607)

Co-authored-by: Joel <jwejdenstal@goteleport.com>
Co-authored-by: Ryan Clark <ryan.clark@goteleport.com>
Co-authored-by: Hugo Shaka <hugo.hervieux@goteleport.com>
Co-authored-by: rosstimothy <39066650+rosstimothy@users.noreply.github.com>
Co-authored-by: Edoardo Spadolini <edoardo.spadolini@goteleport.com>
Co-authored-by: Zac Bergquist <zac.bergquist@goteleport.com>
Co-authored-by: Justinas Stankevičius <justinas@users.noreply.github.com>
Co-authored-by: Michelle Bergquist <11967646+michellescripts@users.noreply.github.com>
Closes https://github.com/gravitational/teleport.e/issues/1660
Notes:
The embedding service runs in the Auth, and we keep the embeddings in memory for similarity search as we don't have any vector DB available. For that reason, the size is capped at 1000 nodes. This should be enough for most non-enterprise customers. The vector in-memory index is created 10 seconds after the auth starts, and it's updated every hour.
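With the index capped at around 1000 nodes, a brute-force linear scan per query is cheap enough that no dedicated vector DB is needed, and results naturally come back sorted by similarity. The following is a hypothetical sketch of such an in-memory search (the `embedding`, `cosine`, and `topK` names are illustrative, not the actual Teleport index), assuming cosine similarity as the distance measure:

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// embedding pairs a document ID with its vector.
type embedding struct {
	id     string
	vector []float64
}

// cosine computes the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// topK returns the IDs of the k embeddings most similar to the query,
// ordered from most to least similar, via a brute-force scan.
func topK(index []embedding, query []float64, k int) []string {
	sort.SliceStable(index, func(i, j int) bool {
		return cosine(index[i].vector, query) > cosine(index[j].vector, query)
	})
	if k > len(index) {
		k = len(index)
	}
	ids := make([]string, 0, k)
	for _, e := range index[:k] {
		ids = append(ids, e.id)
	}
	return ids
}

func main() {
	index := []embedding{
		{"node-a", []float64{1, 0}},
		{"node-b", []float64{0, 1}},
		{"node-c", []float64{0.9, 0.1}},
	}
	fmt.Println(topK(index, []float64{1, 0}, 2))
}
```

At 1000 entries this is O(n·d) per query, which easily fits the stated scale; the cap is what keeps the memory-resident approach viable without a vector database.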