[Security AI Assistant] Chat complete API#184485
Conversation
…menko/kibana into security-ai-assistant-public-chat
…nt-public-chat # Conflicts: # x-pack/plugins/elastic_assistant/server/routes/post_actions_connector_execute.ts
## Summary Migrates our existing RAG pipeline to use LangGraph, and adds tools for Knowledge Base retrieval/storage. When the `assistantKnowledgeBaseByDefault` FF is enabled, a new branch, `callAssistantGraph()`, is taken in `postActionsConnectorExecuteRoute` that exercises the LangGraph implementation. This is a drop-in replacement for the existing `callAgentExecutor()` in effort to keep adoption as clean and easy as possible. The new control flow is as follows: `postActionsConnectorExecuteRoute` -> `callAssistantGraph()` -> `getDefaultAssistantGraph()` -> `isStreamingEnabled ? streamGraph() : invokeGraph()` Graph creation is isolated to `getDefaultAssistantGraph()`, and execution (streaming or not) has been extracted to `streamGraph()` and `invokeGraph()` respectively. Note: Streaming currently only works with `ChatOpenAI` models, but `SimpleChatModelStreaming` was de-risked and just need to discuss potential solutions with @stephmilovic. See [comment here](https://github.com/elastic/kibana/pull/184554/files#diff-ad87c5621b231a40810419fc1e56f28aeb4f8328e125e465dfe95ae0e1c305b8R97-R98). #### DefaultAssistantGraph To start with a predictable and piecemeal migration, our existing `agentExecutor` pipeline has been recreated in LangGraph. It consists of a single agent node, either `OpenAIFunctionsAgent`, or `StructuredChatAgent` (depending on the backing LLM), a tool executing node, and a conditional edge that routes between the two nodes until there are no more function calls chosen by the agent. This varies from our initial implementation in that multiple tool calls are now supported, so a user could ask about their alerts AND retrieve additional knowledge base information in the same response. > [!NOTE] > While `chat_history` has been plumbed into the graph, after discussing with @YulNaumenko we decided to wait to plumb the rest of persistence into the graph until #184485 is merged. I had already plumbed through the `chatTitleGeneration` node ([here](https://github.com/elastic/kibana/pull/184554/files#diff-26038489e9a3f1a14c5ea2ac2954671973d833349ef3ffaddcf9b29ce9e2b96eR33)), and so will just need to include initial conversation creation and append/update operations. #### Knowledge History & KB Tools Knowledge History is now always added in the initial prompt for any KB documents marked as `required`, and two new tools were added for creating and recalling KB entries from within the conversation, `KnowledgeBaseWriteTool` and `KnowledgeBaseRetrievalTool` respectively. All three methods of storing and retrieving KB content use the `kbDataClient` for access, and scopes all requests to the authenticatedUser that made the initial request. Additional Notes: * LangChain dependencies have been updated, and a new dependency on `LangGraph` has been added. ### Checklist Delete any items that are not applicable to this PR. - [X] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md) - [ ] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials * Feature currently behind a FF, documentation to be added once feature is complete. Tracked in elastic/security-docs#5337. - [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios * Test coverage in progress... --------- Co-authored-by: Patryk Kopycinski <contact@patrykkopycinski.com> Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
…nt-public-chat # Conflicts: # x-pack/packages/kbn-langchain/server/language_models/simple_chat_model.ts # x-pack/plugins/elastic_assistant/server/routes/post_actions_connector_execute.ts
…nt-public-chat # Conflicts: # x-pack/packages/kbn-langchain/server/language_models/simple_chat_model.ts # x-pack/plugins/elastic_assistant/server/lib/langchain/execute_custom_llm_chain/index.ts # x-pack/plugins/elastic_assistant/server/routes/helpers.ts # x-pack/plugins/elastic_assistant/server/routes/register_routes.ts
|
Works as expected from playground side. |
…nt-public-chat # Conflicts: # x-pack/plugins/elastic_assistant/server/routes/attack_discovery/post_attack_discovery.ts
| conversationUpdateProps, | ||
| isPatch, | ||
| user: authenticatedUser, | ||
| user: authenticatedUser ?? this.options.currentUser ?? undefined, |
There was a problem hiding this comment.
Do we want to add the same authUser check here for updateConversation() as you did above for createConversation()?
if (!this.options.currentUser) {
throw new Error('AIAssistantConversationsDataClient currentUser is not defined.');
}| responses: | ||
| 200: |
There was a problem hiding this comment.
Thoughts on capturing the chat/complete (non-stream) response interface here as well?
I was going to comment and see if we wanted to return conversationId when persist:true, but then noticed we are just using the ExecuteConnectorResponse interface as part of StaticReturnType, which is what langChainExecute implicitly returns.
I guess more generally, what are your thoughts on chat/complete replacing /actions/connector/{connectorId}/_execute route? Would be nice to get away from that initial implementation that was mirroring the actions framework.
| default: () => '', | ||
| }, | ||
| conversation: { | ||
| value: (x: ConversationResponse | undefined, y?: ConversationResponse | undefined) => |
There was a problem hiding this comment.
nit: This had me do a double-take as I was like "Why is the conversation response being stored in state, wouldn't that be computed from messages?", then I realized it's just the current persisted state of the conversation.
Maybe we should alias ConversationResponse to Conversation for more general use? I do like that we're aligning on the API response interface, that's nice.
| const inputs = { input: latestMessage[0]?.content as string }; | ||
|
|
||
| if (isStream) { | ||
| return streamGraph({ apmTracer, assistantGraph, inputs, logger, onLlmResponse, request }); |
There was a problem hiding this comment.
FYI, @patrykkopycinski and I noticed LangSmith tracing wasn't working with streaming. Turns out I forgot to pass in traceOptions here which includes the necessary tracer. It's plumbed through the rest of the way, so only need to add it to this call.
Patryk is going to fix this in his esql PR, so need to add it now if you don't want, but wanted to let you know in case you were looking for traces it in LangSmith.
| if (state.messages.length !== 0) { | ||
| logger.debug('No need to generate chat title, messages already exist'); | ||
| return; | ||
| return { chatTitle: '' }; |
There was a problem hiding this comment.
Any reason you're using empty string vs undefined for denoting when we'll use the title from the persisted conversation?
| conversation, | ||
| messages, | ||
| chatTitle: conversation.title, | ||
| input: !state.input ? conversation.messages?.slice(-1)[0].content : state.input, |
There was a problem hiding this comment.
🙏
Will be so nice to then separate out the system prompt as well once prompt persistence lands. Thank you @YulNaumenko!!
…menko/kibana into security-ai-assistant-public-chat
| ...state, | ||
| chat_history: state.messages, // TODO: Message de-dupe with ...state spread | ||
| knowledge_history: knowledgeHistory?.length ? knowledgeHistory : NO_HISTORY, | ||
| knowledge_history: JSON.stringify(knowledgeHistory?.length ? knowledgeHistory : NO_HISTORY), |
| // TODO: For Bedrock streaming support, override `handleLLMNewToken` in callbacks, | ||
| // TODO: or maybe we can update ActionsClientSimpleChatModel to handle this `on_llm_stream` event |
There was a problem hiding this comment.
FYI, this might be resolved with Patryk's work on BedrockChat: #186809. Will still be needed for SimpleChatModel though
| filter: `users:{ id: "${ | ||
| ctx.elasticAssistant.getCurrentUser()?.profile_uid | ||
| }" } AND title:${request.body.title}`, |
There was a problem hiding this comment.
When creating conversations via persist:true using the chat/complete API, I noticed that the conversations were not showing up in the UI. This appears to be because the user object doesn't have profile_uid when using the API, but this is the unique identifier we use when fetching.
I used this API request to create the conversation:
### 4. Q&A with creation of the new conversation:
POST http://localhost:5601/kbn/api/security_ai_assistant/chat/complete
kbn-xsrf
Elastic-Api-Version: 2023-10-31
Authorization: Basic elastic changeme
X-Kbn-Context: %7B%22type%22%3A%22application%22%2C%22name%22%3A%22securitySolutionUI%22%2C%22url%22%3A%22%2Fkbn%2Fapp%2Fsecurity%22%7D
Content-Type: application/json
{
"connectorId": "azure-open-ai",
"persist": true,
"responseLanguage": "dutch",
"messages": [
{
"role": "user",
"content": "Tell me funny joke"
}
]
}And the conversation created was:
{
"_index": ".ds-.kibana-elastic-ai-assistant-conversations-default-2024.06.28-000001",
"_id": "f0d5694a-51e1-41a9-84ed-cca483321675",
"_score": 1,
"_source": {
"@timestamp": "2024-07-01T22:19:54.563Z",
"updated_at": "2024-07-01T22:19:56.210Z",
"api_config": {
"action_type_id": ".gen-ai",
"connector_id": "azure-open-ai"
},
"namespace": "default",
"created_at": "2024-07-01T22:19:54.563Z",
"messages": [
{
"@timestamp": "2024-07-01T22:19:54.563Z",
"role": "user",
"reader": null,
"is_error": null,
"content": "Tell me funny joke"
},
{
"@timestamp": "2024-07-01T22:19:56.210Z",
"role": "assistant",
"reader": null,
"is_error": false,
"trace_data": {},
"content": """Why don't scientists trust atoms?
Because they make up everything!"""
}
],
"replacements": [],
"title": "Cybersecurity Humor",
"category": "assistant",
"users": [
{
"name": "elastic"
}
]
}
},And this was the `user` object logged out right when we create the resource
{
"username": "elastic",
"roles": [
"superuser"
],
"full_name": null,
"email": null,
"metadata": {
"_reserved": true
},
"enabled": true,
"authentication_realm": {
"name": "reserved",
"type": "reserved"
},
"lookup_realm": {
"name": "reserved",
"type": "reserved"
},
"authentication_type": "realm",
"authentication_provider": {
"type": "http",
"name": "__http__"
},
"elastic_cloud_user": false
}
spong
left a comment
There was a problem hiding this comment.
Checked out, code reviewed, and tested locally -- LGTM! Thanks so much @YulNaumenko and @stephmilovic for this streamlined API and enhancements in simplifying our flows. Looks great! 🚀
Note: I did find one bug where conversations created via the chat/complete API were not included in the Conversations UI, but will leave that up to you to fix now or in a follow-up.
💚 Build Succeeded
Metrics [docs]Module Count
Public APIs missing comments
Async chunks
Unknown metric groupsAPI count
ESLint disabled line counts
Total ESLint disabled count
History
To update your PR or re-run it, just comment with: cc @YulNaumenko |

Summary
Current PR introduces the new public
chat/completeAPI for Security Assistant. Implementation support both LangGraph and non-LangGraph implementations (managed by experimental featureassistantKnowledgeBaseByDefault).Use cases:
POST http://localhost:5601/api/security_ai_assistant/chat/complete