
[Security Solution] [AI Assistant] Security AI Assistant - Citations#206683

Merged
KDKHD merged 93 commits into elastic:main from KDKHD:feature/security-assistant-content-references
Jan 29, 2025

Conversation


@KDKHD KDKHD commented Jan 14, 2025

Note

Planning to merge before the 9.0 feature freeze.
Documentation update issue: elastic/security-docs#6473

Tip

Tip for the reviewer

As a starting point for reviewing this PR, I suggest reading the section "How does it work (on a high level)?" and viewing the hyperlinked code. The linked code covers the main concepts of this feature, and the majority of the remaining changes in the PR are API schema updates and tests.

Summary

This PR adds citations to the security AI assistant. Citations are produced when tools are used, and they are displayed in the LLM response as numbered superscript elements. A label appears when the user hovers over a numbered element, and clicking the label opens a new tab that displays the cited data.

How to test:

  1. Enable the feature flag:
     # kibana.dev.yml
     xpack.securitySolution.enableExperimental: ['contentReferencesEnabled']
  2. Populate the security knowledge base with some information (e.g. a document, an index, product documentation, the Global Threat Report, etc.).
  3. Open the security assistant.
  4. Ask it the following questions about data in the knowledge base:
  • What is an Elasticsearch cold tier? (Make sure the product docs are installed.)
  • What topics are covered in the Security Labs content?
  • A question about one of your knowledge base documents.
  • Which platform did my most recent alert happen on? (Make sure you have a recent alert.)

How does it work (on a high level)?

Citations are stored inside the ContentReferencesStore object. When tools are called, they add citations to the ContentReferencesStore and pass the IDs of the ContentReferences back to the LLM alongside the tool results. The LLM can then use those content reference IDs in its response by forming a response like:

The sky is blue {reference(12345)}

The web client parses the content reference ({reference(12345)}) out of the assistant message and replaces it with the citation React component.
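As a rough sketch, the client-side replacement could work like the following. The `{reference(<id>)}` marker syntax comes from this PR; `parseReferences` and its return shape are hypothetical, and the real client renders a React superscript component rather than a plain `[n]` placeholder:

```typescript
// Matches citation markers such as {reference(12345)} and captures the id.
const REFERENCE_PATTERN = /\{reference\(([^)]+)\)\}/g;

interface ParsedMessage {
  text: string; // message with markers replaced by numbered placeholders
  referenceIds: string[]; // ids in order of first appearance
}

const parseReferences = (message: string): ParsedMessage => {
  const referenceIds: string[] = [];
  const text = message.replace(REFERENCE_PATTERN, (_match, id: string) => {
    // Reuse the same number if the LLM cites the same id twice.
    let index = referenceIds.indexOf(id);
    if (index === -1) {
      referenceIds.push(id);
      index = referenceIds.length - 1;
    }
    return `[${index + 1}]`; // placeholder for the citation component
  });
  return { text, referenceIds };
};
```

Because the ids are short, this replacement can happen mid-stream without holding back much of the response (see the discussion on id length further down).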

Tools that produce citations:

  • alert_counts_tool -> cites the alerts page
  • knowledge_base_retrieval_tool -> cites the knowledge base management page with the specific entry pre-filtered
  • open_and_acknowledged_alerts_tool -> cites the specific alert
  • security_labs_tool -> cites the knowledge base management page with the specific entry pre-filtered
  • knowledge_base indices -> opens the ES|QL view selecting the particular document used
  • product_documentation -> cites the documentation
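To make the tool-side flow concrete, here is a minimal sketch of the ContentReferencesStore idea. The names, method signatures, and `ContentReference` fields are illustrative assumptions, not the actual Kibana interface:

```typescript
interface ContentReference {
  id: string;
  type: string; // e.g. 'SecurityAlert', 'KnowledgeBaseEntry', 'ProductDocumentation'
  label: string; // shown when the user hovers over the citation
}

class ContentReferencesStore {
  private readonly references = new Map<string, ContentReference>();
  private counter = 0;

  // A tool calls add() and embeds the returned id in the result it hands
  // back to the LLM, e.g. `...tool output... {reference(<id>)}`.
  add(create: (id: string) => ContentReference): ContentReference {
    const id = `ref-${++this.counter}`; // the real store uses short random ids
    const reference = create(id);
    this.references.set(id, reference);
    return reference;
  }

  // The client resolves an id from the LLM output back to its reference.
  get(id: string): ContentReference | undefined {
    return this.references.get(id);
  }
}
```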

Endpoints impacted

  • POST /internal/elastic_assistant/actions/connector/{connectorId}/_execute
  • POST /api/security_ai_assistant/chat/complete
  • GET /api/security_ai_assistant/current_user/conversations/_find
  • GET /api/security_ai_assistant/current_user/conversations/:id
  • PUT /api/security_ai_assistant/current_user/conversations/{id}

Considerations:

  • One of the main objectives of this feature was to produce in-text citations that create a great user experience. Multiple approaches to doing this reliably were tested. Making the LLM return structured JSON containing the citations proved unreliable with smaller models. Generation post-processing (issuing an additional LLM call to annotate the response with citations) was also explored, but the second LLM call did not have enough contextual information to create the citations reliably. Eventually, the approach described in the section above was used, alongside few-shot prompting.
  • Instead of using the ContentReferencesStore to store citations, the LangGraph state could be used. I looked into doing this, but currently there are a few blockers in the LangGraph API that prevent it:
    • LangGraph must be updated to @langchain/langgraph>=0.2.31 to get access to the Command type so that tools can update the graph state.
    • It seems that DynamicStructuredTools do not support the Command type yet. This is something we can clarify with the LangChain team.
      Once these blockers have been addressed, ContentReferencesStore could easily be refactored into the graph state.
  • The feature has been put behind a feature flag so we can test during the feature freeze and sync with the release of the documentation update. The only thing that is not behind the feature flag is the new anonymization button in the settings menu (a flag does not seem necessary there, and adding one would require many more code changes).

On some occasions, you can nudge the LLM to include citations by appending "Include citations" to your message.


Furthermore, the settings menu has been updated to include anonymized values and citation toggles.

Checklist

Check that the PR satisfies the following conditions.

Reviewers should verify this PR satisfies this list as well.

  • Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support
  • Documentation was added for features that require explanation or tutorials
  • Unit or functional tests were updated or added to match the most common scenarios
  • If a plugin configuration key changed, check if it needs to be allowlisted in the cloud and added to the docker list
  • This was checked for breaking HTTP API changes, and any breaking changes have been approved by the breaking-change committee. The release_note:breaking label should be applied in these situations.
  • Flaky Test Runner was used on any tests changed
  • The PR description includes the appropriate Release Notes section, and the correct release_note:* label is applied per the guidelines

Identify risks

Does this PR introduce any risks? For example, consider risks like hard to test bugs, performance regression, potential of data loss.

Describe the risk, its severity, and mitigation for each identified risk. Invite stakeholders and evaluate how to proceed before merging.

Release note

Adds in-text citations to security solution AI assistant responses.

@KDKHD KDKHD marked this pull request as ready for review January 16, 2025 21:24
@KDKHD KDKHD requested review from a team as code owners January 16, 2025 21:24
@KDKHD KDKHD marked this pull request as draft January 16, 2025 21:24
KDKHD and others added 4 commits January 28, 2025 12:08
…b.com:KDKHD/kibana into feature/security-assistant-content-references
…b.com:KDKHD/kibana into feature/security-assistant-content-references
* @param size Size of ID to generate
* @returns an unsecure Id
*/
const generateUnsecureId = (size = 5): string => {
Contributor

any reason not to use import { v4 as uuid } from 'uuid';?

Member Author

@KDKHD KDKHD Jan 28, 2025

Yes, I purposefully chose not to use a UUID. These are some of the reasons:

  • UUID is unnecessarily long. A message only has a small number of references and we don't need 30+ characters to represent their IDs.
  • The longer the ID, the higher the chance that the LLM will not copy it correctly to the output. A shorter ID makes the references that are produced more reliable.
  • Shorter ID results in fewer output tokens.
  • When the output is streamed to the frontend we can quickly replace the short ids with the citation component. If the ids were longer, we would need to stream the full UUID before being able to insert the citation component. I tried using UUIDs at the beginning and it didn't look great because the long UUIDs were getting streamed in the middle of the response and it looked very choppy.
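A sketch of a short, non-cryptographic id generator along the lines discussed above. The alphabet and default size are assumptions; the point is only that a handful of alphanumeric characters is cheap for the LLM to copy verbatim while still giving roughly 62^5 (~916 million) combinations per message:

```typescript
const ID_ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';

// "Unsecure" is deliberate: ids only need to be unique within one
// conversation, so Math.random() is sufficient and keeps output tokens low.
const generateUnsecureId = (size = 5): string => {
  let id = '';
  for (let i = 0; i < size; i++) {
    id += ID_ALPHABET[Math.floor(Math.random() * ID_ALPHABET.length)];
  }
  return id;
};
```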

}: {
indexEntry: IndexEntry;
esClient: ElasticsearchClient;
contentReferencesStore: ContentReferencesStore | false;
Contributor

Was the intention to make contentReferencesStore optional? If yes, then we should be using contentReferencesStore?: ContentReferencesStore;. We should not mix different types here.

Member Author

I chose to use the type contentReferencesStore: ContentReferencesStore | false; instead of contentReferencesStore?: ContentReferencesStore; because if contentReferencesStore is optional, other devs (and future me) may forget to pass it as a parameter.

Using contentReferencesStore: ContentReferencesStore | false; requires devs to be deliberate about whether they want to use the contentReferencesStore or not.

I can make it optional though if you think it would be better.

Contributor

If you want to enforce users to pass contentReferencesStore even if it is optional, then you can do that by explicitly defining it as undefined as well:

contentReferencesStore: ContentReferencesStore | undefined;

That way, users will have to call this as { contentReferencesStore: undefined }.

Btw, why would we want to make contentReferencesStore optional?

Member Author

If you want to enforce users to pass contentReferencesStore even if it is optional, then you can do that by explicitly defining it as undefined as well:
contentReferencesStore: ContentReferencesStore | undefined;

Good point. I will change it to this.

Btw, why would we want to make contentReferencesStore optional?

I made it optional when I was putting this whole feature behind a feature flag. The plan would be to make it not optional once the feature flag is gone.
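The trade-off in this exchange can be sketched as follows. The ContentReferencesStore interface here is an illustrative placeholder, not the real Kibana type:

```typescript
// Placeholder for the real store type.
interface ContentReferencesStore {
  add(reference: { id: string; label: string }): void;
}

interface WithOptional {
  contentReferencesStore?: ContentReferencesStore;
}

interface WithExplicitUndefined {
  contentReferencesStore: ContentReferencesStore | undefined;
}

// Compiles even if a caller silently forgets the store:
const forgotten: WithOptional = {};

// Requires the key to be present, so a caller must deliberately opt out:
const deliberate: WithExplicitUndefined = { contentReferencesStore: undefined };
```

With `| undefined`, omitting the key entirely is a compile error, which is the "deliberate opt-out" behavior the author wanted without resorting to `| false`.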

Member Author

Updated in #208902

contentReferencesStore,
esClient,
}: {
contentReferencesStore: ContentReferencesStore | false;
Contributor

same as above, should it be contentReferencesStore?: ContentReferencesStore; instead?

Member Author

Updated in #208902

assistantTools?: AssistantTool[];
connectorId: string;
conversationId?: string;
contentReferencesStore: ContentReferencesStore | false;
Contributor

same here

Member Author

Updated in #208902

...(contentReferences ? { contentReferences } : {}),
};

const isMetadataPopulated = Boolean(contentReferences) !== false;
Contributor

I don't think we would need != false if we use const isMetadataPopulated = !!contentReferences;

inference?: InferenceServerStart;
isEnabledKnowledgeBase: boolean;
connectorId?: string;
contentReferencesStore: ContentReferencesStore | false;
Contributor

same here

Member Author

@KDKHD KDKHD Jan 30, 2025

Updated in #208902

KDKHD and others added 6 commits January 28, 2025 17:16
…y-assistant-content-references

# Conflicts:
#	oas_docs/output/kibana.serverless.yaml
#	oas_docs/output/kibana.yaml
#	x-pack/platform/packages/shared/kbn-elastic-assistant-common/docs/openapi/ess/elastic_assistant_api_2023_10_31.bundled.schema.yaml
#	x-pack/platform/packages/shared/kbn-elastic-assistant-common/docs/openapi/serverless/elastic_assistant_api_2023_10_31.bundled.schema.yaml
#	x-pack/platform/packages/shared/kbn-elastic-assistant/tsconfig.json
#	x-pack/solutions/security/plugins/elastic_assistant/server/ai_assistant_data_clients/knowledge_base/index.ts
kibanamachine and others added 4 commits January 28, 2025 22:08
…y-assistant-content-references

# Conflicts:
#	x-pack/solutions/security/plugins/elastic_assistant/scripts/draw_graph_script.ts
#	x-pack/solutions/security/plugins/elastic_assistant/server/lib/langchain/graphs/default_assistant_graph/index.test.ts
#	x-pack/solutions/security/plugins/elastic_assistant/server/lib/langchain/graphs/default_assistant_graph/index.ts
#	x-pack/solutions/security/plugins/elastic_assistant/server/lib/langchain/graphs/default_assistant_graph/nodes/run_agent.ts
#	x-pack/solutions/security/plugins/elastic_assistant/server/lib/langchain/graphs/default_assistant_graph/nodes/translations.ts
@kibanamachine
Contributor

Starting backport for target branches: 8.x

https://github.com/elastic/kibana/actions/runs/13023066804

@elasticmachine
Contributor

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] Jest Tests #3 / CryptoService Logging should log a warning when retrying
  • [job] [logs] Jest Tests #3 / CryptoService should retry and succeed retryable status code: 503
  • [job] [logs] Jest Tests #3 / TemplateTags adds new template tag to existing tags

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id                before  after  diff
automaticImport   711     715    +4
securitySolution  6610    6624   +14
total                            +18

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id                             before  after  diff
@kbn/elastic-assistant-common  414     428    +14
elasticAssistant               40      41     +1
total                                         +15

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id                before  after   diff
securitySolution  21.3MB  21.3MB  +15.7KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id                before  after   diff
securitySolution  88.2KB  88.3KB  +28.0B
Unknown metric groups

API count

id                             before  after  diff
@kbn/elastic-assistant-common  451     502    +51
elasticAssistant               55      56     +1
total                                         +52

History

cc @KDKHD

@kibanamachine
Contributor

💔 All backports failed

Status Branch Result
8.x Backport failed because of merge conflicts

You might need to backport the following PRs to 8.x:
- [EUI Visual Refresh][Security Gen AI] Add support for borealis theme (#202393)

Manual backport

To create the backport manually run:

node scripts/backport --pr 206683

Questions ?

Please refer to the Backport tool documentation

@KDKHD
Member Author

KDKHD commented Jan 29, 2025

💚 All backports created successfully

Status Branch Result
8.x

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation


Labels

  • backport:version - Backport to applied version labels
  • release_note:feature - Makes this part of the condensed release notes
  • Team:Security Generative AI - Security Generative AI
  • v8.18.0
  • v9.0.0

6 participants