[Security Assistant] Fixes issue preventing the creation of Knowledge Base Index Entries in deployments with a large number of indices/mappings #231376

spong · 2025-08-11T22:49:42Z

Summary

This PR fixes an issue, introduced in 9.1, with the Security Assistant KB Index Entries interface where a large number of indices/mappings could crash Kibana and prevent the creation of the Index Entry.

This is technically a 'fix-hancement' as we are bypassing the underlying issue altogether by switching to common core APIs for index/field suggestions (just as is done in the Discover 'Create a data view' interface), and in turn supporting all fields of type text, not just semantic_text (#230863).

Original Issue

The underlying issue here was introduced by a change to the field_caps API in 9.1 (elastic/elasticsearch#127664) that resulted in the /internal/elastic_assistant/knowledge_base/_indices route not finding any indices with a semantic_text field, and thus inadvertently falling back to doing a full scan of all mappings (source). A fix to this API was initially investigated, but there was no reasonable API available for fetching all occurrences of semantic_text fields, so with match queries adding support for semantic_text in 8.18, it was decided to go ahead and enable support for all text fields.

Fix Details

The Index input field now uses the dataViews.getIndices() API for suggestions (instead of the useKnowledgeBaseIndices hook/route), which is backed by the resolve indices ES API. System indices are filtered out with the *,-.* filter. The initial call returns all indices, just as the Discover 'Create a data view' interface does, and the list is then filtered further as the user continues to type. Note: Discover makes subsequent calls upon further user input, though I'm not entirely sure this is necessary here, as all indices are initially returned and available for client-side filtering within the input. I will perform further stress testing with many indices/mappings to confirm.
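The client-side narrowing described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `filterIndexSuggestions` and the sample index names are hypothetical, and the array stands in for whatever the initial suggestions call returned. The dot-prefix check only mirrors the intent of the `*,-.*` pattern.

```javascript
// Hypothetical helper (not Kibana code): narrows an already-fetched list of
// index names as the user types, without issuing another request.
function filterIndexSuggestions(indexNames, userInput) {
  const needle = userInput.trim().toLowerCase();
  return indexNames
    // Mirrors the intent of the `*,-.*` pattern: drop dot-prefixed names.
    .filter((name) => !name.startsWith('.'))
    // Case-insensitive substring match; empty input keeps everything.
    .filter((name) => needle === '' || name.toLowerCase().includes(needle))
    .sort();
}

// Sample data is made up for illustration.
const fetched = ['project-details', '.kibana_1', 'logs-app', 'projects-archive'];
console.log(filterIndexSuggestions(fetched, 'proj'));
```

Because the full list is already in memory, each keystroke costs one pass over the array, which is why the stress testing with many indices matters here.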

The Field input now uses the fields already queried for the Output fields input suggestions (via dataViews.getFieldsForWildcard()), just filtered to those whose field.esTypes?.includes('text'). This is potentially still a hot path for client-side code with many mappings, so I will also confirm this with further stress testing.
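As a rough sketch of that filter, assuming each field object exposes an optional `esTypes` array as the expression above implies: the `textFieldNames` helper and the sample fields below are hypothetical and only illustrate the predicate.

```javascript
// Hypothetical field list; only the `esTypes` property is taken from the text above.
const sampleFields = [
  { name: 'project_issue', esTypes: ['text'] },
  { name: 'project_name', esTypes: ['text', 'keyword'] },
  { name: 'created_at', esTypes: ['date'] },
  { name: 'runtime_only' }, // no esTypes reported at all
];

// Keeps only fields whose esTypes array contains the exact entry 'text'.
// Optional chaining makes fields without esTypes fall out of the result.
function textFieldNames(fields) {
  return fields
    .filter((field) => field.esTypes?.includes('text'))
    .map((field) => field.name);
}

console.log(textFieldNames(sampleFields));
```

Note that `Array.prototype.includes` compares whole array entries, so the filter depends on how each field's types are reported rather than on substring matching.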

Docs

@elastic/security-docs / @benironside, we will need to update the Security Assistant KB docs here to indicate that any text field is now supported for retrieval. Docs issue here: elastic/docs-content#2628

Testing

Functional Testing

To confirm proper retrieval of both text and semantic_text fields we'll need to create an index, add a document, then create the KB Index Entry. Then update the KB Index Entry to reference the other field type, and test again.

Create Index

PUT project-details
{
  "mappings": {
    "properties": {
      "project_issue": {
        "type": "text"
      },
      "project_name": {
        "type": "text"
      },
      "summary": {
        "type": "semantic_text",
        "inference_id": ".elser-2-elasticsearch",
        "model_settings": {
          "service": "elasticsearch",
          "task_type": "sparse_embedding"
        }
      }
    }
  }
}

Create Sample Doc

PUT project-details/_doc/doc1
{
    "project_issue": "The main issue at hand is the breaking of the space plane",
    "project_name": "Issue 5",
    "summary": "This is a summary that contains the word yellow"
}

Create KB Index Entry (Text)

POST kbn:api/security_ai_assistant/knowledge_base/entries
{
  "type": "index",
  "name": "Project Details Tool",
  "index": "project-details",
  "field": "project_issue",
  "outputFields": [],
  "description": "Use this index to answer questions about any project details.",
  "queryDescription": "Key terms to search for from the user's prompt."
}

Now you can open the Assistant and perform a query like:

Do I have any project details about the issue at hand?

This should result in a trace like this, which calls the generated tool that then performs a lexical search against the configured index. Ensure citations work as expected for the returned document.

Now open the Index Entry in the KB Settings UI and change the field from project_issue to summary to test semantic_text.

Open the Assistant and perform a query like:

Do I have any project details for Project Yellow?

This should result in a trace like this, which calls the generated tool that then performs a semantic search against the configured index. Ensure citations work as expected for the returned document.
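To illustrate why the same Index Entry can serve both field types, here is a minimal sketch that assumes the generated tool issues a plain `match` query against the configured field. That is an assumption based on the note above that `match` supports semantic_text since 8.18, not a copy of the tool's code. `buildEntrySearchBody` is a hypothetical name.

```javascript
// Hypothetical helper: builds a search request body for a KB Index Entry.
// Assumes a `match` query is used for both text and semantic_text fields.
function buildEntrySearchBody(entry, queryText) {
  return {
    query: {
      match: {
        // The field name comes straight from the Index Entry configuration.
        [entry.field]: queryText,
      },
    },
  };
}

// Same entry shape as the functional test above, pointed at each field in turn.
const lexical = buildEntrySearchBody({ index: 'project-details', field: 'project_issue' }, 'issue at hand');
const semantic = buildEntrySearchBody({ index: 'project-details', field: 'summary' }, 'Project Yellow');
console.log(JSON.stringify(lexical));
console.log(JSON.stringify(semantic));
```

Under that assumption only the field name changes between the two test passes, and Elasticsearch decides whether the search is lexical or semantic based on the field's mapping.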

Performance Testing

⚠️ In progress -- I will include a script for generating many indices/mappings for testing, and also prepare the ci-cloud-deploy instance with the same setup for confirmation.

Checklist

Check the PR satisfies following conditions.

Reviewers should verify this PR satisfies this list as well.

Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support
Documentation was added for features that require explanation or tutorials
- Will coordinate with @elastic/security-docs on the docs update here.
Unit or functional tests were updated or added to match the most common scenarios

e40pud · 2025-08-12T08:20:46Z

Since /internal/elastic_assistant/knowledge_base/_indices is internal and used only in IndexEntryEditor should we remove it and related hooks completely?

e40pud · 2025-08-12T09:53:59Z

Thanks for the fix!! Changes LGTM

Below are local testing results

Testing setup

Uploaded a PDF document (Elastic Global Threat Report 2024) and copied its content into an additional content field of type semantic_text
Created a new index entry in the KB
- Data Description: "Use this tool to answer questions about the Elastic Global Threat Report (GTR) 2024"
- Query Instruction: "Key terms to return data relevant to the Elastic Global Threat Report (GTR) 2024"
Asked the assistant the following five questions:
1. Who are the authors of the GTR 2024?
2. What is the forecast for the coming year in GTR 2024?
3. What are top 10 Process Injection by rules in Windows endpoints in GTR 2024?
4. What is the most widely adopted cloud service provider this year according to GTR 2024?
5. Give a brief conclusion of the GTR 2024

Results

1. Who are the authors of the GTR 2024?

🟡 Assistant was not able to find authors of the document.

First of all, I remember this working well a few versions ago, when the assistant was able to list all the contributors.

It looks like something has changed in how user input is converted into a meaningful input for the KB tool call. Right now (in the case of GPT-4.1) the question above is converted into "authors" and sent as the input to the tool (here is the example trace).

When I tried to search within the index using the original question, I was able to see authors as part of the highlighted fragments:

Semantic search

GET elastic-global-threat-report-2024/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "content": "Who are the authors of the GTR 2024?"
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "content": {
        "number_of_fragments": 2,
        "order": "score"
      }
    }
  }
}

Output

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 8.70585,
    "hits": [
      {
        "_index": "elastic-global-threat-report-2024",
        "_id": "u5x4nZgB7B6Is1yyNd5u",
        "_score": 8.70585,
        "_ignored": [
          "attachment.content.keyword"
        ],
        "_source": {
          "attachment": {
            "date": "2024-09-30T21:51:48Z",
            "content_type": "application/pdf",
            "format": "application/pdf; version=1.3",
            "modified": "2024-10-01T09:38:32Z",
            "language": "en",
            "metadata_date": "2024-10-01T09:38:32Z",
            "creator_tool": "Adobe InDesign 19.5 (Windows)",
            "content": """..."""
        },
        "highlight": {
          "content": [
            """2024

Global
Threat
Report



Table of Contents
Introduction

Generative AI

	 Threat overview

	 Augmenting defenders

Malware Detections

	 Distribution by operation system

	 Malware categories

Endpoint Behaviors

	 Distribution by operating system

	 Distribution by tactic

Cloud Security

	 Distribution by cloud service provider

	 Benchmarking cloud security posture

Threat Profiles

	 REF5961 — BLOODALCHEMY, RUDEBIRD, EAGERBEE, DOWNTOWN

	 REF8207 — GHOSTPULSE

	 REF4578 — GHOSTENGINE

	 REF7001 — KANDYKORN

	 REF6127 — WARMCOOKIE

Responding to 2023 Forecasts

Forecasts and Recommendations

Conclusion

03

04

04

05

06

06

07

11

11

12

36

37

47

57

58

61

64

66

69

72

75

79

1

2

3

4

5

6

7

8

9

2024 Elastic Global Threat Report



Introduction1

2024 Elastic Global Threat Report

03

Introduction

With the best technologies, the most
widespread information distribution, and the
greatest public awareness of threats all in
motion, the security environment is stronger
than ever. Yet, almost in spite of these things,
threat ecosystems are thriving like never before.

Truthfully, the threat landscape is dynamic
and reactive — a new technique empowers
a previously unknown threat group, vendors
swarm to mitigate that threat and create new
technologies in the process, operators on both
""",
            """https://www.elastic.co/security
https://www.elastic.co/security-labs
https://x.com/elasticseclabs


Conclusion9

2024 Elastic Global Threat Report

80

The 2024 Elastic Global Threat Report features insights and expertise from across the
Elastic organization. We’d like to thank the following Elasticians for their contributions:

Mika Ayenson

Samir Bousseaden

Terrance DeJesus

Chris Donaher

Tinsae Erkailo

Ayoub Faouzi

Eric Forte

Ruben Groenewoud

Justin Ibarra

Devon Kerr

Jake King

Shashank Suryanarayana

Mark Mager

Asuka Nakajima

Andrew Pease

John Uhlmann

Alyssa VanNice

Colson Wilhoit



© 2024. Elasticsearch B.V. All Rights Reserved.
Elastic, Elasticsearch and other related marks are trademarks, logos or registered trademarks of Elasticsearch B.V. in the United States
and other countries. Microsoft, Azure, Windows and other related marks are trademarks of the Microsoft group of companies. Amazon
Web Services, AWS, and other related marks are trademarks of Amazon.com, Inc. or its affiliates. All other brand names, product
names, or trademarks belong to their respective owners.

2024

Global
Threat
Report


	TOC
	Introduction
	Generative AI
	Threat overview
	Augmenting defenders

	Malware Detections
	Malware categories

	Endpoint Behaviors
	Distribution by tactic

	Cloud Security
	Distribution by cloud service provider
	Benchmarking cloud security posture

	Threat Profiles
	REF5961
	REF8207
	REF4578
	REF7001
	REF6127

	Responding to 2023 Forecasts
	Forecasts and Recommendations
	Conclusion"""
          ]
        }
      }
    ]
  }
}

This issue is not related to these changes, but I wanted to highlight it. We might want to look into this and see how we can improve the experience.

2. What is the forecast for the coming year in GTR 2024?

✅ Great answer based on KB index entry

3. What are top 10 Process Injection by rules in Windows endpoints in GTR 2024?

✅ Great answer based on KB index entry

4. What is the most widely adopted cloud service provider this year according to GTR 2024?

✅ Great answer based on KB index entry

5. Give a brief conclusion of the GTR 2024

✅ Great answer based on KB index entry

…get kb indices route

…t-support

florent-leborgne

Doc link LGTM 🔗

e40pud

Thanks for all changes!! LGTM 🚀

elasticmachine · 2025-08-13T15:56:51Z

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id	before	after	diff
`alerting`	315	314	-1
`apm`	1966	1965	-1
`automaticImport`	804	802	-2
`cases`	1132	1131	-1
`datasetQuality`	817	816	-1
`discover`	1384	1383	-1
`elasticAssistant`	460	458	-2
`embeddableAlertsTable`	514	513	-1
`infra`	1526	1525	-1
`ml`	2500	2499	-1
`monitoring`	725	724	-1
`observability`	1394	1393	-1
`observabilityAIAssistantApp`	437	436	-1
`observabilityShared`	308	307	-1
`securitySolution`	7886	7884	-2
`slo`	1235	1234	-1
`stackAlerts`	278	277	-1
`synthetics`	1340	1339	-1
`timelines`	248	247	-1
`transform`	784	783	-1
`triggersActionsUi`	965	964	-1
`uptime`	868	867	-1
total			-25

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id	before	after	diff
`@kbn/elastic-assistant-common`	679	676	-3

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id	before	after	diff
`aiAssistantManagementSelection`	78.5KB	78.6KB	+128.0B
`lists`	125.6KB	125.8KB	+128.0B
`securitySolution`	10.4MB	10.4MB	-240.0B
total			+16.0B

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id	before	after	diff
`core`	499.9KB	500.0KB	+128.0B
`securitySolution`	95.3KB	95.3KB	-2.0B
total			+126.0B

Unknown metric groups

API count

id	before	after	diff
`@kbn/elastic-assistant-common`	793	790	-3

History

💔 Build #328987 failed 85e059a
💔 Build #328975 failed cfcdbf6
💚 Build #328561 succeeded b4fc185

cc @spong

spong · 2025-08-13T22:27:09Z

...stic-assistant/impl/knowledge_base/knowledge_base_settings_management/index_entry_editor.tsx

Alrighty @e40pud! 👋

Vibed with gemini-cli and here's a nice little node script for generating a buncha indices/mappings. It generates indices with names starting with a-z (for testing sorting/filtering), then generates mappings using all the different field types. Save the below to a file:

populate_es.js

#!/usr/bin/env node

const http = require('http');
const https = require('https');
const readline = require('readline');

function parseArgs() {
  const args = process.argv.slice(2).reduce((acc, arg, i, arr) => {
    if (arg.startsWith('--')) {
      const key = arg.slice(2);
      const next = arr[i + 1];
      if (next && !next.startsWith('--')) {
        acc[key] = next;
      } else {
        acc[key] = true;
      }
    }
    return acc;
  }, {});

  if (args.help || args.h) {
    console.log(`
Elasticsearch Index/Mapping Populator and Cleanup Script

Usage:
  node populate_es.js [options]
  node populate_es.js --cleanup
  node populate_es.js --delete-by-count <number>

Description:
  This script stress-tests an Elasticsearch instance by creating a large number
  of indices with many fields. It can also clean up the indices it creates.

Creation Options:
  --host <url>          Elasticsearch host URL (default: http://localhost:9200)
  --user <username>     Username for basic auth (default: elastic)
  --pass <password>     Password for basic auth (default: changeme)
  --apiKey <key>        API key for authentication (overrides user/pass)
  --indices <number>    Number of indices to create (default: 5000)
  --mappings <number>   Number of mappings per index (default: 5000)
  --maxFields <number>  The max number of fields per index (default: same as --mappings)
  --shards <number>     Number of primary shards per index (default: 1)
  --replicas <number>   Number of replicas per index (default: 0)

Cleanup & Recovery Options:
  --cleanup             Delete all indices created by this script.
  --delete-by-count <N> Delete the <N> newest stress-test indices.
  --yes                 Bypass confirmation prompt during cleanup.

Other Options:
  -h, --help            Show this help message
`);
    process.exit(0);
  }

  return {
    host: args.host || 'http://localhost:9200',
    user: args.user || 'elastic',
    pass: args.pass || 'changeme',
    apiKey: args.apiKey,
    indices: parseInt(args.indices, 10) || 5000,
    mappings: parseInt(args.mappings, 10) || 5000,
    maxFields: parseInt(args.maxFields, 10) || parseInt(args.mappings, 10) || 5000,
    shards: parseInt(args.shards, 10) || 1,
    replicas: parseInt(args.replicas, 10) || 0,
    cleanup: !!args.cleanup,
    deleteByCount: parseInt(args['delete-by-count'], 10) || 0,
    yes: !!args.yes,
  };
}

const config = parseArgs();

const simpleFieldTypes = [
  { type: 'text' },
  { type: 'keyword' },
  { type: 'long' },
  { type: 'integer' },
  { type: 'short' },
  { type: 'byte' },
  { type: 'double' },
  { type: 'float' },
  { type: 'half_float' },
  { type: 'scaled_float', scaling_factor: 100 },
  { type: 'date' },
  { type: 'date_nanos' },
  { type: 'boolean' },
  { type: 'binary' },
  { type: 'geo_point' },
  { type: 'ip' },
  { type: 'completion' },
  { type: 'token_count', analyzer: 'standard' },
];

const complexFieldTypes = [
  { type: 'integer_range' },
  { type: 'float_range' },
  { type: 'long_range' },
  { type: 'double_range' },
  { type: 'date_range' },
  { type: 'geo_shape' },
  { type: 'search_as_you_type' },
  { type: 'dense_vector', dims: 4 },
  { type: 'semantic_text' },
];

function generateIndexBody(numMappings, maxFields, numShards, numReplicas) {
  const properties = {};
  let fieldCount = 0;

  for (const fieldType of complexFieldTypes) {
    if (fieldCount >= numMappings) break;
    properties[`complex_${fieldType.type}_${fieldCount}`] = { ...fieldType };
    fieldCount++;
  }

  while (fieldCount < numMappings) {
    const fieldTypeDefinition = simpleFieldTypes[fieldCount % simpleFieldTypes.length];
    properties[`field_${fieldCount}`] = { ...fieldTypeDefinition };
    fieldCount++;
  }

  return {
    settings: {
      'index.mapping.total_fields.limit': maxFields,
      'index.number_of_shards': numShards,
      'index.number_of_replicas': numReplicas,
    },
    mappings: { properties },
  };
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function makeRequest(method, path, body, retries = 3, delay = 1000) {
  for (let i = 0; i < retries; i++) {
    try {
      return await new Promise((resolve, reject) => {
        const url = new URL(config.host);
        const protocol = url.protocol === 'https:' ? https : http;
        const options = {
          hostname: url.hostname,
          port: url.port,
          path,
          method,
          headers: { 'Content-Type': 'application/json' },
        };

        if (config.apiKey) {
          options.headers.Authorization = `ApiKey ${config.apiKey}`;
        } else if (config.user && config.pass) {
          const auth = 'Basic ' + Buffer.from(config.user + ':' + config.pass).toString('base64');
          options.headers.Authorization = auth;
        }

        const req = protocol.request(options, (res) => {
          let data = '';
          res.on('data', (chunk) => (data += chunk));
          res.on('end', () => {
            if (res.statusCode >= 200 && res.statusCode < 300) {
              try {
                resolve({ statusCode: res.statusCode, body: JSON.parse(data || '{}') });
              } catch (e) {
                reject(new Error('Failed to parse JSON response.'));
              }
            } else {
              const err = new Error(`Request failed with status code ${res.statusCode}: ${data}`);
              if (data.includes('resource_already_exists_exception')) {
                err.isAlreadyExists = true;
              }
              if ([429, 503, 504].includes(res.statusCode)) {
                err.isRetryable = true;
              }
              reject(err);
            }
          });
        });

        req.on('error', (e) => reject(e));
        if (body) req.write(JSON.stringify(body));
        req.end();
      });
    } catch (error) {
      if (error.isAlreadyExists || !error.isRetryable || i === retries - 1) {
        throw error;
      }
      await sleep(delay);
      delay *= 2;
    }
  }
}

async function cleanupIndices() {
  console.log('Starting cleanup of stress-test indices...');

  if (!config.yes) {
    const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
    await new Promise((resolve) => {
      rl.question(
        'Are you sure you want to delete all indices with the pattern "*-stress-test-index-*"? (y/N) ',
        (answer) => {
          if (answer.toLowerCase() !== 'y') {
            console.log('Cleanup cancelled.');
            process.exit(0);
          }
          rl.close();
          resolve();
        }
      );
    });
  }

  try {
    const { body } = await makeRequest('DELETE', '/*-stress-test-index-*');
    console.log('Cleanup successful:', body);
  } catch (error) {
    if (error.message.includes('404')) {
      console.log('No stress-test indices found to delete.');
    } else {
      console.error('An error occurred during cleanup:', error.message);
      process.exit(1);
    }
  }
}

async function deleteIndicesByCount(count) {
  console.log(`Fetching the ${count} newest stress-test indices to delete...`);
  try {
    const { body } = await makeRequest(
      'GET',
      `/_cat/indices/*-stress-test-index-*?h=index&s=creation.date:desc&format=json`
    );
    const indices = body.map((item) => item.index);
    if (indices.length === 0) {
      console.log('No stress-test indices found to delete.');
      return;
    }
    const batchToDelete = indices.slice(0, count);
    console.log(`Deleting ${batchToDelete.length} indices: ${batchToDelete.join(', ')}`);
    await makeRequest('DELETE', `/${batchToDelete.join(',')}`);
    console.log('Deletion successful.');
  } catch (e) {
    console.error('\n[FATAL] Could not get or delete indices:', e.message);
    process.exit(1);
  }
}

async function createIndices() {
  console.log('Starting to populate Elasticsearch...');
  console.log('Configuration:', {
    ...config,
    pass: '***',
    apiKey: config.apiKey ? '***' : undefined,
  });

  const alphabet = 'abcdefghijklmnopqrstuvwxyz';
  let createdCount = 0;
  let skippedCount = 0;
  const total = config.indices;
  const barWidth = 40;

  for (let i = 0; i < total; i++) {
    const indexName = `${alphabet[i % alphabet.length]}-stress-test-index-${String(i).padStart(
      5,
      '0'
    )}`;

    const percent = (i + 1) / total;
    const filledWidth = Math.round(barWidth * percent);
    const bar = `[${'█'.repeat(filledWidth)}${'-'.repeat(barWidth - filledWidth)}]`;
    const percentStr = `${(percent * 100).toFixed(1)}%`;

    readline.clearLine(process.stdout, 0);
    readline.cursorTo(process.stdout, 0);
    process.stdout.write(`${bar} ${percentStr} | [${i + 1}/${total}] Processing: ${indexName}`);

    const indexBody = generateIndexBody(
      config.mappings,
      config.maxFields,
      config.shards,
      config.replicas
    );

    try {
      await makeRequest('PUT', `/${indexName}`, indexBody);
      createdCount++;
    } catch (error) {
      if (error.isAlreadyExists) {
        skippedCount++;
        continue;
      }
      process.stdout.write('\n');
      console.error(`\n[FATAL] Failed while processing index ${indexName}:`, error.message);
      console.error(
        'Exiting due to a critical error. Please check your Elasticsearch cluster status and settings.'
      );
      process.exit(1);
    }
  }

  process.stdout.write('\n');
  console.log(
    `\nPopulation complete. Created: ${createdCount}, Skipped: ${skippedCount}, Total processed: ${
      createdCount + skippedCount
    }.`
  );
}

async function main() {
  if (config.cleanup) {
    await cleanupIndices();
  } else if (config.deleteByCount > 0) {
    await deleteIndicesByCount(config.deleteByCount);
  } else {
    await createIndices();
  }
}

main().catch((err) => {
  console.error('\nAn unexpected error occurred:', err.message);
  process.exit(1);
});

and for local dev call ala:

node populate_es.js --indices 4000 --mappings 4000

or for cloud clusters:

node populate_es.js --host https://kibana-pr-231376.es.us-west2.gcp.elastic-cloud.com/ --apiKey asdf== --indices 4000 --mappings 4000

and to cleanup all the garbage it made:

node populate_es.js --cleanup --yes

and since I didn't add circuit breaker detection, if you go too far and ES/Kibana won't start, use this to start deleting chunks of indices till the system is healthy again:

node populate_es.js --delete-by-count 20

I tested this both locally and with the ci:cloud-deploy instance linked and all was well! 🎉

Index suggestions worked without issue, and field fetching continued to work as well (even saw those getting cached in the network panel, which is nice :).

When I pushed it to the cluster limits I was seeing issues everywhere else before I could even make it to the KB UI, so I think we're good here! 😅

kibanamachine · 2025-08-13T22:33:45Z

Starting backport for target branches: 9.1

https://github.com/elastic/kibana/actions/runs/16950887524

kibanamachine · 2025-08-13T22:40:23Z

💔 All backports failed

Status	Branch	Result
❌	9.1	Backport failed because of merge conflicts

You might need to backport the following PRs to 9.1:
- [Fleet] Improve installation of bundled packages in airgapped environments (#230992)
- Upgrade inquirer (#231019)
- [scout] log --grep in test run command (#231649)
- [ska] relocate Security serverless api & functional tests (#231277)
- [Security Solution] Install mock prebuilt rules package in Cypress to reduce flakiness (#229689)
- [ska] relocate Observability serverless api & functional tests (#231256)

Manual backport

To create the backport manually run:

node scripts/backport --pr 231376

Questions ?

Please refer to the Backport tool documentation

spong · 2025-08-13T22:57:41Z

💚 All backports created successfully

Status	Branch	Result
✅	9.1

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

@benironside

… Base Index Entries in deployments with a large number of indices/mappings (elastic#231376)

Co-authored-by: kibanamachine <[email protected]>
(cherry picked from commit 4496b50)

# Conflicts:
# x-pack/platform/plugins/private/translations/translations/de-DE.json

@benironside

…wledge Base Index Entries in deployments with a large number of indices/mappings (#231376) (#231717) # Backport This will backport the following commits from `main` to `9.1`: - [[Security Assistant] Fixes issue preventing the creation of Knowledge Base Index Entries in deployments with a large number of indices/mappings (#231376)](#231376)  ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sorenlouv/backport)

@benironside

… Base Index Entries in deployments with a large number of indices/mappings (elastic#231376)

## Summary

This PR fixes an issue with the Security Assistant KB Index Entries interface introduced in `9.1` where a large number of indices/mappings could result in Kibana crashing and preventing the creation of the Index Entry.

This is technically a 'fix-hancement' as we are bypassing the underlying issue altogether by switching to use common core APIs for index/field suggestions (just as is done in the Discover 'Create a data view' interface), and in turn supporting all fields of type `text`, not just `semantic_text` (elastic#230863).

### Original Issue

The underlying issue here was introduced by a change to the `field_caps` API in `9.1` (elastic/elasticsearch#127664) that resulted in the `/internal/elastic_assistant/knowledge_base/_indices` route not finding any indices with a `semantic_text` field, and thus inadvertently falling back to doing a full scan of all mappings ([source](https://github.com/elastic/kibana/blob/b128cee4ee7ccc367e8acf159dbf58a75f081867/x-pack/solutions/security/plugins/elastic_assistant/server/routes/knowledge_base/get_knowledge_base_indices.ts#L69)). A fix to this API was initially investigated, but there was no reasonable API available for fetching all occurrences of `semantic_text` fields, so with `match` queries adding support for `semantic_text` in `8.18`, it was decided to go ahead and enable support for all `text` fields.

### Fix Details

The `Index` input field now uses the `dataViews.getIndices()` API for suggestions (instead of the `useKnowledgeBaseIndices` hook/route), which is backed by the [resolve indices ES API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-resolve-index). System indices are filtered out with the `*,-.*` filter. The initial call will return all indices, just as the Discover 'Create a data view' interface does, and the list is then filtered further as the user continues to type.

Note: Discover makes subsequent calls upon further user input, though I'm not entirely sure this is necessary here as all indices are initially returned and available for client-side filtering within the input. I will perform further stress testing with many indices/mappings to confirm.

The `Field` input now uses the fields already queried for the `Output fields` input suggestions (via `dataViews.getFieldsForWildcard()`), just filtered to those whose `field.esTypes?.includes('text')`. This is potentially still a hot path for client-side code with many mappings, so I will also confirm this with further stress testing.

### Docs

@elastic/security-docs / @benironside, we will need to update the Security Assistant KB docs [here](https://www.elastic.co/docs/solutions/security/ai/ai-assistant-knowledge-base#knowledge-base-add-knowledge-index) to indicate that any `text` field is now supported for retrieval.

### Testing

#### Functional Testing

To confirm proper retrieval of both `text` and `semantic_text` fields we'll need to create an index, add a document, then create the KB Index Entry. Then update the KB Index Entry to reference the other field type, and test again.

<details><summary>Create Index</summary>

``` JSON
PUT project-details
{
  "mappings": {
    "properties": {
      "project_issue": {
        "type": "text"
      },
      "project_name": {
        "type": "text"
      },
      "summary": {
        "type": "semantic_text",
        "inference_id": ".elser-2-elasticsearch",
        "model_settings": {
          "service": "elasticsearch",
          "task_type": "sparse_embedding"
        }
      }
    }
  }
}
```

</details>

<details><summary>Create Sample Doc</summary>

``` JSON
PUT project-details/_doc/doc1
{
  "project_issue": "The main issue at hand is the breaking of the space plane",
  "project_name": "Issue 5",
  "summary": "This is a summary that contains the word yellow"
}
```

</details>

<details><summary>Create KB Index Entry (Text)</summary>

``` JSON
POST kbn:api/security_ai_assistant/knowledge_base/entries
{
  "type": "index",
  "name": "Project Details Tool",
  "index": "project-details",
  "field": "project_issue",
  "outputFields": [],
  "description": "Use this index to answer questions about any project details.",
  "queryDescription": "Key terms to search for from the user's prompt."
}
```

</details>

Now you can open the Assistant and perform a query like:

```
Do I have any project details about the issue at hand?
```

Which should result in a [trace like this](https://smith.langchain.com/public/208338a6-42f3-4a9d-bc4c-0ff38d06d34c/r) which calls the generated tool that will then perform a _lexical search_ against the configured index. Ensure citations work as expected for the returned document.

Now open the Index Entry in the KB Settings UI and change the `field` from `project_issue` to `summary` for testing `semantic_text`. Open the Assistant and perform a query like:

```
Do I have any project details for Project Yellow?
```

Which should result in a [trace like this](https://smith.langchain.com/public/3041a48b-5dfd-4c89-9f51-0bcdffd38f63/r) which calls the generated tool that will then perform a _semantic search_ against the configured index. Ensure citations work as expected for the returned document.

#### Performance Testing

⚠️ In progress -- I will include a script for generating many indices/mappings for testing, and also prepare the `ci-cloud-deploy` instance with the same setup for confirmation.

### Checklist

Check the PR satisfies following conditions. Reviewers should verify this PR satisfies this list as well.

- [X] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md)
- [ ] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials
  * Will coordinate with @elastic/security-docs on the docs update here.
- [X] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine
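The client-side narrowing of index suggestions described under Fix Details can be pictured roughly as follows. This is only a minimal sketch, not the code from this PR: `filterIndexSuggestions` is a hypothetical helper, the input list stands in for whatever `dataViews.getIndices()` returned on the initial call, and the dot-prefix check merely approximates what the `*,-.*` pattern excludes on the server side.

```typescript
// Hypothetical helper (not part of the PR): narrows an already-fetched list of
// index names as the user types, without another round trip to Kibana.
function filterIndexSuggestions(indexNames: string[], query: string): string[] {
  const needle = query.trim().toLowerCase();
  return indexNames
    // Approximates the `*,-.*` pattern: drop dot-prefixed names.
    .filter((name) => !name.startsWith('.'))
    // Case-insensitive substring match; an empty query keeps everything.
    .filter((name) => needle === '' || name.toLowerCase().includes(needle))
    .sort();
}

// Made-up index names, for illustration only.
const suggestions = filterIndexSuggestions(
  ['project-details', '.kibana_1', 'logs-2025', 'projects-archive'],
  'proj'
);
console.log(suggestions); // → [ 'project-details', 'projects-archive' ]
```

Whether filtering in memory is sufficient with many thousands of indices is exactly what the stress testing mentioned above is meant to confirm.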

…#231725)

## Summary

This is a follow-up to #231376 where I added new `docLinks` for the Security Solution AI Assistant. This PR removes the `TODO` added in that PR and creates a new `securitySolution.aiAssistant` grouping so we can more easily add docLinks as needed.

Reviewers Note: Sorry for the extra noise here -- I thought there were more references so decided to do it in another PR. Turns out there were not, so this is a quick one!

…elastic#231725)

(cherry picked from commit 90470cf)

# Conflicts:
# src/platform/plugins/shared/ai_assistant_management/selection/public/routes/components/ai_assistant_selection_page.tsx

@jamesspi

…tions (#231904)

## Summary

Small follow-up improvement to #231376 which added support for `text` fields to Index Entries. This PR adds the field type as a badge in the suggestions so users will know if a semantic or lexical search will be performed (so they can adapt the query instructions accordingly).

Note: Needed to update the field API request from `dataViews.getFieldsForWildcard` (which called `/internal/data_views/_fields_for_wildcard`) to use `/api/index_management/mapping/[indexName]` as the former did not have the option to include field type. I confirmed no new privileges were necessary for this API, and the user just needs the same index privileges as before.

cc @jamesspi

Field Options:

<img width="500" src="https://github.com/user-attachments/assets/f138c7f0-1d89-4946-8d27-fa6c9c49c60b" />

Output Field Options:

<img width="500" src="https://github.com/user-attachments/assets/2b0395e5-d71d-43af-8a23-9bacc4b02b54" />

---

As part of this PR I've also included the helper script from #231376 for testing these large index/mapping scenarios. This script was almost entirely written in a collab session with `gemini-cli`, and is located in:

> x-pack/solutions/security/plugins/elastic_assistant/scripts

Options include:

``` bash
Elasticsearch Index/Mapping Populator and Cleanup Script

Usage:
  node stress_test_mappings.js [options]
  node stress_test_mappings.js --cleanup
  node stress_test_mappings.js --delete-by-count <number>

Description:
  This script stress-tests an Elasticsearch instance by creating a large number
  of indices with many fields. It can also clean up the indices it creates.

Creation Options:
  --host <url>           Elasticsearch host URL (default: http://localhost:9200)
  --user <username>      Username for basic auth (default: elastic)
  --pass <password>      Password for basic auth (default: changeme)
  --apiKey <key>         API key for authentication (overrides user/pass)
  --indices <number>     Number of indices to create (default: 5000)
  --mappings <number>    Number of mappings per index (default: 5000)
  --maxFields <number>   The max number of fields per index (default: same as --mappings)
  --shards <number>      Number of primary shards per index (default: 1)
  --replicas <number>    Number of replicas per index (default: 0)

Cleanup & Recovery Options:
  --cleanup              Delete all indices created by this script.
  --delete-by-count <N>  Delete the <N> newest stress-test indices.
  --yes                  Bypass confirmation prompt during cleanup.

Other Options:
  -h, --help             Show this help message
```

And some test executions are as follows. First `cd` into the assistant working directory:

```
cd x-pack/solutions/security/plugins/elastic_assistant/
```

##### Populate your local ES -- defaults to 5000 indices and 5000 mappings _per_ index. This _will cause_ a default local ES to crash, so stop early (~569), or change configuration :)

``` bash
yarn stress-test-mappings
```

##### If your ES is at its limits, you can slowly dial back the index count with the following:

``` bash
yarn stress-test-mappings --delete-by-count 50 --yes
```

##### Or clean up all the indices you created entirely with:

``` bash
yarn stress-test-mappings --cleanup --yes
```

##### And for a cloud install, create an API key and populate with the following:

``` bash
yarn stress-test-mappings --host https://stress-test.es.us-west2.gcp.elastic-cloud.com --apiKey API_KEY_HERE
```

> [!IMPORTANT]
> This is a quick utility script and may be buggy! Continue to vibe code it as you see fit, but it worked for my needs here for testing and validating this issue and fix 🙂

### Checklist

- [X] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine
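To illustrate how a field type badge could be derived from a mapping, here is a minimal sketch. It is not the implementation from #231904: `collectTextFields`, `MappingProperty`, and `FieldSuggestion` are hypothetical names, and the input is assumed to be a plain Elasticsearch mapping `properties` object rather than the exact response of `/api/index_management/mapping/[indexName]`. The sample mapping reuses the field names from the `project-details` test index in #231376 and adds a made-up nested `owner` object.

```typescript
// Assumed minimal shape of an Elasticsearch mapping `properties` entry.
interface MappingProperty {
  type?: string;
  properties?: Record<string, MappingProperty>;
}

// Hypothetical suggestion shape: `searchKind` is what a badge would convey.
interface FieldSuggestion {
  name: string;
  type: 'text' | 'semantic_text';
  searchKind: 'lexical' | 'semantic';
}

// Hypothetical helper: walks the mapping tree and keeps only `text` and
// `semantic_text` fields, labelling each with the search it would trigger.
function collectTextFields(
  properties: Record<string, MappingProperty>,
  prefix = ''
): FieldSuggestion[] {
  const out: FieldSuggestion[] = [];
  for (const [key, prop] of Object.entries(properties)) {
    const name = prefix ? `${prefix}.${key}` : key;
    if (prop.type === 'semantic_text') {
      out.push({ name, type: 'semantic_text', searchKind: 'semantic' });
    } else if (prop.type === 'text') {
      out.push({ name, type: 'text', searchKind: 'lexical' });
    }
    // Object fields nest their children under `properties`.
    if (prop.properties) {
      out.push(...collectTextFields(prop.properties, name));
    }
  }
  return out;
}

const fieldSuggestions = collectTextFields({
  project_issue: { type: 'text' },
  project_name: { type: 'text' },
  summary: { type: 'semantic_text' },
  owner: { properties: { bio: { type: 'text' }, id: { type: 'keyword' } } },
});
console.log(fieldSuggestions.map((f) => `${f.name} [${f.type}]`));
```

Keeping the type next to the name is what lets the UI tell the user up front whether a lexical or a semantic search will run against the chosen field.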

@jamesspi

…tions (elastic#231904)

(cherry picked from commit 39a6983)

@jamesspi

…suggestions (#231904) (#232674)

# Backport

This will backport the following commits from `main` to `9.1`:

- [[Security Assistant] Add field type badge to Index Entry field suggestions (#231904)](#231904)

### Questions ?

Please refer to the [Backport tool documentation](https://github.com/sorenlouv/backport)

Co-authored-by: Garrett Spong

@benironside

… Base Index Entries in deployments with a large number of indices/mappings (elastic#231376) ## Summary This PR fixes an issue with the Security Assistant KB Index Entries interface introduced in `9.1` where a large number of indices/mappings could result in Kibana crashing and preventing the creation of the Index Entry. This is technically a 'fix-hancement' as we are bypassing the underlying issue altogether by switching to use common core API's for index/field suggestions (just as is done in the Discover 'Create a data view' interface), and in turn supporting all fields of type `text`, not just `semantic_text` (elastic#230863). ### Original Issue The underlying issue here was introduced by a change to the `field_caps` API in `9.1` (elastic/elasticsearch#127664) that resulted in the `/internal/elastic_assistant/knowledge_base/_indices` route not finding any indices with a `semantic_text` field, and thus inadvertently falling back to doing a full scan of all mappings ([source](https://github.com/elastic/kibana/blob/b128cee4ee7ccc367e8acf159dbf58a75f081867/x-pack/solutions/security/plugins/elastic_assistant/server/routes/knowledge_base/get_knowledge_base_indices.ts#L69)). A fix to this API was initially investigated, but there was no reasonable API available for fetching all occurrences of `semantic_text` fields, so with `match` queries adding support for `semantic_text` in `8.18`, it was decided to go ahead and enable support for all `text` fields. ### Fix Details The `Index` input field now uses the `dataViews.getIndices()` API for suggestions (instead of the `useKnowledgeBaseIndices` hook/route), which is backed by the [resolve indices ES API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-resolve-index). System indices are filtered out with the `*,-.*` filter. The initial call will return all indices just as the Discover 'Create a data view' interface, which is further filtered upon as the user continues to type. 
Note: Discover makes subsequent calls upon further user input, though I'm not entirely sure this is necessary here as all indices are initially returned and available for client-side filtering within the input. I will perform further stress testing with many indices/mappings to confirm. The `Field` input now uses the fields already queried for the `Output fields` input suggestions (via `dataViews.getFieldsForWildcard()`), just filtered to those whose `field.esTypes?.includes('text')`. This is potentially still a hot path for client-side code with many mappings, so I will also confirm this with further stress testing. ### Docs @elastic/security-docs / @benironside, we will need to update the Security Assistant KB docs [here](https://www.elastic.co/docs/solutions/security/ai/ai-assistant-knowledge-base#knowledge-base-add-knowledge-index) to indicate that any `text` field is now supported for retrieval. ### Testing #### Functional Testing To confirm proper retrieval of both `text` and `semantic_text` fields we'll need to create an index, add a document, then create the KB Index Entry. Then update the KB Index Entry to reference the other field type, and test again. 
<details><summary>Create Index</summary> ``` JSON PUT project-details { "mappings": { "properties": { "project_issue": { "type": "text" }, "project_name": { "type": "text" }, "summary": { "type": "semantic_text", "inference_id": ".elser-2-elasticsearch", "model_settings": { "service": "elasticsearch", "task_type": "sparse_embedding" } } } } } ``` </details> <details><summary>Create Sample Doc</summary> ``` JSON PUT project-details/_doc/doc1 { "project_issue": "The main issue at hand is the breaking of the space plane", "project_name": "Issue 5", "summary": "This is a summary that contains the word yellow" } ``` </details> <details><summary>Create KB Index Entry (Text)</summary> ``` JSON POST kbn:api/security_ai_assistant/knowledge_base/entries { "type": "index", "name": "Project Details Tool", "index": "project-details", "field": "project_issue", "outputFields": [], "description": "Use this index to answer questions about any project details.", "queryDescription": "Key terms to search for from the user's prompt." } ``` </details> Now you can open the Assistant and perform a query like: ``` Do I have any project details about the issue at hand? ``` Which should result in a [trace like this](https://smith.langchain.com/public/208338a6-42f3-4a9d-bc4c-0ff38d06d34c/r) which calls the generated tool that will then perform a _lexical search_ against the configured index. Ensure citations work as expected for the returned document. Now open the Index Entry in the KB Settings UI and change the `field` to from `project_issue` to `summary` for testing `semantic_text`. Open the Assistant and perform a query like: ``` Do I have any project details for Project Yellow? ``` Which should result in a [trace like this](https://smith.langchain.com/public/3041a48b-5dfd-4c89-9f51-0bcdffd38f63/r) which calls the generated tool that will then perform a _semantic search_ against the configured index. Ensure citations work as expected for the returned document. 
#### Performance Testing ⚠️ In progress -- I will include a script for generating many indices/mappings for testing, and also prepare the `ci-cloud-deploy` instance with the same setup for confirmation. ### Checklist Check the PR satisfies following conditions. Reviewers should verify this PR satisfies this list as well. - [X] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md) - [ ] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials * Will coordinate with @elastic/security-docs on the docs update here. - [X] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios --------- Co-authored-by: kibanamachine <[email protected]>

…elastic#231725) ## Summary This is a follow-up to elastic#231376 where I added new `docLinks` for the Security Solution AI Assistant. This PR removes the `TODO` added in that PR and creates a new `securitySolution.aiAssistant` grouping so we can more easily add docLinks as needed. Reviewers Note: Sorry for the extra noise here -- I thought there were more references so decided to do in another PR. Turns out there were not, so this is a quick one!

@jamesspi

…tions (elastic#231904) ## Summary Small follow-up improvement to elastic#231376 which added support for `text` fields to Index Entries. This PR adds the field type as a badge in the suggestions so users will know if a semantic or lexical search will be performed (so they can adapt the query instructions accordingly). Note: Needed to update the field API request from `dataViews.getFieldsForWildcard` (which called `/internal/data_views/_fields_for_wildcard`) to use `/api/index_management/mapping/[indexName]` as the former did not have the option to include field type. I confirmed no new privileges were necessary for this API, and the user just needs the same index privileges as before. cc @jamesspi Field Options: <img width="500" src="https://github.com/user-attachments/assets/f138c7f0-1d89-4946-8d27-fa6c9c49c60b" /> Output Field Options: <img width="500" src="https://github.com/user-attachments/assets/2b0395e5-d71d-43af-8a23-9bacc4b02b54" /> --- As part of this PR I've also included the helper script from elastic#231376 for testing these large index/mapping scenarios. This script was almost entirely written in a collab session with `gemini-cli`, and is located in: > x-pack/solutions/security/plugins/elastic_assistant/scripts Options include: ``` bash Elasticsearch Index/Mapping Populator and Cleanup Script Usage: node stress_test_mappings.js [options] node stress_test_mappings.js --cleanup node stress_test_mappings.js --delete-by-count <number> Description: This script stress-tests an Elasticsearch instance by creating a large number of indices with many fields. It can also clean up the indices it creates. 
Creation Options: --host <url> Elasticsearch host URL (default: http://localhost:9200) --user <username> Username for basic auth (default: elastic) --pass <password> Password for basic auth (default: changeme) --apiKey <key> API key for authentication (overrides user/pass) --indices <number> Number of indices to create (default: 5000) --mappings <number> Number of mappings per index (default: 5000) --maxFields <number> The max number of fields per index (default: same as --mappings) --shards <number> Number of primary shards per index (default: 1) --replicas <number> Number of replicas per index (default: 0) Cleanup & Recovery Options: --cleanup Delete all indices created by this script. --delete-by-count <N> Delete the <N> newest stress-test indices. --yes Bypass confirmation prompt during cleanup. Other Options: -h, --help Show this help message ``` And some test executions are as follows. First CD into the assistant working directory: ``` cd x-pack/solutions/security/plugins/elastic_assistant/ ``` ##### Populate your local ES -- defaults to 5000 indices and 5000 mappings _per_ index. This _will cause_ a default local ES to crash, so stop early (~569), or change configuration :) ``` bash yarn stress-test-mappings ``` ##### If your ES is at its limits, you can slowly dial back the index count with the following: ``` bash yarn stress-test-mappings --delete-by-count 50 --yes ``` ##### Or cleanup all the indices you created entirely with: ``` bash yarn stress-test-mappings --cleanup --yes ``` ##### And for a cloud install, create an API key and populate with the following: ``` bash yarn stress-test-mappings -host https://stress-test.es.us-west2.gcp.elastic-cloud.com --apiKey APK_KEY_HERE ``` > [!IMPORTANT] > This is a quick utility script and may be buggy! 
Continue to vibe code it as you see fit, but it worked for my needs here for testing and validating this issue and fix 🙂 ### Checklist - [X] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios --------- Co-authored-by: kibanamachine <[email protected]>

@jamesspi

…tions (elastic#231904)

## Summary

Small follow-up improvement to elastic#231376, which added support for `text` fields to Index Entries. This PR adds the field type as a badge in the suggestions so users know whether a semantic or lexical search will be performed (and can adapt the query instructions accordingly).

Note: The field API request had to change from `dataViews.getFieldsForWildcard` (which called `/internal/data_views/_fields_for_wildcard`) to `/api/index_management/mapping/[indexName]`, as the former had no option to include the field type. I confirmed that no new privileges are necessary for this API; the user just needs the same index privileges as before.

cc @jamesspi

Field Options:

<img width="500" src="https://github.com/user-attachments/assets/f138c7f0-1d89-4946-8d27-fa6c9c49c60b" />

Output Field Options:

<img width="500" src="https://github.com/user-attachments/assets/2b0395e5-d71d-43af-8a23-9bacc4b02b54" />

---

As part of this PR I've also included the helper script from elastic#231376 for testing these large index/mapping scenarios. This script was almost entirely written in a collab session with `gemini-cli`, and is located in:

> x-pack/solutions/security/plugins/elastic_assistant/scripts

Options include:

``` bash
Elasticsearch Index/Mapping Populator and Cleanup Script

Usage:
  node stress_test_mappings.js [options]
  node stress_test_mappings.js --cleanup
  node stress_test_mappings.js --delete-by-count <number>

Description:
  This script stress-tests an Elasticsearch instance by creating a large number
  of indices with many fields. It can also clean up the indices it creates.

Creation Options:
  --host <url>            Elasticsearch host URL (default: http://localhost:9200)
  --user <username>       Username for basic auth (default: elastic)
  --pass <password>       Password for basic auth (default: changeme)
  --apiKey <key>          API key for authentication (overrides user/pass)
  --indices <number>      Number of indices to create (default: 5000)
  --mappings <number>     Number of mappings per index (default: 5000)
  --maxFields <number>    The max number of fields per index (default: same as --mappings)
  --shards <number>       Number of primary shards per index (default: 1)
  --replicas <number>     Number of replicas per index (default: 0)

Cleanup & Recovery Options:
  --cleanup               Delete all indices created by this script.
  --delete-by-count <N>   Delete the <N> newest stress-test indices.
  --yes                   Bypass confirmation prompt during cleanup.

Other Options:
  -h, --help              Show this help message
```

Some test executions follow. First `cd` into the assistant working directory:

```
cd x-pack/solutions/security/plugins/elastic_assistant/
```

##### Populate your local ES -- defaults to 5000 indices and 5000 mappings _per_ index. This _will cause_ a default local ES to crash, so stop early (~569), or change configuration :)

``` bash
yarn stress-test-mappings
```

##### If your ES is at its limits, you can slowly dial back the index count with the following:

``` bash
yarn stress-test-mappings --delete-by-count 50 --yes
```

##### Or clean up all the indices you created entirely with:

``` bash
yarn stress-test-mappings --cleanup --yes
```

##### And for a cloud install, create an API key and populate with the following:

``` bash
yarn stress-test-mappings --host https://stress-test.es.us-west2.gcp.elastic-cloud.com --apiKey API_KEY_HERE
```

> [!IMPORTANT]
> This is a quick utility script and may be buggy! Continue to vibe code it as you see fit, but it worked for my needs here for testing and validating this issue and fix 🙂

### Checklist

- [X] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine <[email protected]>
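To illustrate the field type badge described in the commit message above, here is a minimal sketch of how `text` and `semantic_text` fields, along with their types, might be pulled out of a mapping response. This is not the code from the PR: the helper name `collectTextFields`, the sample data, and the assumed response shape (`{ mappings: { properties: ... } }`, as in the Elasticsearch get mapping API) are all hypothetical.

```javascript
// Hypothetical helper, not Kibana's implementation.
// Walks a mapping `properties` tree and returns every `text` or
// `semantic_text` field together with its type, so a UI could show
// the type as a badge next to each suggestion.
const SEARCHABLE_TYPES = new Set(['text', 'semantic_text']);

function collectTextFields(properties = {}, prefix = '') {
  const results = [];
  for (const [name, definition] of Object.entries(properties)) {
    const path = prefix ? `${prefix}.${name}` : name;
    if (SEARCHABLE_TYPES.has(definition.type)) {
      results.push({ name: path, type: definition.type });
    }
    // Object fields nest their children under `properties`.
    if (definition.properties) {
      results.push(...collectTextFields(definition.properties, path));
    }
  }
  return results;
}

// Assumed response shape, for illustration only.
const sampleMapping = {
  mappings: {
    properties: {
      title: { type: 'text' },
      body: { type: 'semantic_text' },
      created_at: { type: 'date' },
      author: {
        properties: { bio: { type: 'text' }, id: { type: 'keyword' } },
      },
    },
  },
};

console.log(collectTextFields(sampleMapping.mappings.properties));
```

This sketch only descends into object fields via `properties`; multi-fields declared under `fields` are ignored, which may or may not match what the suggestions in the PR actually include.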

Add support for all text fields to KB Index Entries

b4fc185

spong self-assigned this Aug 11, 2025

spong requested a review from a team as a code owner August 11, 2025 22:49

spong requested a review from e40pud August 11, 2025 23:01

spong added 2 commits August 12, 2025 17:23

Update index input label, include docLink, and remove the now unused get kb indices route

2146bbd

Merge branch 'main' of github.com:elastic/kibana into index-entry-text-support

cfcdbf6

spong requested a review from a team as a code owner August 12, 2025 23:24

Lint plus i18n fixes

85e059a

florent-leborgne approved these changes Aug 13, 2025

e40pud approved these changes Aug 13, 2025

spong and others added 2 commits August 13, 2025 08:55

Fix docLinks mock

e674ff5

[CI] Auto-commit changed files from 'node scripts/notice'

e01b3e1

spong commented Aug 13, 2025

spong added v9.1.3 and removed v9.1.2 labels Aug 13, 2025

spong merged commit 4496b50 into elastic:main Aug 13, 2025
16 checks passed

spong deleted the index-entry-text-support branch August 13, 2025 22:33

spong mentioned this pull request Aug 13, 2025

[9.1] [Security Assistant] Fixes issue preventing the creation of Knowledge Base Index Entries in deployments with a large number of indices/mappings (#231376) #231717

Merged

spong mentioned this pull request Aug 14, 2025

[Security Assistant] Breakout Security Solution AI Assistant docLinks #231725

Merged

spong mentioned this pull request Aug 14, 2025

[Security Assistant] Add field type badge to Index Entry field suggestions #231904

Merged

1 task

spong mentioned this pull request Aug 19, 2025

[Internal]: Update Security Assistant Knowledge Base Docs to include added support of text fields for Index Entries elastic/docs-content#2628

Closed

spong mentioned this pull request Oct 16, 2025

[Security Solution] [Security Assistant] Security Assistant Index Entry form suggestions can be incorrect #239429

Closed

spong commented Aug 11, 2025 • edited

Summary

Original Issue

Fix Details

Docs

Testing

Functional Testing

Performance Testing

Checklist

e40pud commented Aug 12, 2025

e40pud commented Aug 12, 2025

Testing setup

Results

1. Who are the authors of the GTR 2024?

2. What is the forecast for the coming year in GTR 2024?

3. What are top 10 Process Injection by rules in Windows endpoints in GTR 2024?

4. What is the most widely adopted cloud service provider this year according to GTR 2024?

5. Give a brief conclusion of the GTR 2024

florent-leborgne left a comment

e40pud left a comment

elasticmachine commented Aug 13, 2025 • edited

💚 Build Succeeded

Metrics [docs]

Module Count

Public APIs missing comments

Async chunks

Page load bundle

API count

History

spong Aug 13, 2025

kibanamachine commented Aug 13, 2025

kibanamachine commented Aug 13, 2025

💔 All backports failed

Manual backport

Questions ?

spong commented Aug 13, 2025

💚 All backports created successfully

Questions ?

5 participants
