@spong spong commented Aug 11, 2025

Summary

This PR fixes an issue with the Security Assistant KB Index Entries interface, introduced in 9.1, where a large number of indices/mappings could crash Kibana and prevent the creation of the Index Entry.

This is technically a 'fix-hancement' as we are bypassing the underlying issue altogether by switching to common core APIs for index/field suggestions (just as is done in the Discover 'Create a data view' interface), and in turn supporting all fields of type text, not just semantic_text (#230863).

Original Issue

The underlying issue here was introduced by a change to the field_caps API in 9.1 (elastic/elasticsearch#127664) that resulted in the /internal/elastic_assistant/knowledge_base/_indices route not finding any indices with a semantic_text field, and thus inadvertently falling back to doing a full scan of all mappings (source). A fix to this API was initially investigated, but there was no reasonable API available for fetching all occurrences of semantic_text fields, so with match queries adding support for semantic_text in 8.18, it was decided to go ahead and enable support for all text fields.

Fix Details

The Index input field now uses the dataViews.getIndices() API for suggestions (instead of the useKnowledgeBaseIndices hook/route), which is backed by the resolve indices ES API. System indices are filtered out with the *,-.* filter. The initial call returns all indices, just as the Discover 'Create a data view' interface does, and the list is then filtered further as the user continues to type. Note: Discover makes subsequent calls upon further user input, though I'm not entirely sure this is necessary here as all indices are initially returned and available for client-side filtering within the input. I will perform further stress testing with many indices/mappings to confirm.
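The client-side filtering described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: INDEX_PATTERN mirrors the *,-.* filter mentioned above, while filterIndexSuggestions and the sample index names are hypothetical.

```javascript
// Pattern from the description: match everything, then exclude
// dot-prefixed (system/hidden) indices.
const INDEX_PATTERN = '*,-.*';

// Hypothetical helper: narrow an already-fetched list of index names as the
// user types, without issuing another request.
function filterIndexSuggestions(indexNames, typedValue) {
  const needle = typedValue.trim().toLowerCase();
  if (needle === '') {
    // Nothing typed yet: show everything the initial call returned.
    return [...indexNames];
  }
  return indexNames.filter((name) => name.toLowerCase().includes(needle));
}

// Sample data standing in for the response to the initial call.
const allIndices = ['logs-app', 'project-details', 'project-archive', 'metrics-host'];
const suggestions = filterIndexSuggestions(allIndices, 'proj');
console.log(suggestions);
```

With every index name already in memory, narrowing the list is a single array pass per keystroke, which is why a follow-up request per keystroke may be unnecessary here.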

The Field input now uses the fields already queried for the Output fields input suggestions (via dataViews.getFieldsForWildcard()), filtered to those for which field.esTypes?.includes('text') is true. This is potentially still a hot path for client-side code with many mappings, so I will also confirm this with further stress testing.
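As a rough sketch of that filter, assuming field entries that carry a name and an optional esTypes array: the sample data and the selectTextFields helper below are hypothetical, and only the esTypes?.includes('text') predicate comes from the description above.

```javascript
// Hypothetical field specs: each has a name and an optional list of
// Elasticsearch types.
const sampleFields = [
  { name: 'project_issue', esTypes: ['text'] },
  { name: 'project_name', esTypes: ['text'] },
  { name: 'status', esTypes: ['keyword'] },
  { name: 'created_at', esTypes: ['date'] },
  { name: 'runtime_only' }, // no esTypes at all
];

// Keep only fields that report the `text` type. Optional chaining guards
// against entries that have no esTypes.
function selectTextFields(fields) {
  return fields.filter((field) => field.esTypes?.includes('text') === true);
}

const textFieldNames = selectTextFields(sampleFields).map((field) => field.name);
console.log(textFieldNames);
```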

Docs

@elastic/security-docs / @benironside, we will need to update the Security Assistant KB docs here to indicate that any text field is now supported for retrieval. Docs issue here: elastic/docs-content#2628

Testing

Functional Testing

To confirm proper retrieval of both text and semantic_text fields we'll need to create an index, add a document, then create the KB Index Entry. Then update the KB Index Entry to reference the other field type, and test again.

Create Index

PUT project-details
{
  "mappings": {
    "properties": {
      "project_issue": {
        "type": "text"
      },
      "project_name": {
        "type": "text"
      },
      "summary": {
        "type": "semantic_text",
        "inference_id": ".elser-2-elasticsearch",
        "model_settings": {
          "service": "elasticsearch",
          "task_type": "sparse_embedding"
        }
      }
    }
  }
}

Create Sample Doc

PUT project-details/_doc/doc1
{
    "project_issue": "The main issue at hand is the breaking of the space plane",
    "project_name": "Issue 5",
    "summary": "This is a summary that contains the word yellow"
}

Create KB Index Entry (Text)

POST kbn:api/security_ai_assistant/knowledge_base/entries
{
  "type": "index",
  "name": "Project Details Tool",
  "index": "project-details",
  "field": "project_issue",
  "outputFields": [],
  "description": "Use this index to answer questions about any project details.",
  "queryDescription": "Key terms to search for from the user's prompt."
}

Now you can open the Assistant and perform a query like:

Do I have any project details about the issue at hand?

This should result in a trace like this, which calls the generated tool that then performs a lexical search against the configured index. Ensure citations work as expected for the returned document.

Now open the Index Entry in the KB Settings UI and change the field from project_issue to summary for testing semantic_text.

Open the Assistant and perform a query like:

Do I have any project details for Project Yellow?

This should result in a trace like this, which calls the generated tool that then performs a semantic search against the configured index. Ensure citations work as expected for the returned document.
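Because match queries accept both text and semantic_text fields (per the note above about 8.18), the same request shape can serve both test cases. The sketch below only illustrates that idea: buildMatchSearch is a hypothetical helper, and the query the generated tool actually sends may differ.

```javascript
// Hypothetical helper: build a search body that runs a `match` query against
// one field. The field may be a `text` field (lexical search) or a
// `semantic_text` field (semantic search); the request shape is the same.
function buildMatchSearch(field, terms) {
  return {
    query: {
      match: { [field]: terms },
    },
  };
}

// Field names taken from the project-details mapping above.
const lexical = buildMatchSearch('project_issue', 'issue at hand');
const semantic = buildMatchSearch('summary', 'Project Yellow');

console.log(JSON.stringify(lexical, null, 2));
console.log(JSON.stringify(semantic, null, 2));
```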

Performance Testing

⚠️ In progress -- I will include a script for generating many indices/mappings for testing, and also prepare the ci-cloud-deploy instance with the same setup for confirmation.

Checklist

Check that the PR satisfies the following conditions.

Reviewers should verify this PR satisfies this list as well.

@spong spong self-assigned this Aug 11, 2025
@spong spong requested a review from a team as a code owner August 11, 2025 22:49
@spong spong added the bug, release_note:fix, needs_docs, sdh-linked, ci:cloud-deploy, Team:Security Generative AI, backport:version, v9.2.0, and v9.1.2 labels Aug 11, 2025
@spong spong requested a review from e40pud August 11, 2025 23:01

e40pud commented Aug 12, 2025

Since /internal/elastic_assistant/knowledge_base/_indices is internal and used only in IndexEntryEditor, should we remove it and the related hooks completely?


e40pud commented Aug 12, 2025

Thanks for the fix!! Changes LGTM

Below are local testing results

Testing setup

  1. Uploaded a PDF document (Elastic Global Threat Report 2024) and copied its content into an additional content field of type semantic_text
  2. Created a new index entry in the KB
    • Data Description: "Use this tool to answer questions about the Elastic Global Threat Report (GTR) 2024"
    • Query Instruction: "Key terms to return data relevant to the Elastic Global Threat Report (GTR) 2024"
  3. Asked the assistant the following five questions:
    1. Who are the authors of the GTR 2024?
    2. What is the forecast for the coming year in GTR 2024?
    3. What are top 10 Process Injection by rules in Windows endpoints in GTR 2024?
    4. What is the most widely adopted cloud service provider this year according to GTR 2024?
    5. Give a brief conclusion of the GTR 2024

Results

1. Who are the authors of the GTR 2024?

🟡 The assistant was not able to find the authors of the document.

Screenshot 2025-08-12 at 11 24 17

First of all, I remember this working well a few versions ago, when the assistant was able to list all the contributors.

It looks like something has changed in how user input is converted into a meaningful input for the KB tool call. Right now (in the case of GPT-4.1) the question above is converted into just "authors" and sent as the input to the tool (here is the example trace).

When I tried to search within the index using the original question, I was able to see authors as part of the highlighted fragments:

Semantic search

GET elastic-global-threat-report-2024/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "content": "Who are the authors of the GTR 2024?"
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "content": {
        "number_of_fragments": 2,
        "order": "score"
      }
    }
  }
}

Output

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 8.70585,
    "hits": [
      {
        "_index": "elastic-global-threat-report-2024",
        "_id": "u5x4nZgB7B6Is1yyNd5u",
        "_score": 8.70585,
        "_ignored": [
          "attachment.content.keyword"
        ],
        "_source": {
          "attachment": {
            "date": "2024-09-30T21:51:48Z",
            "content_type": "application/pdf",
            "format": "application/pdf; version=1.3",
            "modified": "2024-10-01T09:38:32Z",
            "language": "en",
            "metadata_date": "2024-10-01T09:38:32Z",
            "creator_tool": "Adobe InDesign 19.5 (Windows)",
            "content": """..."""
          }
        },
        "highlight": {
          "content": [
            """2024

Global
Threat
Report



Table of Contents
Introduction

Generative AI

	 Threat overview

	 Augmenting defenders

Malware Detections

	 Distribution by operation system

	 Malware categories

Endpoint Behaviors

	 Distribution by operating system

	 Distribution by tactic

Cloud Security

	 Distribution by cloud service provider

	 Benchmarking cloud security posture

Threat Profiles

	 REF5961 — BLOODALCHEMY, RUDEBIRD, EAGERBEE, DOWNTOWN

	 REF8207 — GHOSTPULSE

	 REF4578 — GHOSTENGINE

	 REF7001 — KANDYKORN

	 REF6127 — WARMCOOKIE

Responding to 2023 Forecasts

Forecasts and Recommendations

Conclusion

03

04

04

05

06

06

07

11

11

12

36

37

47

57

58

61

64

66

69

72

75

79

1

2

3

4

5

6

7

8

9

2024 Elastic Global Threat Report



Introduction1

2024 Elastic Global Threat Report

03

Introduction

With the best technologies, the most
widespread information distribution, and the
greatest public awareness of threats all in
motion, the security environment is stronger
than ever. Yet, almost in spite of these things,
threat ecosystems are thriving like never before.

Truthfully, the threat landscape is dynamic
and reactive — a new technique empowers
a previously unknown threat group, vendors
swarm to mitigate that threat and create new
technologies in the process, operators on both
""",
            """https://www.elastic.co/security
https://www.elastic.co/security-labs
https://x.com/elasticseclabs


Conclusion9

2024 Elastic Global Threat Report

80

The 2024 Elastic Global Threat Report features insights and expertise from across the
Elastic organization. We’d like to thank the following Elasticians for their contributions:

Mika Ayenson

Samir Bousseaden

Terrance DeJesus

Chris Donaher

Tinsae Erkailo

Ayoub Faouzi

Eric Forte

Ruben Groenewoud

Justin Ibarra

Devon Kerr

Jake King

Shashank Suryanarayana

Mark Mager

Asuka Nakajima

Andrew Pease

John Uhlmann

Alyssa VanNice

Colson Wilhoit



© 2024. Elasticsearch B.V. All Rights Reserved.
Elastic, Elasticsearch and other related marks are trademarks, logos or registered trademarks of Elasticsearch B.V. in the United States
and other countries. Microsoft, Azure, Windows and other related marks are trademarks of the Microsoft group of companies. Amazon
Web Services, AWS, and other related marks are trademarks of Amazon.com, Inc. or its affiliates. All other brand names, product
names, or trademarks belong to their respective owners.

2024

Global
Threat
Report


	TOC
	Introduction
	Generative AI
	Threat overview
	Augmenting defenders

	Malware Detections
	Malware categories

	Endpoint Behaviors
	Distribution by tactic

	Cloud Security
	Distribution by cloud service provider
	Benchmarking cloud security posture

	Threat Profiles
	REF5961
	REF8207
	REF4578
	REF7001
	REF6127

	Responding to 2023 Forecasts
	Forecasts and Recommendations
	Conclusion"""
          ]
        }
      }
    ]
  }
}

This issue is not related to these changes, but I wanted to highlight it. We might want to look into this and see how we can improve the experience.

2. What is the forecast for the coming year in GTR 2024?

✅ Great answer based on KB index entry

Screenshot 2025-08-12 at 11 24 41

3. What are top 10 Process Injection by rules in Windows endpoints in GTR 2024?

✅ Great answer based on KB index entry

Screenshot 2025-08-12 at 11 24 50

4. What is the most widely adopted cloud service provider this year according to GTR 2024?

✅ Great answer based on KB index entry

Screenshot 2025-08-12 at 11 24 58

5. Give a brief conclusion of the GTR 2024

✅ Great answer based on KB index entry

Screenshot 2025-08-12 at 11 25 04

@spong spong requested a review from a team as a code owner August 12, 2025 23:24

@florent-leborgne florent-leborgne left a comment

Doc link LGTM 🔗


@e40pud e40pud left a comment

Thanks for all changes!! LGTM 🚀


elasticmachine commented Aug 13, 2025

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules lead to a faster build time

id before after diff
alerting 315 314 -1
apm 1966 1965 -1
automaticImport 804 802 -2
cases 1132 1131 -1
datasetQuality 817 816 -1
discover 1384 1383 -1
elasticAssistant 460 458 -2
embeddableAlertsTable 514 513 -1
infra 1526 1525 -1
ml 2500 2499 -1
monitoring 725 724 -1
observability 1394 1393 -1
observabilityAIAssistantApp 437 436 -1
observabilityShared 308 307 -1
securitySolution 7886 7884 -2
slo 1235 1234 -1
stackAlerts 278 277 -1
synthetics 1340 1339 -1
timelines 248 247 -1
transform 784 783 -1
triggersActionsUi 965 964 -1
uptime 868 867 -1
total -25

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
@kbn/elastic-assistant-common 679 676 -3

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
aiAssistantManagementSelection 78.5KB 78.6KB +128.0B
lists 125.6KB 125.8KB +128.0B
securitySolution 10.4MB 10.4MB -240.0B
total +16.0B

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
core 499.9KB 500.0KB +128.0B
securitySolution 95.3KB 95.3KB -2.0B
total +126.0B
Unknown metric groups

API count

id before after diff
@kbn/elastic-assistant-common 793 790 -3

History

cc @spong

Member Author

Alrighty @e40pud! 👋

Vibed with gemini-cli and here's a nice little node script for generating a buncha indices/mappings. It generates indices with names starting with a-z (for testing sorting/filtering), then generates mappings using all the different field types. Save the below to a file:

populate_es.js

#!/usr/bin/env node

const http = require('http');
const https = require('https');
const readline = require('readline');

function parseArgs() {
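  const args = process.argv.slice(2).reduce((acc, arg, i, arr) => {
    if (arg === '-h') {
      // Short flag advertised in the help text; the `--` branch below would
      // otherwise never see it.
      acc.help = true;
    } else if (arg.startsWith('--')) {
      const key = arg.slice(2);
      const next = arr[i + 1];
      // A flag followed by a non-flag token takes that token as its value;
      // otherwise it is treated as a boolean switch.
      if (next && !next.startsWith('--')) {
        acc[key] = next;
      } else {
        acc[key] = true;
      }
    }
    return acc;
  }, {});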

  if (args.help || args.h) {
    console.log(`
    Elasticsearch Index/Mapping Populator and Cleanup Script

    Usage:
      node populate_es.js [options]
      node populate_es.js --cleanup
      node populate_es.js --delete-by-count <number>

    Description:
      This script stress-tests an Elasticsearch instance by creating a large number
      of indices with many fields. It can also clean up the indices it creates.

    Creation Options:
      --host <url>          Elasticsearch host URL (default: http://localhost:9200)
      --user <username>     Username for basic auth (default: elastic)
      --pass <password>     Password for basic auth (default: changeme)
      --apiKey <key>        API key for authentication (overrides user/pass)
      --indices <number>    Number of indices to create (default: 5000)
      --mappings <number>   Number of mappings per index (default: 5000)
      --maxFields <number>  The max number of fields per index (default: same as --mappings)
      --shards <number>     Number of primary shards per index (default: 1)
      --replicas <number>   Number of replicas per index (default: 0)

    Cleanup & Recovery Options:
      --cleanup             Delete all indices created by this script.
      --delete-by-count <N> Delete the <N> newest stress-test indices.
      --yes                 Bypass confirmation prompt during cleanup.

    Other Options:
      -h, --help            Show this help message
    `);
    process.exit(0);
  }

  return {
    host: args.host || 'http://localhost:9200',
    user: args.user || 'elastic',
    pass: args.pass || 'changeme',
    apiKey: args.apiKey,
    indices: parseInt(args.indices, 10) || 5000,
    mappings: parseInt(args.mappings, 10) || 5000,
    maxFields: parseInt(args.maxFields, 10) || parseInt(args.mappings, 10) || 5000,
    shards: parseInt(args.shards, 10) || 1,
    replicas: parseInt(args.replicas, 10) || 0,
    cleanup: !!args.cleanup,
    deleteByCount: parseInt(args['delete-by-count'], 10) || 0,
    yes: !!args.yes,
  };
}

const config = parseArgs();

const simpleFieldTypes = [
  { type: 'text' },
  { type: 'keyword' },
  { type: 'long' },
  { type: 'integer' },
  { type: 'short' },
  { type: 'byte' },
  { type: 'double' },
  { type: 'float' },
  { type: 'half_float' },
  { type: 'scaled_float', scaling_factor: 100 },
  { type: 'date' },
  { type: 'date_nanos' },
  { type: 'boolean' },
  { type: 'binary' },
  { type: 'geo_point' },
  { type: 'ip' },
  { type: 'completion' },
  { type: 'token_count', analyzer: 'standard' },
];

const complexFieldTypes = [
  { type: 'integer_range' },
  { type: 'float_range' },
  { type: 'long_range' },
  { type: 'double_range' },
  { type: 'date_range' },
  { type: 'geo_shape' },
  { type: 'search_as_you_type' },
  { type: 'dense_vector', dims: 4 },
  { type: 'semantic_text' },
];

function generateIndexBody(numMappings, maxFields, numShards, numReplicas) {
  const properties = {};
  let fieldCount = 0;

  for (const fieldType of complexFieldTypes) {
    if (fieldCount >= numMappings) break;
    properties[`complex_${fieldType.type}_${fieldCount}`] = { ...fieldType };
    fieldCount++;
  }

  while (fieldCount < numMappings) {
    const fieldTypeDefinition = simpleFieldTypes[fieldCount % simpleFieldTypes.length];
    properties[`field_${fieldCount}`] = { ...fieldTypeDefinition };
    fieldCount++;
  }

  return {
    settings: {
      'index.mapping.total_fields.limit': maxFields,
      'index.number_of_shards': numShards,
      'index.number_of_replicas': numReplicas,
    },
    mappings: { properties },
  };
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function makeRequest(method, path, body, retries = 3, delay = 1000) {
  for (let i = 0; i < retries; i++) {
    try {
      return await new Promise((resolve, reject) => {
        const url = new URL(config.host);
        const protocol = url.protocol === 'https:' ? https : http;
        const options = {
          hostname: url.hostname,
          port: url.port,
          path,
          method,
          headers: { 'Content-Type': 'application/json' },
        };

        if (config.apiKey) {
          options.headers.Authorization = `ApiKey ${config.apiKey}`;
        } else if (config.user && config.pass) {
          const auth = 'Basic ' + Buffer.from(config.user + ':' + config.pass).toString('base64');
          options.headers.Authorization = auth;
        }

        const req = protocol.request(options, (res) => {
          let data = '';
          res.on('data', (chunk) => (data += chunk));
          res.on('end', () => {
            if (res.statusCode >= 200 && res.statusCode < 300) {
              try {
                resolve({ statusCode: res.statusCode, body: JSON.parse(data || '{}') });
              } catch (e) {
                reject(new Error('Failed to parse JSON response.'));
              }
            } else {
              const err = new Error(`Request failed with status code ${res.statusCode}: ${data}`);
              if (data.includes('resource_already_exists_exception')) {
                err.isAlreadyExists = true;
              }
              if ([429, 503, 504].includes(res.statusCode)) {
                err.isRetryable = true;
              }
              reject(err);
            }
          });
        });

        req.on('error', (e) => reject(e));
        if (body) req.write(JSON.stringify(body));
        req.end();
      });
    } catch (error) {
      if (error.isAlreadyExists || !error.isRetryable || i === retries - 1) {
        throw error;
      }
      await sleep(delay);
      delay *= 2;
    }
  }
}

async function cleanupIndices() {
  console.log('Starting cleanup of stress-test indices...');
  if (!config.yes) {
    const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
    await new Promise((resolve) => {
      rl.question(
        'Are you sure you want to delete all indices with the pattern "*-stress-test-index-*"? (y/N) ',
        (answer) => {
          if (answer.toLowerCase() !== 'y') {
            console.log('Cleanup cancelled.');
            process.exit(0);
          }
          rl.close();
          resolve();
        }
      );
    });
  }

  try {
    const { body } = await makeRequest('DELETE', '/*-stress-test-index-*');
    console.log('Cleanup successful:', body);
  } catch (error) {
    if (error.message.includes('404')) {
      console.log('No stress-test indices found to delete.');
    } else {
      console.error('An error occurred during cleanup:', error.message);
      process.exit(1);
    }
  }
}

async function deleteIndicesByCount(count) {
  console.log(`Fetching the ${count} newest stress-test indices to delete...`);
  try {
    const { body } = await makeRequest(
      'GET',
      `/_cat/indices/*-stress-test-index-*?h=index&s=creation.date:desc&format=json`
    );
    const indices = body.map((item) => item.index);

    if (indices.length === 0) {
      console.log('No stress-test indices found to delete.');
      return;
    }

    const batchToDelete = indices.slice(0, count);
    console.log(`Deleting ${batchToDelete.length} indices: ${batchToDelete.join(', ')}`);
    await makeRequest('DELETE', `/${batchToDelete.join(',')}`);
    console.log('Deletion successful.');
  } catch (e) {
    console.error('\n[FATAL] Could not get or delete indices:', e.message);
    process.exit(1);
  }
}

async function createIndices() {
  console.log('Starting to populate Elasticsearch...');
  console.log('Configuration:', {
    ...config,
    pass: '***',
    apiKey: config.apiKey ? '***' : undefined,
  });

  const alphabet = 'abcdefghijklmnopqrstuvwxyz';
  let createdCount = 0;
  let skippedCount = 0;
  const total = config.indices;
  const barWidth = 40;

  for (let i = 0; i < total; i++) {
    const indexName = `${alphabet[i % alphabet.length]}-stress-test-index-${String(i).padStart(
      5,
      '0'
    )}`;
    const percent = (i + 1) / total;
    const filledWidth = Math.round(barWidth * percent);
    const bar = `[${'█'.repeat(filledWidth)}${'-'.repeat(barWidth - filledWidth)}]`;
    const percentStr = `${(percent * 100).toFixed(1)}%`;

    readline.clearLine(process.stdout, 0);
    readline.cursorTo(process.stdout, 0);
    process.stdout.write(`${bar} ${percentStr} | [${i + 1}/${total}] Processing: ${indexName}`);

    const indexBody = generateIndexBody(
      config.mappings,
      config.maxFields,
      config.shards,
      config.replicas
    );

    try {
      await makeRequest('PUT', `/${indexName}`, indexBody);
      createdCount++;
    } catch (error) {
      if (error.isAlreadyExists) {
        skippedCount++;
        continue;
      }

      process.stdout.write('\n');
      console.error(`\n[FATAL] Failed while processing index ${indexName}:`, error.message);
      console.error(
        'Exiting due to a critical error. Please check your Elasticsearch cluster status and settings.'
      );
      process.exit(1);
    }
  }
  process.stdout.write('\n');
  console.log(
    `\nPopulation complete. Created: ${createdCount}, Skipped: ${skippedCount}, Total processed: ${
      createdCount + skippedCount
    }.`
  );
}

async function main() {
  if (config.cleanup) {
    await cleanupIndices();
  } else if (config.deleteByCount > 0) {
    await deleteIndicesByCount(config.deleteByCount);
  } else {
    await createIndices();
  }
}

main().catch((err) => {
  console.error('\nAn unexpected error occurred:', err.message);
  process.exit(1);
});

and for local dev, call it like so:

node populate_es.js --indices 4000 --mappings 4000

or for cloud clusters:

node populate_es.js --host https://kibana-pr-231376.es.us-west2.gcp.elastic-cloud.com/ --apiKey asdf== --indices 4000 --mappings 4000

and to clean up all the garbage it made:

node populate_es.js --cleanup --yes

and since I didn't add circuit breaker detection, if you go too far and ES/Kibana won't start, use this to start deleting chunks of indices till the system is healthy again:

node populate_es.js --delete-by-count 20

I tested this both locally and with the ci:cloud-deploy instance linked and all was well! 🎉

Index suggestions worked without issue, and field fetching continued to work as well (even saw those getting cached in the network panel, which is nice :).

When I pushed it to the cluster limits I was seeing issues everywhere else before I could even make it to the KB UI, so I think we're good here! 😅

@spong spong added v9.1.3 and removed v9.1.2 labels Aug 13, 2025
@spong spong merged commit 4496b50 into elastic:main Aug 13, 2025
16 checks passed
@spong spong deleted the index-entry-text-support branch August 13, 2025 22:33
@kibanamachine

Starting backport for target branches: 9.1

https://github.com/elastic/kibana/actions/runs/16950887524

@kibanamachine

💔 All backports failed

Status Branch Result
9.1 Backport failed because of merge conflicts

You might need to backport the following PRs to 9.1:
- [Fleet] Improve installation of bundled packages in airgapped environments (#230992)
- Upgrade inquirer (#231019)
- [scout] log --grep in test run command (#231649)
- [ska] relocate Security serverless api & functional tests (#231277)
- [Security Solution] Install mock prebuilt rules package in Cypress to reduce flakiness (#229689)
- [ska] relocate Observability serverless api & functional tests (#231256)

Manual backport

To create the backport manually run:

node scripts/backport --pr 231376

Questions ?

Please refer to the Backport tool documentation


spong commented Aug 13, 2025

💚 All backports created successfully

Status Branch Result
9.1

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

spong added a commit to spong/kibana that referenced this pull request Aug 13, 2025
… Base Index Entries in deployments with a large number of indices/mappings (elastic#231376)

## Summary

This PR fixes an issue with the Security Assistant KB Index Entries
interface introduced in `9.1` where a large number of indices/mappings
could result in Kibana crashing and preventing the creation of the Index
Entry.

This is technically a 'fix-hancement' as we are bypassing the underlying
issue altogether by switching to use common core API's for index/field
suggestions (just as is done in the Discover 'Create a data view'
interface), and in turn supporting all fields of type `text`, not just
`semantic_text` (elastic#230863).

### Original Issue
The underlying issue here was introduced by a change to the `field_caps`
API in `9.1` (elastic/elasticsearch#127664) that
resulted in the `/internal/elastic_assistant/knowledge_base/_indices`
route not finding any indices with a `semantic_text` field, and thus
inadvertently falling back to doing a full scan of all mappings
([source](https://github.com/elastic/kibana/blob/b128cee4ee7ccc367e8acf159dbf58a75f081867/x-pack/solutions/security/plugins/elastic_assistant/server/routes/knowledge_base/get_knowledge_base_indices.ts#L69)).
A fix to this API was initially investigated, but there was no
reasonable API available for fetching all occurrences of `semantic_text`
fields, so with `match` queries adding support for `semantic_text` in
`8.18`, it was decided to go ahead and enable support for all `text`
fields.

### Fix Details

The `Index` input field now uses the `dataViews.getIndices()` API for
suggestions (instead of the `useKnowledgeBaseIndices` hook/route), which
is backed by the [resolve indices ES
API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-resolve-index).
System indices are filtered out with the `*,-.*` filter. The initial
call will return all indices just as the Discover 'Create a data view'
interface, which is further filtered upon as the user continues to type.
Note: Discover makes subsequent calls upon further user input, though
I'm not entirely sure this is necessary here as all indices are
initially returned and available for client-side filtering within the
input. I will perform further stress testing with many indices/mappings
to confirm.

The `Field` input now uses the fields already queried for the `Output
fields` input suggestions (via `dataViews.getFieldsForWildcard()`), just
filtered to those whose `field.esTypes?.includes('text')`. This is
potentially still a hot path for client-side code with many mappings, so
I will also confirm this with further stress testing.

### Docs

@elastic/security-docs / @benironside, we will need to update the
Security Assistant KB docs
[here](https://www.elastic.co/docs/solutions/security/ai/ai-assistant-knowledge-base#knowledge-base-add-knowledge-index)
to indicate that any `text` field is now supported for retrieval.

### Testing

#### Functional Testing

To confirm proper retrieval of both `text` and `semantic_text` fields
we'll need to create an index, add a document, then create the KB Index
Entry. Then update the KB Index Entry to reference the other field type,
and test again.

<details><summary>Create Index</summary>
<p>

``` JSON
PUT project-details
{
  "mappings": {
    "properties": {
      "project_issue": {
        "type": "text"
      },
      "project_name": {
        "type": "text"
      },
      "summary": {
        "type": "semantic_text",
        "inference_id": ".elser-2-elasticsearch",
        "model_settings": {
          "service": "elasticsearch",
          "task_type": "sparse_embedding"
        }
      }
    }
  }
}
```
</p>
</details>

<details><summary>Create Sample Doc</summary>
<p>

``` JSON
PUT project-details/_doc/doc1
{
    "project_issue": "The main issue at hand is the breaking of the space plane",
    "project_name": "Issue 5",
    "summary": "This is a summary that contains the word yellow"
}
```
</p>
</details>

<details><summary>Create KB Index Entry (Text)</summary>
<p>

``` JSON
POST kbn:api/security_ai_assistant/knowledge_base/entries
{
  "type": "index",
  "name": "Project Details Tool",
  "index": "project-details",
  "field": "project_issue",
  "outputFields": [],
  "description": "Use this index to answer questions about any project details.",
  "queryDescription": "Key terms to search for from the user's prompt."
}
```
</p>
</details>

Now you can open the Assistant and perform a query like:
```
Do I have any project details about the issue at hand?
```

Which should result in a [trace like
this](https://smith.langchain.com/public/208338a6-42f3-4a9d-bc4c-0ff38d06d34c/r)
which calls the generated tool that will then perform a _lexical search_
against the configured index. Ensure citations work as expected for the
returned document.

Now open the Index Entry in the KB Settings UI and change the `field` to
from `project_issue` to `summary` for testing `semantic_text`.

Open the Assistant and perform a query like:
```
Do I have any project details for Project Yellow?
```

This should produce a [trace like
this](https://smith.langchain.com/public/3041a48b-5dfd-4c89-9f51-0bcdffd38f63/r)
in which the generated tool is called and performs a _semantic
search_ against the configured index. Ensure citations work as expected
for the returned document.
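
The PR does not show the request the generated tool sends. Going only by the note that `match` queries support `semantic_text` as of `8.18`, one plausible shape is a single `match` query whose target field decides whether the search is lexical or semantic. `buildMatchQuery` is a hypothetical name used for illustration, and the real tool's query may differ.

```typescript
// Hypothetical builder (an assumption, not code from this PR).
function buildMatchQuery(field: string, queryText: string) {
  return {
    query: {
      match: {
        // The target field is the only thing that changes between the two tests.
        [field]: queryText,
      },
    },
  };
}

// Same query shape for the `text` field and the `semantic_text` field.
const lexicalQuery = buildMatchQuery('project_issue', 'issue at hand');
const semanticQuery = buildMatchQuery('summary', 'Project Yellow');
console.log(JSON.stringify(lexicalQuery));
console.log(JSON.stringify(semanticQuery));
```

If this assumption holds, switching the Index Entry's `field` is enough to move between the two search modes without changing the entry's other settings.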

#### Performance Testing

⚠️ In progress -- I will include a script for generating many
indices/mappings for testing, and also prepare the `ci-cloud-deploy`
instance with the same setup for confirmation.
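
The script mentioned above is not included here. Purely as an assumption about what such a generator might look like, the sketch below builds create-index request bodies with many `text` fields each. The index prefix, the counts, and the function name are all made up for illustration, and actually sending the requests is left out.

```typescript
interface CreateIndexRequest {
  index: string;
  body: { mappings: { properties: Record<string, { type: string }> } };
}

// Hypothetical generator: names and sizes are illustrative only.
function buildStressIndices(indexCount: number, fieldsPerIndex: number): CreateIndexRequest[] {
  const requests: CreateIndexRequest[] = [];
  for (let i = 0; i < indexCount; i++) {
    const properties: Record<string, { type: string }> = {};
    for (let f = 0; f < fieldsPerIndex; f++) {
      // Every generated field is a plain `text` field.
      properties[`field_${f}`] = { type: 'text' };
    }
    requests.push({ index: `kb-stress-${i}`, body: { mappings: { properties } } });
  }
  return requests;
}

const stressRequests = buildStressIndices(3, 2);
console.log(stressRequests.map((r) => r.index));
```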

### Checklist

Check that the PR satisfies the following conditions.

Reviewers should verify this PR satisfies this list as well.

- [X] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
  * Will coordinate with @elastic/security-docs on the docs update here.
- [X] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine <[email protected]>
(cherry picked from commit 4496b50)

# Conflicts:
#	x-pack/platform/plugins/private/translations/translations/de-DE.json
spong added a commit that referenced this pull request Aug 14, 2025
…wledge Base Index Entries in deployments with a large number of indices/mappings (#231376) (#231717)

# Backport

This will backport the following commits from `main` to `9.1`:
- [[Security Assistant] Fixes issue preventing the creation of Knowledge
Base Index Entries in deployments with a large number of
indices/mappings
(#231376)](#231376)

<!--- Backport version: 10.0.1 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)

fkanout pushed a commit to fkanout/kibana that referenced this pull request Aug 14, 2025
… Base Index Entries in deployments with a large number of indices/mappings (elastic#231376)

spong added a commit that referenced this pull request Aug 15, 2025
…#231725)

## Summary

This is a follow-up to #231376
where I added new `docLinks` for the Security Solution AI Assistant.
This PR removes the `TODO` added in that PR and creates a new
`securitySolution.aiAssistant` grouping so we can more easily add
docLinks as needed.

Reviewers Note: Sorry for the extra noise here -- I thought there were
more references so decided to do in another PR. Turns out there were
not, so this is a quick one!
spong added a commit to spong/kibana that referenced this pull request Aug 15, 2025
…elastic#231725)

(cherry picked from commit 90470cf)

# Conflicts:
#	src/platform/plugins/shared/ai_assistant_management/selection/public/routes/components/ai_assistant_selection_page.tsx
NicholasPeretti pushed a commit to NicholasPeretti/kibana that referenced this pull request Aug 18, 2025
… Base Index Entries in deployments with a large number of indices/mappings (elastic#231376)

## Summary

This PR fixes an issue with the Security Assistant KB Index Entries
interface introduced in `9.1` where a large number of indices/mappings
could result in Kibana crashing and preventing the creation of the Index
Entry.

This is technically a 'fix-hancement' as we are bypassing the underlying
issue altogether by switching to use common core APIs for index/field
suggestions (just as is done in the Discover 'Create a data view'
interface), and in turn supporting all fields of type `text`, not just
`semantic_text` (elastic#230863).

### Original Issue
The underlying issue here was introduced by a change to the `field_caps`
API in `9.1` (elastic/elasticsearch#127664) that
resulted in the `/internal/elastic_assistant/knowledge_base/_indices`
route not finding any indices with a `semantic_text` field, and thus
inadvertently falling back to doing a full scan of all mappings
([source](https://github.com/elastic/kibana/blob/b128cee4ee7ccc367e8acf159dbf58a75f081867/x-pack/solutions/security/plugins/elastic_assistant/server/routes/knowledge_base/get_knowledge_base_indices.ts#L69)).
A fix to this API was initially investigated, but there was no
reasonable API available for fetching all occurrences of `semantic_text`
fields, so with `match` queries adding support for `semantic_text` in
`8.18`, it was decided to go ahead and enable support for all `text`
fields.


### Fix Details

The `Index` input field now uses the `dataViews.getIndices()` API for
suggestions (instead of the `useKnowledgeBaseIndices` hook/route), which
is backed by the [resolve indices ES
API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-resolve-index).
System indices are filtered out with the `*,-.*` filter. The initial
call returns all indices, just as in the Discover 'Create a data view'
interface, and the list is filtered further as the user continues to type.
Note: Discover makes subsequent calls upon further user input, though
I'm not entirely sure this is necessary here as all indices are
initially returned and available for client-side filtering within the
input. I will perform further stress testing with many indices/mappings
to confirm.
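
A minimal sketch of the client-side filtering described above, assuming
the full list of index names has already been fetched. The helper name
`filterIndexSuggestions` is hypothetical and is not part of this PR or of
the `dataViews` API; the leading-dot check only mirrors the intent of the
`*,-.*` pattern.

```typescript
// Hypothetical helper for illustration only; not the Kibana implementation.
function filterIndexSuggestions(allIndices: string[], query: string): string[] {
  const needle = query.trim().toLowerCase();
  return allIndices
    // Drop dot-prefixed (system/hidden) indices, mirroring the `*,-.*` pattern.
    .filter((name) => !name.startsWith('.'))
    // An empty query keeps everything; otherwise match on a case-insensitive substring.
    .filter((name) => needle === '' || name.toLowerCase().includes(needle))
    .sort();
}

const suggestions = filterIndexSuggestions(['.kibana', 'project-details', 'logs-1'], 'proj');
console.log(suggestions);
```

Filtering in memory like this avoids a request per keystroke, at the cost
of holding every index name on the client.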

The `Field` input now uses the fields already queried for the `Output
fields` input suggestions (via `dataViews.getFieldsForWildcard()`), just
filtered to those whose `field.esTypes?.includes('text')`. This is
potentially still a hot path for client-side code with many mappings, so
I will also confirm this with further stress testing.
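
As a rough illustration of the field filter above, here is a sketch that
keeps only fields whose `esTypes` include `text`. The `FieldSuggestion`
shape and the `getTextFields` name are assumptions made for this example;
only the `field.esTypes?.includes('text')` check comes from this PR.

```typescript
// Assumed, simplified field shape; the real field spec has more properties.
interface FieldSuggestion {
  name: string;
  esTypes?: string[];
}

// Hypothetical helper: keep only fields reported with an ES type of `text`.
function getTextFields(fields: FieldSuggestion[]): FieldSuggestion[] {
  // Fields without `esTypes` are treated as non-matching.
  return fields.filter((field) => field.esTypes?.includes('text') ?? false);
}

const textFields = getTextFields([
  { name: 'project_issue', esTypes: ['text'] },
  { name: 'status', esTypes: ['keyword'] },
  { name: 'unmapped' },
]);
console.log(textFields.map((field) => field.name));
```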


### Docs

@elastic/security-docs / @benironside, we will need to update the
Security Assistant KB docs
[here](https://www.elastic.co/docs/solutions/security/ai/ai-assistant-knowledge-base#knowledge-base-add-knowledge-index)
to indicate that any `text` field is now supported for retrieval.


### Testing

#### Functional Testing

To confirm proper retrieval of both `text` and `semantic_text` fields
we'll need to create an index, add a document, then create the KB Index
Entry. Then update the KB Index Entry to reference the other field type,
and test again.

<details><summary>Create Index</summary>
<p>

``` JSON
PUT project-details
{
  "mappings": {
    "properties": {
      "project_issue": {
        "type": "text"
      },
      "project_name": {
        "type": "text"
      },
      "summary": {
        "type": "semantic_text",
        "inference_id": ".elser-2-elasticsearch",
        "model_settings": {
          "service": "elasticsearch",
          "task_type": "sparse_embedding"
        }
      }
    }
  }
}
```
</p>
</details> 

<details><summary>Create Sample Doc</summary>
<p>

``` JSON
PUT project-details/_doc/doc1
{
    "project_issue": "The main issue at hand is the breaking of the space plane",
    "project_name": "Issue 5",
    "summary": "This is a summary that contains the word yellow"
}
```
</p>
</details> 

<details><summary>Create KB Index Entry (Text)</summary>
<p>

``` JSON
POST kbn:api/security_ai_assistant/knowledge_base/entries
{
  "type": "index",
  "name": "Project Details Tool",
  "index": "project-details",
  "field": "project_issue",
  "outputFields": [],
  "description": "Use this index to answer questions about any project details.",
  "queryDescription": "Key terms to search for from the user's prompt."
}
```
</p>
</details> 

Now you can open the Assistant and perform a query like:
```
Do I have any project details about the issue at hand?
```

Which should result in a [trace like
this](https://smith.langchain.com/public/208338a6-42f3-4a9d-bc4c-0ff38d06d34c/r)
which calls the generated tool that will then perform a _lexical search_
against the configured index. Ensure citations work as expected for the
returned document.

Now open the Index Entry in the KB Settings UI and change the `field`
from `project_issue` to `summary` to test `semantic_text`.

Open the Assistant and perform a query like:
```
Do I have any project details for Project Yellow?
```

Which should result in a [trace like
this](https://smith.langchain.com/public/3041a48b-5dfd-4c89-9f51-0bcdffd38f63/r)
which calls the generated tool that will then perform a _semantic
search_ against the configured index. Ensure citations work as expected
for the returned document.







#### Performance Testing

⚠️ In progress -- I will include a script for generating many
indices/mappings for testing, and also prepare the `ci-cloud-deploy`
instance with the same setup for confirmation.


### Checklist

Check that the PR satisfies the following conditions.

Reviewers should verify this PR satisfies this list as well.

- [X] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
  * Will coordinate with @elastic/security-docs on the docs update here.
- [X] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine <[email protected]>
NicholasPeretti pushed a commit to NicholasPeretti/kibana that referenced this pull request Aug 18, 2025
…elastic#231725)

spong added a commit that referenced this pull request Aug 22, 2025
…tions (#231904)

## Summary

Small follow-up improvement to
#231376 which added support for
`text` fields to Index Entries. This PR adds the field type as a badge
in the suggestions so users will know if a semantic or lexical search
will be performed (so they can adapt the query instructions
accordingly).


Note: Needed to update the field API request from
`dataViews.getFieldsForWildcard` (which called
`/internal/data_views/_fields_for_wildcard`) to use
`/api/index_management/mapping/[indexName]` as the former did not have
the option to include field type. I confirmed no new privileges were
necessary for this API, and the user just needs the same index
privileges as before.
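
To illustrate how a field type could drive the badge, here is a minimal
sketch that walks a mapping `properties` object and labels each `text` or
`semantic_text` field with the kind of search it implies. The types and
the `toFieldOptions` helper are hypothetical, and the input is assumed to
be a plain `properties` object rather than the exact response body of the
mapping route.

```typescript
// Assumed, simplified mapping node; real mappings carry many more options.
type MappingProperty = {
  type?: string;
  properties?: Record<string, MappingProperty>;
};

interface FieldOption {
  name: string;
  type: string;
  searchMode: 'semantic' | 'lexical';
}

// Hypothetical helper: collect text-like fields, including nested object fields.
function toFieldOptions(
  properties: Record<string, MappingProperty>,
  prefix = ''
): FieldOption[] {
  const options: FieldOption[] = [];
  for (const [key, prop] of Object.entries(properties)) {
    const name = prefix ? `${prefix}.${key}` : key;
    const type = prop.type ?? '';
    if (type === 'text' || type === 'semantic_text') {
      options.push({
        name,
        type,
        searchMode: type === 'semantic_text' ? 'semantic' : 'lexical',
      });
    }
    if (prop.properties) {
      // Recurse into object fields, building dotted field names.
      options.push(...toFieldOptions(prop.properties, name));
    }
  }
  return options;
}

const options = toFieldOptions({
  project_issue: { type: 'text' },
  summary: { type: 'semantic_text' },
  owner: { properties: { bio: { type: 'text' }, id: { type: 'keyword' } } },
});
console.log(options);
```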

cc @jamesspi 

Field Options:
<p align="center">
<img width="500"
src="https://github.com/user-attachments/assets/f138c7f0-1d89-4946-8d27-fa6c9c49c60b"
/>
</p> 

Output Field Options:
<p align="center">
<img width="500"
src="https://github.com/user-attachments/assets/2b0395e5-d71d-43af-8a23-9bacc4b02b54"
/>
</p> 


---

As part of this PR I've also included the helper script from
#231376 for testing these large
index/mapping scenarios. This script was almost entirely written in a
collab session with `gemini-cli`, and is located in:

> x-pack/solutions/security/plugins/elastic_assistant/scripts 

Options include:

``` bash
    Elasticsearch Index/Mapping Populator and Cleanup Script

    Usage:
      node stress_test_mappings.js [options]
      node stress_test_mappings.js --cleanup
      node stress_test_mappings.js --delete-by-count <number>

    Description:
      This script stress-tests an Elasticsearch instance by creating a large number
      of indices with many fields. It can also clean up the indices it creates.

    Creation Options:
      --host <url>          Elasticsearch host URL (default: http://localhost:9200)
      --user <username>     Username for basic auth (default: elastic)
      --pass <password>     Password for basic auth (default: changeme)
      --apiKey <key>        API key for authentication (overrides user/pass)
      --indices <number>    Number of indices to create (default: 5000)
      --mappings <number>   Number of mappings per index (default: 5000)
      --maxFields <number>  The max number of fields per index (default: same as --mappings)
      --shards <number>     Number of primary shards per index (default: 1)
      --replicas <number>   Number of replicas per index (default: 0)

    Cleanup & Recovery Options:
      --cleanup             Delete all indices created by this script.
      --delete-by-count <N> Delete the <N> newest stress-test indices.
      --yes                 Bypass confirmation prompt during cleanup.

    Other Options:
      -h, --help            Show this help message
```
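
For a sense of what such a script sends per index, here is a minimal
sketch that builds a create-index body with a configurable number of
fields. It is not the actual `stress_test_mappings.js` code: the
`buildStressMapping` name, the `field_N` naming, and the alternating
`text`/`keyword` types are assumptions made for illustration.

```typescript
interface StressIndexBody {
  mappings: { properties: Record<string, { type: string }> };
}

// Hypothetical helper: generate a mapping with `fieldCount` top-level fields.
function buildStressMapping(fieldCount: number): StressIndexBody {
  const properties: Record<string, { type: string }> = {};
  for (let i = 0; i < fieldCount; i++) {
    // Alternate types so that only some fields qualify as `text` suggestions.
    properties[`field_${i}`] = { type: i % 2 === 0 ? 'text' : 'keyword' };
  }
  return { mappings: { properties } };
}

const body = buildStressMapping(4);
console.log(JSON.stringify(body, null, 2));
```

Note that Elasticsearch's `index.mapping.total_fields.limit` setting
defaults to 1000, so a body with more fields than that would presumably
need the limit raised at index creation.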


Some example executions follow. First, `cd` into the assistant
working directory:

```
cd x-pack/solutions/security/plugins/elastic_assistant/
```

##### Populate your local ES -- defaults to 5000 indices and 5000 mappings _per_ index. This _will cause_ a default local ES to crash, so stop early (~569), or change configuration :)
``` bash
yarn stress-test-mappings 
```

##### If your ES is at its limits, you can slowly dial back the index count with the following:
``` bash
yarn stress-test-mappings --delete-by-count 50 --yes
```

##### Or clean up all the indices you created entirely with:
``` bash
yarn stress-test-mappings --cleanup --yes
```

##### And for a cloud install, create an API key and populate with the following:
``` bash
yarn stress-test-mappings --host https://stress-test.es.us-west2.gcp.elastic-cloud.com --apiKey API_KEY_HERE
```

> [!IMPORTANT]
> This is a quick utility script and may be buggy! Continue to vibe code
it as you see fit, but it worked for my needs here for testing and
validating this issue and fix 🙂




### Checklist

- [X] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine <[email protected]>
kibanamachine added a commit to kibanamachine/kibana that referenced this pull request Aug 22, 2025
…tions (elastic#231904)

Co-authored-by: kibanamachine <[email protected]>
(cherry picked from commit 39a6983)
kibanamachine added a commit that referenced this pull request Aug 22, 2025
…suggestions (#231904) (#232674)

# Backport

This will backport the following commits from `main` to `9.1`:
- [[Security Assistant] Add field type badge to Index Entry field
suggestions (#231904)](#231904)

<!--- Backport version: 9.6.6 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)


Co-authored-by: Garrett Spong <[email protected]>
qn895 pushed a commit to qn895/kibana that referenced this pull request Aug 26, 2025
… Base Index Entries in deployments with a large number of indices/mappings (elastic#231376)

## Summary

This PR fixes an issue with the Security Assistant KB Index Entries
interface introduced in `9.1` where a large number of indices/mappings
could result in Kibana crashing and preventing the creation of the Index
Entry.

This is technically a 'fix-hancement' as we are bypassing the underlying
issue altogether by switching to use common core API's for index/field
suggestions (just as is done in the Discover 'Create a data view'
interface), and in turn supporting all fields of type `text`, not just
`semantic_text` (elastic#230863).

### Original Issue
The underlying issue here was introduced by a change to the `field_caps`
API in `9.1` (elastic/elasticsearch#127664) that
resulted in the `/internal/elastic_assistant/knowledge_base/_indices`
route not finding any indices with a `semantic_text` field, and thus
inadvertently falling back to doing a full scan of all mappings
([source](https://github.com/elastic/kibana/blob/b128cee4ee7ccc367e8acf159dbf58a75f081867/x-pack/solutions/security/plugins/elastic_assistant/server/routes/knowledge_base/get_knowledge_base_indices.ts#L69)).
A fix to this API was initially investigated, but there was no
reasonable API available for fetching all occurrences of `semantic_text`
fields, so with `match` queries adding support for `semantic_text` in
`8.18`, it was decided to go ahead and enable support for all `text`
fields.


### Fix Details

The `Index` input field now uses the `dataViews.getIndices()` API for
suggestions (instead of the `useKnowledgeBaseIndices` hook/route), which
is backed by the [resolve indices ES
API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-resolve-index).
System indices are filtered out with the `*,-.*` filter. The initial
call will return all indices just as the Discover 'Create a data view'
interface, which is further filtered upon as the user continues to type.
Note: Discover makes subsequent calls upon further user input, though
I'm not entirely sure this is necessary here as all indices are
initially returned and available for client-side filtering within the
input. I will perform further stress testing with many indices/mappings
to confirm.

The `Field` input now uses the fields already queried for the `Output
fields` input suggestions (via `dataViews.getFieldsForWildcard()`), just
filtered to those whose `field.esTypes?.includes('text')`. This is
potentially still a hot path for client-side code with many mappings, so
I will also confirm this with further stress testing.


### Docs

@elastic/security-docs / @benironside, we will need to update the
Security Assistant KB docs
[here](https://www.elastic.co/docs/solutions/security/ai/ai-assistant-knowledge-base#knowledge-base-add-knowledge-index)
to indicate that any `text` field is now supported for retrieval.


### Testing

#### Functional Testing

To confirm proper retrieval of both `text` and `semantic_text` fields
we'll need to create an index, add a document, then create the KB Index
Entry. Then update the KB Index Entry to reference the other field type,
and test again.

<details><summary>Create Index</summary>
<p>

``` JSON
PUT project-details
{
  "mappings": {
    "properties": {
      "project_issue": {
        "type": "text"
      },
      "project_name": {
        "type": "text"
      },
      "summary": {
        "type": "semantic_text",
        "inference_id": ".elser-2-elasticsearch",
        "model_settings": {
          "service": "elasticsearch",
          "task_type": "sparse_embedding"
        }
      }
    }
  }
}
```
</p>
</details> 

<details><summary>Create Sample Doc</summary>
<p>

``` JSON
PUT project-details/_doc/doc1
{
    "project_issue": "The main issue at hand is the breaking of the space plane",
    "project_name": "Issue 5",
    "summary": "This is a summary that contains the word yellow"
}
```
</p>
</details> 

<details><summary>Create KB Index Entry (Text)</summary>
<p>

``` JSON
POST kbn:api/security_ai_assistant/knowledge_base/entries
{
  "type": "index",
  "name": "Project Details Tool",
  "index": "project-details",
  "field": "project_issue",
  "outputFields": [],
  "description": "Use this index to answer questions about any project details.",
  "queryDescription": "Key terms to search for from the user's prompt."
}
```
</p>
</details> 

Now you can open the Assistant and perform a query like:
```
Do I have any project details about the issue at hand?
```

This should result in a [trace like
this](https://smith.langchain.com/public/208338a6-42f3-4a9d-bc4c-0ff38d06d34c/r)
that calls the generated tool, which then performs a _lexical search_
against the configured index. Ensure citations work as expected for the
returned document.

Now open the Index Entry in the KB Settings UI and change the `field`
from `project_issue` to `summary` to test `semantic_text`.

Open the Assistant and perform a query like:
```
Do I have any project details for Project Yellow?
```

This should result in a [trace like
this](https://smith.langchain.com/public/3041a48b-5dfd-4c89-9f51-0bcdffd38f63/r)
that calls the generated tool, which then performs a _semantic
search_ against the configured index. Ensure citations work as expected
for the returned document.
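
Since `match` queries accept both `text` and `semantic_text` fields, one request shape can cover both tests. The sketch below only illustrates that shape: `buildMatchQuery` is a hypothetical helper, and the query the generated tool actually sends may differ.

```typescript
// Builds a minimal Elasticsearch `match` query body for the given field.
// Hypothetical helper for illustration; not taken from the assistant's code.
function buildMatchQuery(field: string, queryText: string) {
  return { query: { match: { [field]: queryText } } };
}

// Same builder, two field types: `project_issue` is `text`, `summary` is `semantic_text`.
console.log(JSON.stringify(buildMatchQuery('project_issue', 'issue at hand')));
console.log(JSON.stringify(buildMatchQuery('summary', 'Project Yellow')));
```

Either body could be sent to `project-details/_search` to spot-check which documents each field type returns.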







#### Performance Testing

⚠️ In progress -- I will include a script for generating many
indices/mappings for testing, and also prepare the `ci-cloud-deploy`
instance with the same setup for confirmation.


### Checklist

Check that the PR satisfies the following conditions.

Reviewers should verify this PR satisfies this list as well.

- [X] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
  * Will coordinate with @elastic/security-docs on the docs update here.
- [X] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine <[email protected]>
qn895 pushed a commit to qn895/kibana that referenced this pull request Aug 26, 2025
…elastic#231725)

## Summary

This is a follow-up to elastic#231376
where I added new `docLinks` for the Security Solution AI Assistant.
This PR removes the `TODO` added in that PR and creates a new
`securitySolution.aiAssistant` grouping so we can more easily add
docLinks as needed.

Reviewers Note: Sorry for the extra noise here -- I thought there were
more references so decided to do in another PR. Turns out there were
not, so this is a quick one!
qn895 pushed a commit to qn895/kibana that referenced this pull request Aug 26, 2025
…tions (elastic#231904)

## Summary

Small follow-up improvement to
elastic#231376 which added support for
`text` fields to Index Entries. This PR adds the field type as a badge
in the suggestions so users will know if a semantic or lexical search
will be performed (so they can adapt the query instructions
accordingly).


Note: Needed to update the field API request from
`dataViews.getFieldsForWildcard` (which called
`/internal/data_views/_fields_for_wildcard`) to use
`/api/index_management/mapping/[indexName]` as the former did not have
the option to include field type. I confirmed no new privileges were
necessary for this API, and the user just needs the same index
privileges as before.
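
For illustration, here is one way field types could be derived from a mapping for the badges. This is a sketch under assumptions: it presumes the response exposes an Elasticsearch-style `properties` object, and `MappingProperty`, `flattenMappingProperties`, and `searchKindFor` are hypothetical names rather than code from this PR.

```typescript
// Assumed (simplified) shape of one entry under `mappings.properties`.
interface MappingProperty {
  type?: string;
  properties?: Record<string, MappingProperty>;
}

interface FieldOption {
  name: string;
  type: string;
}

// Walks nested object mappings and returns dotted field names with their types.
function flattenMappingProperties(
  properties: Record<string, MappingProperty>,
  prefix = ''
): FieldOption[] {
  const options: FieldOption[] = [];
  for (const [key, prop] of Object.entries(properties)) {
    const name = prefix ? `${prefix}.${key}` : key;
    if (prop.properties) {
      options.push(...flattenMappingProperties(prop.properties, name));
    } else if (prop.type) {
      options.push({ name, type: prop.type });
    }
  }
  return options;
}

// Maps a field type to the kind of search a user should expect, if any.
function searchKindFor(type: string): 'semantic' | 'lexical' | undefined {
  if (type === 'semantic_text') return 'semantic';
  if (type === 'text') return 'lexical';
  return undefined;
}

const fieldOptions = flattenMappingProperties({
  project_issue: { type: 'text' },
  summary: { type: 'semantic_text' },
});
console.log(fieldOptions.map((f) => `${f.name}: ${searchKindFor(f.type)}`));
```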

cc @jamesspi 

Field Options:
<p align="center">
<img width="500"
src="https://github.com/user-attachments/assets/f138c7f0-1d89-4946-8d27-fa6c9c49c60b"
/>
</p> 

Output Field Options:
<p align="center">
<img width="500"
src="https://github.com/user-attachments/assets/2b0395e5-d71d-43af-8a23-9bacc4b02b54"
/>
</p> 


---

As part of this PR I've also included the helper script from
elastic#231376 for testing these large
index/mapping scenarios. This script was almost entirely written in a
collab session with `gemini-cli`, and is located in:

> x-pack/solutions/security/plugins/elastic_assistant/scripts 

Options include:

``` bash
    Elasticsearch Index/Mapping Populator and Cleanup Script

    Usage:
      node stress_test_mappings.js [options]
      node stress_test_mappings.js --cleanup
      node stress_test_mappings.js --delete-by-count <number>

    Description:
      This script stress-tests an Elasticsearch instance by creating a large number
      of indices with many fields. It can also clean up the indices it creates.

    Creation Options:
      --host <url>          Elasticsearch host URL (default: http://localhost:9200)
      --user <username>     Username for basic auth (default: elastic)
      --pass <password>     Password for basic auth (default: changeme)
      --apiKey <key>        API key for authentication (overrides user/pass)
      --indices <number>    Number of indices to create (default: 5000)
      --mappings <number>   Number of mappings per index (default: 5000)
      --maxFields <number>  The max number of fields per index (default: same as --mappings)
      --shards <number>     Number of primary shards per index (default: 1)
      --replicas <number>   Number of replicas per index (default: 0)

    Cleanup & Recovery Options:
      --cleanup             Delete all indices created by this script.
      --delete-by-count <N> Delete the <N> newest stress-test indices.
      --yes                 Bypass confirmation prompt during cleanup.

    Other Options:
      -h, --help            Show this help message
```
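
To give a sense of the kind of request body such a populator might send per index, here is a minimal sketch. It is not the script's implementation: `buildStressMapping` and the `field_N` naming are hypothetical. It also assumes the field limit is raised through the `index.mapping.total_fields.limit` setting, since Elasticsearch caps fields per index at 1000 by default.

```typescript
// Builds a create-index body with `fieldCount` generated fields.
// Hypothetical helper; alternates text/keyword so both field kinds are present.
function buildStressMapping(fieldCount: number, shards = 1, replicas = 0) {
  const properties: Record<string, { type: string }> = {};
  for (let i = 0; i < fieldCount; i++) {
    properties[`field_${i}`] = { type: i % 2 === 0 ? 'text' : 'keyword' };
  }
  return {
    settings: {
      number_of_shards: shards,
      number_of_replicas: replicas,
      // Raise the per-index field limit so the mapping is accepted.
      'index.mapping.total_fields.limit': fieldCount,
    },
    mappings: { properties },
  };
}

const body = buildStressMapping(4);
console.log(Object.keys(body.mappings.properties).length); // logs 4
```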


Some example executions follow. First `cd` into the assistant
working directory:

```
cd x-pack/solutions/security/plugins/elastic_assistant/
```

##### Populate your local ES

Defaults to 5000 indices and 5000 mappings _per_ index. This _will cause_
a default local ES to crash, so stop early (~569), or change
configuration :)
``` bash
yarn stress-test-mappings 
```

##### If your ES is at its limits, you can slowly dial back the index count with the following:
``` bash
yarn stress-test-mappings --delete-by-count 50 --yes
```

##### Or clean up all the indices you created entirely with:
``` bash
yarn stress-test-mappings --cleanup --yes
```

##### And for a cloud install, create an API key and populate with the following:
``` bash
yarn stress-test-mappings --host https://stress-test.es.us-west2.gcp.elastic-cloud.com --apiKey API_KEY_HERE
```

> [!IMPORTANT]
> This is a quick utility script and may be buggy! Continue to vibe code
it as you see fit, but it worked for my needs here for testing and
validating this issue and fix 🙂




### Checklist

- [X] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine <[email protected]>
KodeRad pushed a commit to KodeRad/kibana that referenced this pull request Aug 28, 2025
…tions (elastic#231904)


Labels

backport:version Backport to applied version labels bug Fixes for quality problems that affect the customer experience ci:cloud-deploy Create or update a Cloud deployment needs_docs release_note:fix sdh-linked Team:Security Generative AI Security Generative AI v9.1.3 v9.2.0
