[Obs AI] Replace `get_data_sources` with `get_index_info` tool #248234
sorenlouv merged 20 commits into elastic:main
Pinging @elastic/obs-presentation-team (Team:obs-presentation)
cauemarcondes left a comment
Obs exploration changes LGTM
    const dataSources = await getObservabilityDataSources({ core, plugins, logger });

    // Discover data streams using the configured patterns
    const dataStreams = await getDataStreamsHandler({ esClient, dataSources });
Should this be wrapped in a try/catch so that even if fetching dataStreams fails, we can at least return the indices instead of throwing an error?
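A minimal sketch of the fallback this comment asks about, assuming the data stream lookup can be passed in as a function. The names `getDataStreamsSafely`, `DataStream`, and `MinimalLogger` are hypothetical and do not come from the PR; the real handler's types are not shown in this thread.

```typescript
// Hypothetical shapes for illustration only.
interface DataStream {
  name: string;
}

interface MinimalLogger {
  warn(message: string): void;
}

// Runs the data stream lookup and degrades to an empty list on failure,
// so the caller can still return the index patterns it already resolved.
async function getDataStreamsSafely(
  fetchDataStreams: () => Promise<DataStream[]>,
  logger: MinimalLogger
): Promise<DataStream[]> {
  try {
    return await fetchDataStreams();
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    logger.warn(`Failed to fetch data streams, returning indices only: ${reason}`);
    return [];
  }
}
```

With a wrapper like this, the overview could still list the configured indices when data stream discovery fails, at the cost of silently returning a partial result, which is why the sketch logs a warning.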
gbamparop left a comment
Codeowner changes for the onboarding team LGTM!
Starting backport for target branches: 9.3 https://github.com/elastic/kibana/actions/runs/21011849521
💛 Build succeeded, but was flaky
💔 All backports failed
Questions? Please refer to the Backport tool documentation.
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI. Questions? Please refer to the Backport tool documentation.
…ic#248234)

Closes elastic/obs-ai-team#455

This PR introduces the `get_index_info` tool which replaces `get_data_sources` and adds field discovery capabilities. This is similar to the `get_dataset_info` tool we have for Obs AI Assistant.

### What it does

The tool has three operations:

**`get_index_info({ operation: "get-overview" })`**
Returns the same data sources as `get_data_sources` (APM indices, logs, metrics, alerts) plus a list of curated observability fields that exist in the cluster. Each field includes a `schema` indicator (`ecs`, `otel`, or `internal`).

**`get_index_info({ operation: "list-fields", index, start?, end?, kqlFilter?, intent? })`**
Returns fields with actual data. If the LLM specifies an `intent` and there are >100 fields, we filter them using a model to just the relevant ones.

**`get_index_info({ operation: "get-field-values", index, fields })`**
Returns field values:
- Distinct values for keyword fields
- Min/max ranges for numeric and date fields

### Purpose of tool

The LLM needs to know what fields exist in the user's cluster before building queries. Without this, it guesses field names which leads to invalid filters and confusing errors. This is especially important because customers can use different schemas (ECS vs OTel).
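The three operations described above can be sketched as a discriminated union, together with the two behaviors the description mentions: the ">100 fields" intent filter and the per-type value summaries. This is an illustration under assumptions, not the tool's actual code. All type and function names here are hypothetical, `fields` accepting either a string or an array is a guess based on the manual testing example, and the in-memory summaries only stand in for whatever the tool actually queries in Elasticsearch.

```typescript
// Hypothetical parameter shapes inferred from the PR description.
type GetIndexInfoParams =
  | { operation: 'get-overview' }
  | {
      operation: 'list-fields';
      index: string;
      start?: string;
      end?: string;
      kqlFilter?: string;
      intent?: string;
    }
  | { operation: 'get-field-values'; index: string; fields: string | string[] };

// Threshold taken from the description (">100 fields").
const INTENT_FILTER_THRESHOLD = 100;

// Model-based filtering applies only when an intent is given
// and the field list exceeds the threshold.
function shouldFilterByIntent(fieldCount: number, intent?: string): boolean {
  return Boolean(intent) && fieldCount > INTENT_FILTER_THRESHOLD;
}

// Keyword fields: distinct values (sorted here for stable output).
function summarizeKeyword(samples: string[]): string[] {
  return Array.from(new Set(samples)).sort();
}

// Numeric and date fields: a min/max range, or undefined when there is no data.
function summarizeNumeric(samples: number[]): { min: number; max: number } | undefined {
  if (samples.length === 0) {
    return undefined;
  }
  return { min: Math.min(...samples), max: Math.max(...samples) };
}

// Example request matching the description's third operation.
const exampleParams: GetIndexInfoParams = {
  operation: 'get-field-values',
  index: 'metrics-*',
  fields: 'host.name',
};
```

Modelling the operations as a union keeps the required parameters of each operation explicit, so a caller cannot request `get-field-values` without an `index`.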
## Manual testing

Run the synthtrace scenario to populate observability indices with test data:

```bash
node scripts/synthtrace \
  src/platform/packages/shared/kbn-synthtrace/src/scenarios/agent_builder/tools/get_index_info/curated_fields.ts \
  --from "now-15m" --to "now" --clean --workers=1
```

### Execute tool to get overview

```
POST kbn:///api/agent_builder/tools/_execute
{
  "tool_id": "observability.get_index_info",
  "tool_params": { "operation": "get-overview" }
}
```

### Execute tool to get field values

```
POST kbn:///api/agent_builder/tools/_execute
{
  "tool_id": "observability.get_index_info",
  "tool_params": { "operation": "get-field-values", "index": "metrics-*", "fields": "host.name" }
}
```

(cherry picked from commit b6be8eb)

# Conflicts:
#	src/platform/packages/shared/kbn-synthtrace/src/scenarios/agent_builder/index.ts
#	x-pack/solutions/observability/plugins/observability_agent_builder/server/tools/get_data_sources/tool.ts
#	x-pack/solutions/observability/plugins/observability_agent_builder/server/tools/get_services/tool.ts
#	x-pack/solutions/observability/plugins/observability_agent_builder/server/tools/index.ts
#	x-pack/solutions/observability/test/api_integration_deployment_agnostic/apis/observability_agent_builder/index.ts
#	x-pack/solutions/observability/test/api_integration_deployment_agnostic/apis/observability_agent_builder/tools/get_data_sources.spec.ts
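The Dev Console requests in the manual testing steps could also be issued from a script. The sketch below only builds the request; the helper name `buildToolExecuteRequest`, the base URL, and the omission of authentication are assumptions for illustration. The endpoint path and body come from the examples above, and the `kbn-xsrf` header is included because Kibana generally expects it on non-GET API requests.

```typescript
// Hypothetical helper; only the endpoint path and body shape come from the PR description.
interface ToolExecuteRequest {
  url: string;
  init: {
    method: 'POST';
    headers: Record<string, string>;
    body: string;
  };
}

function buildToolExecuteRequest(
  kibanaBaseUrl: string,
  toolParams: Record<string, unknown>
): ToolExecuteRequest {
  // Strip trailing slashes so the path is joined exactly once.
  const base = kibanaBaseUrl.replace(/\/+$/, '');
  return {
    url: `${base}/api/agent_builder/tools/_execute`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'kbn-xsrf': 'true',
      },
      body: JSON.stringify({
        tool_id: 'observability.get_index_info',
        tool_params: toolParams,
      }),
    },
  };
}

// Usage (assumes a local Kibana; credentials would need to be added to the headers):
//   const { url, init } = buildToolExecuteRequest('http://localhost:5601', {
//     operation: 'get-overview',
//   });
//   const response = await fetch(url, init);
```

Separating request construction from the network call keeps the helper easy to test without a running Kibana instance.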
Looks like this PR has backport PRs but they still haven't been merged. Please merge them ASAP to keep the branches relatively in sync.
… tool (#247474) | [Obs AI] Replace `get_data_sources` with `get_index_info` tool (#248234) (#249116)

# Backport

This will backport the following commits from `main` to `9.3`:
- [[Obs AI] Extend `get_services` tool and add `get_trace_metrics` tool (#247474)](#247474)
- [[Obs AI] Replace `get_data_sources` with `get_index_info` tool (#248234)](#248234)

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sorenlouv/backport)

---------

Co-authored-by: Viduni Wickramarachchi <viduni.wickramarachchi@elastic.co>
Co-authored-by: Arturo Lidueña <arturo.liduena@elastic.co>
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>