[onechat] add first set of base tools by pgayvallet · Pull Request #223367 · elastic/kibana

pgayvallet · 2025-06-11T08:27:08Z

Summary

Fix https://github.com/elastic/search-team/issues/10121

Add a base set of retrieval oriented tools, and expose them to the default agent.

Tools

Warning: names are still TBD

We're starting to see two different "layers" of tools appear: "simple" tools, which perform a simple programmatic (understand: no LLM) task, and "smart" tools, which are more like workflows, with some of the steps relying on an LLM.

This PR introduces the following tools:

Simple tools

Simple tools (name TBD) are "programmatic" tools not relying on an LLM for their execution.

This PR introduces this base set of tools:

- `get_document_by_id`: resolves a full document based on its id/index.
- `list_indices`: lists the indices the current user has access to.
- `get_index_mappings`: retrieves the full mappings for a given index.
- `execute_esql`: executes a provided ES|QL query.

Smart tools

Smart tools can have multiple internal steps (even if that remains an implementation detail), and use LLM calls for some, or all, of them.

Note: there are huge potential areas of improvement in the current implementation of all those smart tools. One of the intents of this work is precisely to identify such areas of improvement.

`index_explorer`

Based on a natural language query, returns a list of indices that should be searched, and their corresponding mappings.

`generate_esql`

Based on a natural language query, generates an ES|QL query.

- Uses the `nl-2-esql` task under the hood.
- Optionally uses `index-explorer` if `index` is not specified.

`relevance_search`

Performs a "full-text search" based on a given term and returns the most relevant highlights.

`natural_language_search`

Retrieve data based on a natural language query.

Converts a natural language query to an ES|QL one, then executes it, using generate_esql and execute_esql under the hood.
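To make the composition concrete, here is a minimal sketch of how such a tool could chain the steps described above. The names `SearchDeps`, `exploreIndex`, `generateEsql`, `executeEsql` and `naturalLanguageSearch` are hypothetical stand-ins for illustration and are not taken from the PR's code.

```typescript
// Hypothetical shapes: none of these names come from the Kibana codebase.
type EsqlResult = { columns: string[]; rows: unknown[][] };

interface SearchDeps {
  // Picks a target index from a natural language query (stand-in for index_explorer).
  exploreIndex: (nlQuery: string) => Promise<string>;
  // Turns a natural language query into ES|QL (stand-in for the LLM-backed generate_esql).
  generateEsql: (nlQuery: string, index: string) => Promise<string>;
  // Runs an ES|QL query (stand-in for execute_esql).
  executeEsql: (esql: string) => Promise<EsqlResult>;
}

async function naturalLanguageSearch(
  deps: SearchDeps,
  nlQuery: string,
  index?: string
): Promise<{ esql: string; result: EsqlResult }> {
  // Only fall back to index exploration when the caller did not specify an index.
  const targetIndex = index ?? (await deps.exploreIndex(nlQuery));
  const esql = await deps.generateEsql(nlQuery, targetIndex);
  const result = await deps.executeEsql(esql);
  return { esql, result };
}
```

Injecting the three steps as dependencies keeps the sketch testable without an LLM or a cluster; the real tools may be wired together differently.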

Researcher assistant

The second part of this PR is implementing a researcher agent for deep research tasks.

The researcher assistant is following a very classic "act->process->reflect" cycle.

The implementation of the cycle is currently as follows:

1. Act

Given a research topic, the research history and a list of tools, select the tool best suited to search for this topic, and call it.

The tools exposed to the agent in this phase are:

- `index_explorer`
- `relevance_search`
- `nl_search`

Note: later, the whole act step could evolve to instead call a sub search agent with planning and multi-step execution.

2. Process

Process the results from the latest act phase and create a corresponding entry in the search log.

At the moment, we're simply storing the whole tool call + results to the search log in an LLM-friendly format.

3. Reflect

Based on the main research query and the search log, identify whether the information collected is enough to answer the question. If not, identify follow-up questions or sub-problems that would be useful to solve in order to gather more information.
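The cycle above can be sketched as a small loop. This is only an illustration under stated assumptions: `CycleSteps`, `LogEntry`, `Reflection`, `research` and `maxCycles` are hypothetical names, the steps are synchronous here for brevity (the real ones would be asynchronous LLM and tool calls), and the cap on iterations is one guess at what a cycle budget could look like.

```typescript
// All names below are illustrative; they do not mirror the actual implementation.
interface LogEntry { tool: string; input: string; output: string }
interface Reflection { sufficient: boolean; followUps: string[] }

interface CycleSteps {
  // Act: select a tool for the topic and call it.
  act: (topic: string, log: LogEntry[]) => LogEntry;
  // Reflect: decide whether the log answers the main query.
  reflect: (mainQuery: string, log: LogEntry[]) => Reflection;
}

function research(mainQuery: string, steps: CycleSteps, maxCycles: number): LogEntry[] {
  const log: LogEntry[] = [];
  const pending: string[] = [mainQuery];
  for (let cycle = 0; cycle < maxCycles && pending.length > 0; cycle++) {
    const topic = pending.shift() as string;
    // Act: pick and call a tool for the current topic.
    const call = steps.act(topic, log);
    // Process: store the whole tool call and its result in the search log.
    log.push({ tool: call.tool, input: call.input, output: call.output });
    // Reflect: stop when enough was collected, otherwise queue follow-up questions.
    const reflection = steps.reflect(mainQuery, log);
    if (reflection.sufficient) break;
    pending.push(...reflection.followUps);
  }
  return log;
}
```

The loop ends either when the reflect step reports that the log is sufficient, when no follow-up remains, or when the budget is exhausted.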

What is out of scope of the current PR

Figuring out which set of tools should be exposed by default to the main agent (right now, all the tools listed in this PR are)

pgayvallet · 2025-06-11T08:28:33Z

/ci

pgayvallet · 2025-06-11T09:02:28Z

/ci

pgayvallet

Self-review

pgayvallet · 2025-06-11T09:04:23Z

x-pack/platform/packages/shared/onechat/onechat-common/tools/constants.ts

+  /**
+   * Tag associated to tools related to data retrieval
+   */
+  retrieval: 'retrieval',


Starting to see a pattern where we assign tags to tools to make it easy to filter them for specific profiles (e.g. the search agent gets assigned all the retrieval tools).
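As a rough illustration of that pattern, tag-based selection could look like the sketch below. Only the `'retrieval'` tag value comes from the diff above; `TaggedTool`, `toolsWithTag` and the example tool list are assumptions made up for this example.

```typescript
// Hypothetical descriptor shape; only the 'retrieval' tag value comes from the diff above.
interface TaggedTool { id: string; tags: string[] }

// Returns the tools carrying the given tag, preserving their original order.
const toolsWithTag = (tools: TaggedTool[], tag: string): TaggedTool[] =>
  tools.filter((tool) => tool.tags.includes(tag));

const allTools: TaggedTool[] = [
  { id: 'list_indices', tags: ['retrieval'] },
  { id: 'relevance_search', tags: ['retrieval'] },
  { id: 'some_other_tool', tags: [] },
];

// A search-oriented profile would then receive every retrieval tool.
const searchAgentToolIds = toolsWithTag(allTools, 'retrieval').map((tool) => tool.id);
```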

x-pack/platform/plugins/shared/onechat/server/tools/retrieval/search_dsl.ts

pgayvallet · 2025-06-11T09:32:04Z

/ci

pgayvallet · 2025-06-17T11:06:40Z

/ci

pgayvallet · 2025-06-17T11:28:01Z

/ci

pgayvallet · 2025-06-17T12:15:50Z

/ci

pgayvallet · 2025-06-17T12:57:03Z

/ci

elasticmachine · 2025-06-17T14:56:46Z

💛 Build succeeded, but was flaky

Buildkite Build
Commit: 8138de3

Failed CI Steps

FTR Configs #95

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

| id | before | after | diff |
| --- | --- | --- | --- |
| `onechat` | 32 | 33 | +1 |

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run `node scripts/build_api_docs --plugin [yourplugin] --stats comments` for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| `@kbn/onechat-common` | 127 | 136 | +9 |
| `@kbn/onechat-genai-utils` | - | 75 | +75 |
| total | | | +84 |

Public APIs missing exports

Total count of every type that is part of your API that should be exported but is not. This will cause broken links in the API documentation system. Target amount is 0. Run `node scripts/build_api_docs --plugin [yourplugin] --stats exports` for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| `@kbn/onechat-genai-utils` | - | 4 | +4 |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| `@kbn/onechat-common` | 191 | 203 | +12 |
| `@kbn/onechat-genai-utils` | - | 79 | +79 |
| `@kbn/onechat-server` | 176 | 177 | +1 |
| total | | | +92 |

History

💔 Build #309375 failed 46839ad
💔 Build #309338 failed 688c98a
💔 Build #309316 failed 596e428
💚 Build #307311 succeeded 4a592dc

joemcelroy · 2025-06-19T09:13:33Z

x-pack/platform/packages/shared/onechat/onechat-genai-utils/framework/compose_provider.ts

+export type ToolFilterRule = ByToolIdRule | ByProviderIdRule;
+
+const matches = (rule: ToolFilterRule, tool: ToolDescriptor): boolean => {
+  if (rule.type === 'by_tool_id') {


nitpick: use an enum for this?

To be honest, I've been traumatized by enums and the fact they are not abstract (type) and require a concrete import, so I'm trying to avoid using them as much as possible, unless forced to because of some TS inference magical type I want to build.

But this is a very personal preference.
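For context, the string-literal approach under discussion can be sketched as a discriminated union. Only `ToolFilterRule`, `ByToolIdRule`, `ByProviderIdRule`, `matches` and the `'by_tool_id'` literal appear in the diff; the `'by_provider_id'` literal and the fields `toolIds`, `providerId` and the `ToolDescriptor` shape are assumptions for illustration.

```typescript
// Field names other than `type`, and the 'by_provider_id' literal, are assumptions.
interface ByToolIdRule { type: 'by_tool_id'; toolIds: string[] }
interface ByProviderIdRule { type: 'by_provider_id'; providerId: string }
type ToolFilterRule = ByToolIdRule | ByProviderIdRule;

// Hypothetical minimal descriptor shape.
interface ToolDescriptor { id: string; providerId: string }

const matches = (rule: ToolFilterRule, tool: ToolDescriptor): boolean => {
  // Checking the literal `type` narrows `rule` to the matching interface.
  if (rule.type === 'by_tool_id') {
    return rule.toolIds.includes(tool.id);
  }
  // Only ByProviderIdRule remains at this point.
  return rule.providerId === tool.providerId;
};
```

With literals, a caller can write `{ type: 'by_tool_id', toolIds: [...] }` without importing a runtime enum value, while the compiler still rejects unknown `type` strings.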

joemcelroy · 2025-06-19T09:31:57Z

x-pack/platform/packages/shared/onechat/onechat-genai-utils/tools/index_explorer.ts

+}
+
+export const indexExplorer = async ({
+  query,


Thought: I don't have a good answer here, but I paused on whether both query and indexPattern are needed. I can see indexPattern being useful when it is specified in the prompt, however.

Yeah. In practice today, that "indexPattern" param is even causing issues, because for a prompt such as "please search for docs in the hr index", the LLM sometimes adds a *hr* index pattern parameter, which causes trouble.

And yet I think we do need the option to let the LLM specify that filter. So yeah, idk.

joemcelroy · 2025-06-19T09:38:18Z

x-pack/platform/packages/shared/onechat/onechat-genai-utils/tools/steps/list_indices.ts

+    format: 'json',
+  });
+
+  return response.map(({ index, status, health, uuid, 'docs.count': docsCount, pri, rep }) => ({


I wonder which index listing information is actually useful to the LLM here. Personally, for the index explorer, I only find index and docsCount useful.

The idea is to have some kind of low-level utility function for the most common ES requests (e.g. listing indices, mappings and so on) that we can use to compose our tools.

Agreed that for index explorer, most of that info is likely useless. We need to tweak that with evaluation too.
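A trimmed-down projection along the lines of the suggestion could look like this sketch. It assumes the rows come from a cat-indices style JSON response in which counts are returned as strings; `CatIndexRow`, `IndexSummary` and `toIndexSummaries` are hypothetical names, not part of the PR.

```typescript
// Assumed row shape, limited to the fields destructured in the diff above.
interface CatIndexRow {
  index: string;
  status: string;
  health: string;
  uuid: string;
  'docs.count': string;
  pri: string;
  rep: string;
}

// The reduced shape suggested in the review: index name and document count only.
interface IndexSummary { index: string; docsCount: number }

const toIndexSummaries = (rows: CatIndexRow[]): IndexSummary[] =>
  rows.map(({ index, 'docs.count': docsCount }) => ({
    index,
    // Counts are assumed to arrive as strings, so convert them to numbers.
    docsCount: Number(docsCount),
  }));
```

Whether the smaller payload actually helps the LLM is exactly the kind of question the evaluation mentioned above would need to answer.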

joemcelroy · 2025-06-19T09:41:36Z

x-pack/platform/packages/shared/onechat/onechat-genai-utils/tools/steps/perform_match_search.ts

+    index,
+    size,
+    retriever: {
+      rrf: {


suggestion: I would group all text fields together into one retriever, and then use each semantic_text field as an individual retriever. RRF would not be applied if it's just text fields.

++. I wanted to keep it as a follow-up because it requires performMatchSearch to know about field types, and it doesn't at the moment.
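One way the suggested grouping could be expressed is sketched below, assuming the caller already knows each field's type (which, as noted, performMatchSearch does not at the moment). `SearchField` and `buildRetriever` are hypothetical names, and the choice of a `multi_match` query for text fields and a `semantic` query for semantic_text fields is an assumption, not something the PR specifies.

```typescript
type FieldType = 'text' | 'semantic_text';
interface SearchField { name: string; type: FieldType }
type Retriever = Record<string, unknown>;

function buildRetriever(term: string, fields: SearchField[]): Retriever {
  const textFields = fields.filter((f) => f.type === 'text').map((f) => f.name);
  const semanticFields = fields.filter((f) => f.type === 'semantic_text').map((f) => f.name);

  const retrievers: Retriever[] = [];
  if (textFields.length > 0) {
    // All text fields share a single retriever.
    retrievers.push({ standard: { query: { multi_match: { query: term, fields: textFields } } } });
  }
  for (const field of semanticFields) {
    // Each semantic_text field gets its own retriever.
    retrievers.push({ standard: { query: { semantic: { field, query: term } } } });
  }

  const [only] = retrievers;
  if (only === undefined) {
    throw new Error('at least one searchable field is required');
  }
  // RRF only makes sense when there are several result lists to fuse.
  return retrievers.length === 1 ? only : { rrf: { retrievers } };
}
```

With text fields only, the function returns a plain `standard` retriever, which matches the remark that RRF would not be applied in that case.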

joemcelroy · 2025-06-19T09:45:12Z

x-pack/platform/packages/shared/onechat/onechat-genai-utils/tools/utils/esql.ts

+/**
+ * Converts an ES|QL /_query columnar response to a JSON representation
+ */
+export const esqlResponseToJson = (esql: EsqlResponse): Array<Record<string, any>> => {


have you tried specifying the response format? https://www.elastic.co/docs/explore-analyze/query-filter/languages/esql-rest#esql-rest-format

I tried, and unless I missed something I think it doesn't really work with the ES client 🙈.
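For readers unfamiliar with the columnar format, the conversion being discussed can be sketched as follows. This is not the PR's implementation: it assumes a response made of a `columns` array (each with a `name`) and a `values` array of rows, and it uses `unknown` where the original signature uses `any`.

```typescript
// Assumed minimal shape of a columnar ES|QL response.
interface EsqlColumn { name: string; type: string }
interface EsqlResponse { columns: EsqlColumn[]; values: unknown[][] }

/**
 * Converts a columnar response into one object per row, keyed by column name.
 */
const esqlResponseToJson = (esql: EsqlResponse): Array<Record<string, unknown>> =>
  esql.values.map((row) => {
    const doc: Record<string, unknown> = {};
    // The i-th value of a row belongs to the i-th column.
    esql.columns.forEach((column, i) => {
      doc[column.name] = row[i];
    });
    return doc;
  });
```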

joemcelroy

Overall looks great! One thing I want to try is what triggers it to use the researcher tool vs. the relevance / ES|QL tools directly.

We need to expand the prompt to give more examples of when the relevance / ES|QL tools should be used, but we can do that separately in another PR.

## Summary

Follow-up of #223367

Fix elastic/search-team#10259

This PR introduces the concept of agent **mode**, and exposes the "deep research" agent as a mode instead of a tool.

## Examples

### Calling the Q/A (default) mode

```curl
POST kbn:/internal/onechat/chat
{
  "nextMessage": "Find all info related to our work from home policy"
}
```

### Calling the researcher mode

```curl
POST kbn:/internal/onechat/chat
{
  "mode": "researcher",
  "nextMessage": "Find all info related to our work from home policy"
}
```

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>

Add first set of tools

e0cbaba

pgayvallet added release_note:skip Skip the PR/issue when compiling release notes backport:skip This PR does not require backporting v9.1.0 labels Jun 11, 2025

Merge remote-tracking branch 'upstream/main' into onechat-10121-first…

e102b4d

…-tools

refactor reranking tool

76b86c0

pgayvallet commented Jun 11, 2025

tweak system prompt to use the new default tools

4a592dc

pgayvallet marked this pull request as ready for review June 11, 2025 11:40

pgayvallet requested a review from a team as a code owner June 11, 2025 11:40

pgayvallet marked this pull request as draft June 12, 2025 05:45

pgayvallet added 10 commits June 12, 2025 20:53

WIP researcher agent

83b8009

WIP smart tools

097c2c0

wip

408ba49

move tool implementation to dedicated package

bb64fb6

move / rename things

2243e44

start moving back to the agent

55b66d8

better action log representation + improved prompts

71c68fc

implement cycle budget

06c2ec8

remove console.log & unused imports

28e6b22

Merge remote-tracking branch 'upstream/main' into onechat-10121-first…

596e428

…-tools

[CI] Auto-commit changed files from 'node scripts/generate codeowners'

688c98a

kibanamachine added 2 commits June 17, 2025 11:46

[CI] Auto-commit changed files from 'node scripts/yarn_deduplicate'

9e3c077

[CI] Auto-commit changed files from 'node scripts/eslint --no-cache -…

46839ad

…-fix'

remove console.log

8138de3

pgayvallet marked this pull request as ready for review June 18, 2025 07:19

joemcelroy reviewed Jun 19, 2025

joemcelroy approved these changes Jun 19, 2025

pgayvallet added 2 commits June 20, 2025 10:12

start working on reflection events

44dd5a1

Merge remote-tracking branch 'upstream/main' into onechat-10121-first…

233f0ce

…-tools

pgayvallet merged commit 04b294e into elastic:main Jun 20, 2025
11 checks passed

pgayvallet mentioned this pull request Jun 23, 2025

[onechat] Implement agent modes #224801

Merged


5 participants
