[Agent Builder] [SML] Retrieval layer: schema, autocomplete, hybrid search #269277
Draft
Apmats wants to merge 11 commits into
Draft
[Agent Builder] [SML] Retrieval layer: schema, autocomplete, hybrid search #269277Apmats wants to merge 11 commits into
Apmats wants to merge 11 commits into
Conversation
Builds on Peter's merged elastic#266573 by adding the schema fields the team converged on and refactoring title/description/content for BM25 + a single unified vector retrieval surface. Fields added: - tags (keyword[], free-form labels) - payload (flattened, type-specific opaque data) - title_autocomplete (search_as_you_type, copy target of title) - unified_semantic (semantic_text, copy target of title/description/content) Behavior changes: - title becomes `text` with copy_to fanning into title_autocomplete and unified_semantic. Three retrieval modes (BM25/lexical, prefix/typeahead, semantic) from one producer-set field. - description and content become `text` with copy_to: 'unified_semantic'. One inference pass per record instead of three; recall doesn't fragment across overlapping content. - buildSmlSearchQuery: SAYT field paths move from title.* to title_autocomplete.*; the should-block uses match: { unified_semantic } in place of separate matches on content and description. origin_id is unchanged. An earlier draft of this PR added a parallel `origin` URI field per the gist's vision but it was dropped: the URI form is computable on the fly from type + origin_id, Sean's merged buildTypeFilters targets origin_id directly, and adding a parallel representation would just be redundant. Type writers need no changes: they keep returning the same SmlChunk shape; ES copy_to handles the fan-out. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the single `unified_semantic` copy_to aggregator with per-field semantic_text mirrors (`title_semantic`, `description_semantic`, `content_semantic`). The search query now uses the `rrf` retriever with `query`/`fields` for the semantic side and a `standard` sub-retriever for BM25 + SAYT prefix matching. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds the @ menu / typeahead query path on top of the schema PR. The
autocomplete surface is unified through `discovery_labels` — title and
type are no longer special-cased.
discovery_labels: nested { value, kind }
- `value.autocomplete` is a `text` field with a custom edge-ngram analyzer
(NOT `search_as_you_type` — see WHY below).
- The indexer auto-prepends `{value: chunk.title, kind: 'title'}` and
`{value: chunk.type, kind: 'type'}` to every record, plus any entries
the producer provides (taglines, nicknames, categories, etc.). Type
writers don't change their SmlChunk shape.
Drops `title_autocomplete` (top-level SAYT field) and the `.autocomplete`
subfield from `type`. Title's and type's autocomplete reach now goes
through their auto-prepended discovery_labels entries; `type` stays as
plain keyword for exact filtering.
POST /internal/agent_context_layer/sml/_autocomplete — dedicated route
for the @ menu, separate from /sml/_search.
buildSmlAutocompleteQuery: a single nested query against
discovery_labels.value.autocomplete with `match` + `operator: and` (every
typed token must match). `inner_hits` returns the matched entries with
their `kind` and an ES `unified`-highlighter snippet wrapping the
matched word(s) in <em>...</em>.
Response: { id, type, title, origin_id, matched_discovery_labels? }.
No content, no description, no score — order is sufficient; UI gets
exactly what it needs to render a typeahead row with badges.
filterResultsByPermissions is generic over the result type and shared
with the search path.
SAYT (`type: 'search_as_you_type'`) is the obvious first choice for an
autocomplete field. It does not produce useful highlights for prefix
queries in a nested context. Confirmed in two ways:
1. Empirical: tested unified, plain, fvh highlighters; tested
`require_field_match: false`, `highlight_query: prefix`, term_vector
with offsets, top-level vs inner_hits highlight — none returned a
snippet for nested + SAYT + bool_prefix on prefix-only matches.
2. Documented: this is the bug behind elastic/elasticsearch#53744
("search_as_you_type, prefix queries and broken highlighting"),
open since 2020-03, explicitly low-priority per the ES Search team
lead at the time. Root cause documented in elastic#49069 ("Highlight with
index_prefixes"): SAYT's `_index_prefix` subfield indexes
position-spanning ngrams with offsets that span from the prefix
start to the end of the entire field, which the highlighter can't
wrap per-word. The ES-community workaround (see
elastic/elasticsearch#53744 comment by peterbe) is exactly what
this PR does: use a regular text field with a custom analyzer.
The custom analyzer: standard tokenizer + lowercase + edge_ngram filter
(1–20) at index time; standard + lowercase at search time. Per-token
edge_ngrams preserve each word's offsets, so the highlighter wraps the
matched word correctly — verified live for queries like "git" →
"<em>github</em>", "github c" → "<em>GitHub</em> <em>Connector</em>",
"prod logs ana" → "<em>production</em> <em>logs</em> <em>analytics</em>".
Trade-offs vs SAYT:
- Smaller inverted index (one field, not four).
- Comparable query latency (posting-list lookup per token).
- Loses SAYT's _2gram/_3gram shingle subfields used for phrase-prefix
scoring. BM25 on edge_ngram tokens is good enough for typeahead
ranking.
Adds an optional `analysis` field to `IndexStorageSettings`, threaded
through to the index template's `settings.analysis` block. Required so
that consumers (like SML here) can declare custom analyzers without
bypassing the storage adapter's bootstrap flow. Backwards-compatible —
field is optional. Ignored on serverless ES (no index-level settings).
Owners: @elastic/observability-ui, @elastic/kibana-core.
- Service test: query shape (match + operator: and + nested + inner_hits
+ highlight), inner_hits mapping into matched_discovery_labels,
highlight passthrough.
- Route test: response shape, no permissions/spaces/content leak.
- FTR (real-ES): /sml/_autocomplete against an indexed record with
title / type / tagline discovery_labels; asserts matched_discovery_labels,
kind, and that the highlight wraps the matched word.
- 121/121 jest in agent_context_layer; 14/14 jest in kbn-storage-adapter.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stacks on 03da4f4 (which introduced custom edge_ngram analyzer for discovery_labels.value.autocomplete to get nested-context highlights working). The edge_ngram approach highlights correctly but is loose: every typed token behaves as a prefix at index time, so "github c" matches "Githubster Cup" (because "github" is an indexed edge_ngram of "githubster"). Verified live. This commit pivots discovery_labels.value to `search_as_you_type` with `multi_match bool_prefix operator:and`. SAYT's native query type gives tight per-token semantics: all-but-last analyzed tokens require exact indexed-term match (against ES's auto-generated _2gram/_3gram shingle subfields); only the last (trailing partial) is treated as a prefix (against _index_prefix). Net: "github c" no longer matches "Githubster Cup". Trade-off: highlights. SAYT + bool_prefix + nested + inner_hits is the exact case behind ES bug elastic/elasticsearch#53744 (open since 2020), so `matched_discovery_labels` entries return without `highlighted`. The route still configures the highlighter (forward-compat when the bug lands); UI falls back to rendering plain `value` when `highlighted` is absent. Alternative considered (not taken in this PR): hybrid AND — bool { must: [match(edge_ngram), match_bool_prefix(plain text)] } Live-tested to give tight semantics AND working highlights (from the edge_ngram clause). Costs an extra plain-text subfield on `discovery_labels.value` (~5% index footprint with index_options:freqs norms:false) and still has the "git c" mid-typing gap where match_bool_prefix demands a complete indexed token. Slotted for PR3 if user feedback warrants. Removes the custom edge_ngram analyzers and the storageSettings `analysis` block — SAYT needs no custom analysis. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nt connector filter support
Two changes that ship together because they touch the same FE surface:
1. Wire the @ menu typeahead to POST /sml/_autocomplete (instead of
the previously used POST /sml/_search). Adds:
- `smlService.autocomplete({ query, size, filters? })`
- `queryKeys.sml.autocomplete(query, filters?)`
- `useSmlAutocomplete(query, { filters? })` hook
- `render_sml_highlight.tsx` helper that parses ES highlight snippets
(`<em>...</em>`) into React fragments — currently inert on the SAYT
path because of ES bug elastic#53744, forward-compatible when it lands.
- `Sml` component switches to the new hook and renders per-row
`matched_discovery_labels` (`kind`-aware) instead of client-side
`EuiHighlight` on `${type}/${title}`.
- Drops `sml_command_menu_highlight.{ts,test.ts}` — the old
client-side highlight parser is no longer needed.
2. Restore agent-centric connector scoping (Sean's
`agent-centric connectors` feature, elastic#267333) on the autocomplete
route. The earlier autocomplete commit (03da4f4) added the route
without a `filters` parameter, which silently broke the @ menu's
respect for `agent.configuration.connector_ids`. This commit:
- BE: adds `filters?: SmlSearchFilters` to the autocomplete route
schema and threads it through `SmlService.autocomplete` and
`autocompleteSml`, which now calls the existing `buildTypeFilters`
helper (shared with the search path) so the filter clause is
identical between routes.
- FE: `useSmlAutocomplete` accepts `{ filters? }`,
`usePrefetchSml(filters?)` regains the filter param, the
command-menu orchestrator re-prefetches SML when the agent's
filters change after the async agent fetch resolves.
- Tests: new BE unit test asserting filters flow into the ES filter
clauses via `buildTypeFilters`; new FE hook test covering the
filters arg; existing `sml.test.tsx` + `use_prefetch_sml.test.tsx`
updated to the new mocks.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…@ menu The previous commit on this branch (b46c879) rewrote the @ menu rendering to use the server's `matched_discovery_labels` and a custom `<em>...</em>` parser. That assumed working server-side highlights — true for the edge_ngram iteration, but the SAYT pivot (202ce94) makes ES return no highlight snippets (elastic#53744), so the new path renders plain text with no visual indication of what matched. The new `SmlAutocompleteHttpResultItem` still has `type` and `title` top-level, which is everything the original client-side EuiHighlight rendering needed. Restoring it gets us back the "bold the typed query in the displayed row" UX with no schema or route changes: - `sml.tsx` back to two `<EuiHighlight>` blocks over `type` and `title`, using the same `getSmlMenuHighlightSearchStrings` parser. - Restored `sml_command_menu_highlight.{ts,test.ts}`. - Dropped `render_sml_highlight.tsx` — no longer used. (Forward-compat for elastic#53744 isn't load-bearing; whoever revives server-side highlights can write it then.) - `matched_discovery_labels` is still returned by the BE and still on the FE result type — FE engineers can switch the rendering to use it whenever they want kind-aware badges. Hook + filter wiring (useSmlAutocomplete with agent connector filters) is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The earlier autocomplete commit on this branch added an optional `analysis` field to `IndexStorageSettings` so the (then-planned) custom edge_ngram analyzer could be declared through the storage adapter. The SAYT pivot on the following commit dropped the custom analyzer entirely — SAYT brings its own auto-generated subfields and doesn't need index-level `analysis`. `grep` confirms no SML file references `analysis` today, and no other consumer in the tree was planning to use the hook. Reverting the two storage-adapter files to their main-branch state removes ~18 lines of unused code and drops `@elastic/observability-ui` / `@elastic/kibana-core` from the PR's required reviewers, since this PR no longer touches their package. If a future consumer needs to declare custom analyzers through the storage adapter, the hook can be reintroduced then with a real motivating use case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… abandoned iterations Two doc-comment blocks (one in sml_storage's discovery_labels mapping, one in sml_service's buildSmlAutocompleteQuery) carried a tail referring to the custom edge_ngram approach and hybrid-AND alternative — iterations we tried during this PR and did not ship. These comments rot the moment the PR closes: the "PR description" reference becomes opaque, and the "earlier commit" comparison no longer maps to anything visible in the file. The technical content stays: SAYT mechanics, the elastic#53744 limitation, what `matched_discovery_labels.highlighted` means. That's still useful for future readers and isn't visible from the code alone. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lastic#14363) Land the natural-language search API over SML, replacing the transitional bool.should mash with a proper RRF retriever and introducing the trust-boundary split between runtime-imposed scoping and agent-discoverable filters. Decisions per the implementation prompt §6: - §6.1 Retrieval: RRF over BM25 (title^2, description, content via best_fields multi_match) and semantic (unified_semantic). Filters mirrored to each child retriever so RRF can't pull unauthorized docs into the fused top-k. Defaults rank_constant=60, rank_window_size=50. - §6.2 Filter merging: option (2). Service signature splits `scoping?: SmlSearchScoping` (runtime-imposed, unconditional) from `filters?: SmlSearchFilters` (caller refinement). Trust boundary visible at the API layer; merge happens in one place. Sean's per-type id-allowlist shape (elastic#267333) survives unchanged as `SmlSearchScoping`. - §6.3 Agent-supplied dimensions: `types?: string[]`, `tags?: string[]`, each lowering into a `terms` clause. Time-range and free-form payload filters deferred. - §6.4 Compact LLM response: SmlSearchResult drops the `content` blob, full `payload`, dates, and discovery_labels; gains `more_content` (true when the indexed record has non-empty content worth fetching via sml_read). `permissions` retained internally for post-hoc filtering, stripped from the HTTP response. - §6.5 Reference resolution: URI strings only — SML does not dereference. Inline-expansion deferred until eval data justifies it (Reading 1; Sean's "SML shouldn't turn pointers into blobs"). - §6.6 Tool description rewrite with when-to-use guidance and four worked examples (plain query, types+tags filter, connector restriction, wildcard inventory). Runtime scoping is not exposed in the LLM input schema. Fixes the `total` bug: post-permission-filter `results.length` no longer overwrites ES `hits.total.value`. Frontend / HTTP route updates: - POST /sml/_search body accepts `scoping?` and `filters?` (replaces the old `filters` per-type id-allowlist field and `skip_content`). - POST /sml/_autocomplete renames `filters` → `scoping`; no agent-discoverable filters on this route (no LLM in the path). - Frontend SmlService, useSmlSearch, useSmlAutocomplete, usePrefetchSml, queryKeys, and buildSmlScopingFromAgent (formerly buildSmlFiltersFromAgent) all follow the new naming. Tests: - New unit tests for RRF body, filter threading through child retrievers, vocabulary-mismatch over the semantic leg, more_content on/off, compact result mapping, agent-filter builder, and tool wrapper scoping/filters mapping. - FTR + Scout integration tests updated to drop content/skip_content assertions and lock in the no-content invariant. Out of scope (deferred per epic): reranker, kneedle, per-type retriever profiles, HyDE/query expansion, reference inline-expansion, telemetry instrumentation (elastic#14366), recrawl/versioning (elastic#14367). Open coordination points (not blocking this PR): - Inference model on unified_semantic still relies on defaults (Kathleen Mar 19: pin Jina V5). Belongs to A. - Kathleen review on A's PR proposes splitting unified_semantic into per-field semantic_text fields. If A reverts unified, the semantic leg in this PR fans out from 1 to 2-3 child retrievers — small retriever-only change. - Pierre's configuration_override-aware tool handler context (elastic#267333 review) — wrapper currently reads agentConfiguration from the tool handler context, which `runner.ts` already resolves with overrides applied via `resolveConfiguration`. Note in code. Stacked on top of apmats/sml-autocomplete-followup (elastic#14364). Once autocomplete merges, this PR rebases to main. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ers on autocomplete Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… scoping semantics Switch all three semantic_text mirrors (title_semantic, description_semantic, content_semantic) from the default HNSW to disk-based BBQ quantization. Agents are latency-tolerant and the SML index can grow large — bbq_disk avoids the per-field HNSW graph memory overhead that comes with N separate semantic fields. Also adds a comment to SmlSearchScoping documenting the cross-type OR semantics and the id-allowlist complexity ceiling. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
🤖 Jobs for this PR can be triggered through checkboxes. 🚧
ℹ️ To trigger the CI, please tick the checkbox below 👇
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR consolidates three stacked PRs into a single reviewable diff:
discovery_labels+ autocomplete route