[SigEvents] Add KI feature identification endpoints and refactor task to use shared service by cesco-f · Pull Request #263528 · elastic/kibana

cesco-f · 2026-04-15T14:51:04Z

Summary

Adds two internal endpoints for KI feature identification and refactors the Task Manager task to use the same shared logic, eliminating code duplication and preparing for the upcoming Workflows Management migration.

New internal endpoints

POST /internal/streams/{name}/features/_identify/inferred — Runs one iteration of feature identification: samples documents, runs LLM inference, reconciles results against known/excluded features, persists changes, and emits EBT telemetry. Accepts and returns accumulated state (discoveredFeatures, totalTokensUsed, successCount, iterationResults) so a caller can drive the iteration loop externally.
POST /internal/streams/{name}/features/_identify/computed — Generates computed KI features (e.g. from ES aggregations), reconciles UUIDs/metadata, and persists them.

These endpoints are designed for the declarative features identification workflow that will replace the Task Manager task once Workflows Management is available.
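To illustrate what "a caller can drive the iteration loop externally" could look like, here is a minimal sketch of a caller that invokes a single-iteration call repeatedly and feeds the returned state into the next request. The names `driveInferredIdentification`, `IdentifyInferredFn`, `AccumulatedState` and the `Feature` / `IterationResult` shapes are hypothetical and only loosely modelled on the fields named in the summary; the transport is injected so no particular HTTP client is assumed.

```typescript
// Hypothetical shapes for illustration only; the real schemas live in the PR.
interface Feature {
  id: string;
  name: string;
}

interface IterationResult {
  iteration: number;
  success: boolean;
  tokensUsed: number;
}

interface AccumulatedState {
  discoveredFeatures: Feature[];
  iterationResults: IterationResult[];
}

// One call stands in for one POST to .../features/_identify/inferred.
type IdentifyInferredFn = (
  streamName: string,
  state: AccumulatedState
) => Promise<AccumulatedState>;

// Runs `iterations` calls, passing the state returned by one call into the next.
async function driveInferredIdentification(
  identifyOnce: IdentifyInferredFn,
  streamName: string,
  iterations: number
): Promise<AccumulatedState> {
  let state: AccumulatedState = { discoveredFeatures: [], iterationResults: [] };
  for (let i = 0; i < iterations; i++) {
    state = await identifyOnce(streamName, state);
  }
  return state;
}
```

Because the endpoint call is passed in as a function, the same loop can be exercised against a stub in tests or against a real HTTP call.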

Extract feature identification logic into a shared service (features_identification_service.ts) with two top-level functions: identifyInferredFeatures and identifyComputedFeatures. Add two internal endpoints (_identify/inferred and _identify/computed) as thin wrappers around the service, designed for use by the upcoming Workflows Management orchestrator. Made-with: Cursor

Replace FeatureAccumulator, identifyStreamFeatures, and duplicated reconciliation logic with calls to identifyInferredFeatures and identifyComputedFeatures from the shared service. The task now mirrors the workflow YAML structure: init state, loop N iterations calling the service, then run computed features sequentially. Made-with: Cursor

- Slim AccumulatedIterationState to discoveredFeatures + iterationResults; derive successCount and totalTokensUsed via helpers
- Make runId optional (auto-generated), remove iteration/successCount/totalTokensUsed from route params
- Remove iteration/successCount from computed route (caller responsibility)
- Move Zod schemas (tokenCountSchema, iterationResultSchema) to @kbn/streams-schema alongside the TypeScript types
- Export MS_PER_DAY from service, remove duplicate in route
- Replace EMPTY_ACCUMULATED_STATE const with createEmptyAccumulatedState() factory
- Replace _/__ throwaway destructuring with explicit tuning construction
- Restore computed features parallelism in task (Promise started before iteration loop)
- Handle computedFeaturesPromise rejection on error path
- Distinguish logger namespaces: inferred vs computed
- Add error logging with context to both route handlers

Made-with: Cursor
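The two items about computed features parallelism and rejection handling can be sketched roughly as follows. This is not the task's actual code: `runIdentificationTask`, `IdentifyInferred` and `IdentifyComputed` are hypothetical stand-ins for the task and the shared service functions. The sketch starts the computed promise before the loop and attaches a handler immediately, so a rejection is always observed even when the loop fails first.

```typescript
// Hypothetical signatures standing in for the shared service functions.
type IdentifyInferred = (iteration: number) => Promise<void>;
type IdentifyComputed = () => Promise<void>;

// Returns the number of completed inferred iterations.
async function runIdentificationTask(
  identifyInferred: IdentifyInferred,
  identifyComputed: IdentifyComputed,
  iterations: number
): Promise<number> {
  // Start computed features before the loop so both run concurrently.
  // Mapping the outcome to a value means this promise itself never rejects,
  // which avoids an unhandled rejection while the loop is still running.
  const computedSettled = identifyComputed().then(
    () => ({ ok: true as const }),
    (error: unknown) => ({ ok: false as const, error })
  );

  let completed = 0;
  try {
    for (let i = 1; i <= iterations; i++) {
      await identifyInferred(i);
      completed = i;
    }
  } catch (loopError) {
    // Error path: wait for the computed work to settle, then report the loop error.
    await computedSettled;
    throw loopError;
  }

  const computed = await computedSettled;
  if (!computed.ok) {
    throw computed.error;
  }
  return completed;
}
```

In this sketch a loop failure takes precedence over a computed failure; whether the real task prioritises errors the same way is not stated in the PR.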

cesco-f · 2026-04-15T15:41:42Z

/ci

cesco-f · 2026-04-15T15:41:54Z

@coderabbitai full review

coderabbitai · 2026-04-15T15:42:01Z

✅ Actions performed

Full review triggered.

coderabbitai · 2026-04-15T16:12:14Z

✅ Actions performed

Full review triggered.

coderabbitai · 2026-04-15T16:22:48Z

Caution

Review failed

The head commit changed during the review from e024b20 to f182a2d.

Walkthrough

Adds runtime Zod schemas (tokenCountSchema, iterationResultSchema) and exposes them from the shared kbn-streams-schema package. Introduces a new feature identification service that implements inferred (LLM-based) and computed feature generation, state accumulation, telemetry derivation, and bulk write reconciliation. Refactors the existing task to delegate iterations to the new service, tightens sampling ratio validation, and surfaces two new internal HTTP routes to trigger inferred and computed feature identification.
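As a rough picture of what reconciling inferred results against known and excluded features might involve, the sketch below drops excluded candidates and reuses the UUID of an already-known feature where one matches. Matching on `name`, the `CandidateFeature` / `StoredFeature` shapes, and the injected `newUuid` generator are all assumptions made for this example; the PR does not spell out the actual matching rule.

```typescript
// Hypothetical shapes; matching by `name` is an assumption of this sketch.
interface CandidateFeature {
  name: string;
  description: string;
}

interface StoredFeature extends CandidateFeature {
  uuid: string;
}

// Filters out excluded candidates and keeps stable UUIDs for known features.
function reconcileFeatures(
  candidates: CandidateFeature[],
  known: StoredFeature[],
  excludedNames: string[],
  newUuid: () => string
): StoredFeature[] {
  const knownByName = new Map(known.map((f) => [f.name, f] as const));
  const excluded = new Set(excludedNames);
  return candidates
    .filter((c) => !excluded.has(c.name))
    .map((c) => ({ ...c, uuid: knownByName.get(c.name)?.uuid ?? newUuid() }));
}
```

Keeping the UUID stable for a known feature means a later bulk write updates the existing document instead of creating a duplicate.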

coderabbitai · 2026-04-15T16:42:03Z

✅ Actions performed

Full review triggered.

coderabbitai · 2026-04-15T16:57:24Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro Plus

Run ID: 237aa99a-a61b-4054-811b-3ed41447ad70

📥 Commits

Reviewing files that changed from the base of the PR and between 1d84e53 and 4857b58.

📒 Files selected for processing (7)

x-pack/platform/packages/shared/kbn-streams-schema/index.ts
x-pack/platform/packages/shared/kbn-streams-schema/src/api/features/index.ts
x-pack/platform/plugins/shared/streams/server/lib/sig_events/features/features_identification_service.ts
x-pack/platform/plugins/shared/streams/server/lib/tasks/task_definitions/features_identification/fetch_sample_documents.ts
x-pack/platform/plugins/shared/streams/server/lib/tasks/task_definitions/features_identification/index.ts
x-pack/platform/plugins/shared/streams/server/routes/index.ts
x-pack/platform/plugins/shared/streams/server/routes/internal/sig_events/features/identify_route.ts

Walkthrough

This change introduces a feature identification system for SIG Events. It adds Zod validation schemas (tokenCountSchema, iterationResultSchema) to the shared schema package. A new feature identification service module provides helpers for generating and persisting computed and inferred features, including TTL metadata, UUID assignment, and reconciliation logic. The features task definition is refactored to use this new shared service. Parameter validation is added to sample document fetching. Two new internal POST endpoints are exposed for triggering inferred and computed feature identification.
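The TTL metadata mentioned here can be pictured with a small sketch built around a day-to-milliseconds constant like the `MS_PER_DAY` the service exports. The function name `createFeatureMetadataSketch`, the `lastSeen` / `expiresAt` field names, and the 30-day default are assumptions for illustration and are not taken from the PR.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical metadata shape; field names and the default TTL are assumptions.
interface FeatureMetadata {
  lastSeen: string;
  expiresAt: string;
}

// Computes an expiry timestamp `featureTtlDays` after `now` (epoch milliseconds).
function createFeatureMetadataSketch({
  featureTtlDays = 30,
  now = Date.now(),
}: { featureTtlDays?: number; now?: number } = {}): FeatureMetadata {
  return {
    lastSeen: new Date(now).toISOString(),
    expiresAt: new Date(now + featureTtlDays * MS_PER_DAY).toISOString(),
  };
}
```

Accepting `now` as a parameter keeps the helper deterministic in tests.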

Warning

Tools execution failed with the following error:

Failed to run tools: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error)

Made-with: Cursor

Break the 732-line monolith into cohesive files by responsibility:

- iteration_state.ts: accumulated state types and helpers
- reconcile_features.ts: inferred/computed feature reconciliation
- identify_inferred_features.ts: LLM iteration, telemetry, top-level handler
- identify_computed_features.ts: computed features handler
- index.ts: barrel re-exports preserving the public API

Made-with: Cursor

…inline durationMs

- Replace manual IterationResult interface with z.infer<typeof iterationResultSchema>
- Track failure telemetry in inferred features route catch block
- Compute durationMs inline instead of via thunk

Made-with: Cursor

…try iteration count

- Drop iterationResult field from IdentifyInferredFeaturesResult (always last element of state.iterationResults)
- Fix off-by-one in route failure telemetry: use iterationResults.length + 1

Made-with: Cursor
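The idea of deriving values from `iterationResults` instead of carrying them as separate fields can be sketched with a few pure helpers. The `IterationResult` shape and the helper names below are hypothetical (the real type is inferred from `iterationResultSchema`). The last helper reflects one plausible reading of the off-by-one fix: if an iteration throws before its result is recorded, only the earlier results exist, so the failing iteration is numbered `length + 1`.

```typescript
// Hypothetical shape for illustration; the real type comes from iterationResultSchema.
interface IterationResult {
  state: 'success' | 'failure';
  tokensUsed: { prompt: number; completion: number };
}

// Number of iterations that completed successfully.
function getSuccessCount(results: IterationResult[]): number {
  return results.filter((r) => r.state === 'success').length;
}

// Sum of prompt and completion tokens across all recorded iterations.
function getTotalTokensUsed(results: IterationResult[]): { prompt: number; completion: number } {
  return results.reduce(
    (total, r) => ({
      prompt: total.prompt + r.tokensUsed.prompt,
      completion: total.completion + r.tokensUsed.completion,
    }),
    { prompt: 0, completion: 0 }
  );
}

// The most recent result, or undefined when nothing has been recorded yet.
function getLastIterationResult(results: IterationResult[]): IterationResult | undefined {
  return results.length > 0 ? results[results.length - 1] : undefined;
}

// Number of the iteration that failed before its result could be recorded.
function getFailedIterationNumber(results: IterationResult[]): number {
  return results.length + 1;
}
```

Deriving these values on demand keeps the accumulated state small and removes the chance of a counter drifting out of sync with the array.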

Made-with: Cursor

cesco-f · 2026-04-16T10:28:39Z

@coderabbitai full review

coderabbitai · 2026-04-16T10:28:45Z

✅ Actions performed

Full review triggered.

macroscopeapp · 2026-04-16T10:31:03Z

Approvability

Verdict: Needs human review

This PR adds two new internal API endpoints for KI feature identification and refactors the background task to use shared services. The new endpoints expose previously task-only functionality via HTTP, representing new capability. Open review comments question the API design approach (leaking internal state to clients). All changed files are owned by teams other than the author.

coderabbitai · 2026-04-16T10:58:45Z

✅ Actions performed

Full review triggered.

coderabbitai · 2026-04-16T11:28:46Z

✅ Actions performed

Full review triggered.

klacabane · 2026-04-20T11:52:02Z

+        discoveredFeatures: z.array(featureSchema).optional().default([]),
+        iterationResults: z.array(iterationResultSchema).optional().default([]),


To avoid that, we'd need to persist that state somehow. Any chance the workflow already stores and exposes the results of the previous steps so we can lean on that?
Otherwise, leaking all that internal state to the client feels off, so I'm wondering if we should instead have a convenience API that does the looping internally.

What would we lose if we called a route that handles the looping with an iterations parameter, instead of a single step?

cesco-f · Apr 20, 2026

The endpoint is designed to be workflow-agnostic. It shouldn't be coupled to the workflow engine's state management, so I'd prefer not to have it read previous step results from the workflow context.

That said, I'd like to understand what specifically feels off about passing discoveredFeatures and iterationResults back. These are internal endpoints; the caller is always Kibana itself.

What is your concern?

klacabane · Apr 20, 2026

My main concern is pushing too much state to the client:

- For discoveredFeatures, it looks like a gap in storage that we are attempting to patch.
- For iterationResults, it looks like the server only reads its length (for the iteration counter in telemetry and the response), so perhaps we don't need it at all.

Agreed that coupling to Workflow isn't a good choice, but how about storing run_id on the features? The endpoint could then determine what was discovered in this run without relying on the client, and it gives us an audit trail for cheap.

For iterationResults, we could drop it from the request entirely, or replace the array with an integer if we still want to stamp the iteration number server-side.

cesco-f · Apr 21, 2026

Good call on the run_id approach. I've implemented it:

- Features now get tagged with run_id when persisted. The endpoint derives discoveredFeatures by filtering stored features on run_id, so the caller no longer passes them back.
- iterationResults is removed from the request body and replaced with an optional iteration: number (the only thing the server actually needed was the counter).

The request body is now just tuning parameters + runId + iteration. No internal state round-trips.

915e83d

Features are now tagged with run_id when persisted. The inferred identification endpoint derives discoveredFeatures from ES by filtering on run_id instead of requiring them from the caller. iterationResults removed from request body, replaced with an optional iteration number parameter. Made-with: Cursor
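A minimal sketch of the run_id approach described in this thread: features carry a `run_id`, and the features discovered in the current run are recovered by filtering on it instead of being sent by the caller. The `StoredFeature` and `IdentifyInferredBody` shapes and both function names are hypothetical. The second helper returns a standard Elasticsearch `term` query clause, on the assumption that the field is stored as `run_id`.

```typescript
// Hypothetical stored shape; only `run_id` is taken from the discussion above.
interface StoredFeature {
  uuid: string;
  name: string;
  run_id: string;
}

// Request body after the change: tuning parameters omitted, no internal state.
interface IdentifyInferredBody {
  runId?: string;
  iteration?: number;
}

// In-memory equivalent of "filter stored features on run_id".
function selectDiscoveredFeatures(stored: StoredFeature[], runId: string): StoredFeature[] {
  return stored.filter((feature) => feature.run_id === runId);
}

// Query clause that could express the same filter against Elasticsearch.
function buildRunIdFilter(runId: string): { term: { run_id: string } } {
  return { term: { run_id: runId } };
}
```

With the filter applied server-side, a request such as `{ runId: 'run-1', iteration: 2 }` is enough for the endpoint to pick up where the previous iteration left off.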

klacabane · 2026-04-21T16:37:44Z

+  streamName: string;
+  featureTtlDays?: number;
+}): Feature[] {
+  const metadata = createFeatureMetadata({ featureTtlDays });


Is there any downside to adding the run_id to computed features? I know that for now we only keep one, so it's not that useful, but if we move to a data stream we'll likely keep a history of it, and a single query with a given run_id would give you the complete snapshot of discovered features for a run.

cesco-f · Apr 22, 2026

It makes sense to add the run_id to computed features as well. Done here: 144c5af

klacabane

LGTM - tested by manually calling the endpoint. Features from previous iterations of the same run are correctly fed back to next iterations.

Left a comment about persisting the run_id for computed features

Make runId required across all feature reconciliation paths. The computed features route now generates a runId when not provided, matching the inferred route behavior. This enables querying a complete feature snapshot by run_id. Made-with: Cursor
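The behaviour described in this commit message (generate a runId when none is provided, tag every feature with it, then query a full snapshot by run_id) could look roughly like the sketch below. All names (`resolveRunId`, `tagWithRunId`, `snapshotForRun`, the `kind` discriminator) are hypothetical, and the ID generator is injected so the sketch does not assume a particular UUID library.

```typescript
type FeatureKind = 'inferred' | 'computed';

// Hypothetical shapes; `kind` is an illustrative discriminator, not a field from the PR.
interface UntaggedFeature {
  name: string;
  kind: FeatureKind;
}

interface TaggedFeature extends UntaggedFeature {
  run_id: string;
}

// Uses the caller's runId when present, otherwise generates one.
function resolveRunId(provided: string | undefined, generate: () => string): string {
  return provided ?? generate();
}

// Stamps every feature with the run it was produced in.
function tagWithRunId(features: UntaggedFeature[], runId: string): TaggedFeature[] {
  return features.map((feature) => ({ ...feature, run_id: runId }));
}

// Inferred and computed features of one run, selected with a single filter.
function snapshotForRun(all: TaggedFeature[], runId: string): TaggedFeature[] {
  return all.filter((feature) => feature.run_id === runId);
}
```

Because both routes resolve the runId the same way, a caller that passes the same value to each gets one combined snapshot for the run.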

elasticmachine · 2026-04-22T07:14:51Z

💚 Build Succeeded

Buildkite Build
Commit: f156864

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id	before	after	diff
`datasetQuality`	1073	1074	+1
`streams`	235	236	+1
`streamsApp`	1820	1821	+1
total			+3

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id	before	after	diff
`@kbn/streams-schema`	408	404	-4

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id	before	after	diff
`datasetQuality`	523.2KB	523.6KB	+417.0B
`streams`	225.9KB	226.3KB	+471.0B
`streamsApp`	2.0MB	2.0MB	+417.0B
total			+1.3KB

Public APIs missing exports

Total count of every type that is part of your API that should be exported but is not. This will cause broken links in the API documentation system. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats exports for more detailed information.

id	before	after	diff
`streams`	36	37	+1

Unknown metric groups

API count

id	before	after	diff
`@kbn/streams-schema`	476	472	-4

History

💛 Build #431369 was flaky 598a0e8
💔 Build #431236 failed ed0e9ae
💛 Build #430206 was flaky 57cfbce
💚 Build #428795 succeeded f182a2d
💛 Build #428284 was flaky 4857b58

cesco-f added 2 commits April 15, 2026 16:47

cesco-f added the release_note:skip, backport:skip, Feature:SigEvents, and Team:SigEvents labels Apr 15, 2026

github-actions Bot added the author:actionable-obs label Apr 15, 2026

cesco-f added 4 commits April 15, 2026 20:31

Add runId to IterationResult for run traceability

ef37bd1

Made-with: Cursor

fix(error): better error handling

3849668

macroscopeapp Bot reviewed Apr 16, 2026

Comment thread ...platform/plugins/shared/streams/server/routes/internal/sig_events/features/identify_route.ts Outdated

cesco-f added 3 commits April 16, 2026 10:48

Rename path param to streamName and make request body optional

0614a3f

Made-with: Cursor

fix(cr): code review

e024b20

cesco-f marked this pull request as ready for review April 16, 2026 10:28

cesco-f requested review from a team as code owners April 16, 2026 10:28

Merge branch 'main' into features-identification-endpoints

f182a2d

ruflin reviewed Apr 17, 2026

Comment thread ...platform/plugins/shared/streams/server/lib/sig_events/features/identify_inferred_features.ts

cesco-f and others added 2 commits April 18, 2026 20:40

Merge branch 'main' into features-identification-endpoints

65e8d5f

Merge branch 'main' into features-identification-endpoints

57cfbce

klacabane reviewed Apr 20, 2026

cesco-f requested a review from klacabane April 20, 2026 13:15

cesco-f and others added 4 commits April 21, 2026 12:28

Merge branch 'main' into features-identification-endpoints

ed0e9ae

Merge branch 'main' into features-identification-endpoints

598a0e8

Merge branch 'main' into features-identification-endpoints

ba66a5b

klacabane reviewed Apr 21, 2026

klacabane approved these changes Apr 21, 2026

cesco-f and others added 2 commits April 22, 2026 07:53

Merge branch 'main' into features-identification-endpoints

f156864

cesco-f enabled auto-merge (squash) April 22, 2026 05:59

cesco-f merged commit d26065e into elastic:main Apr 22, 2026
20 checks passed

cesco-f deleted the features-identification-endpoints branch April 22, 2026 07:13

kibanamachine added the v9.5.0 label Apr 22, 2026

6 participants
