[SigEvents] Add KI feature identification endpoints and refactor task to use shared service #263528

Merged
cesco-f merged 19 commits into elastic:main from cesco-f:features-identification-endpoints
Apr 22, 2026

Conversation

@cesco-f cesco-f commented Apr 15, 2026

Summary

Adds two internal endpoints for KI feature identification and refactors the Task Manager task to use the same shared logic, eliminating code duplication and preparing for the upcoming Workflows Management migration.

New internal endpoints

  • POST /internal/streams/{name}/features/_identify/inferred — Runs one iteration of feature identification: samples documents, runs LLM inference, reconciles results against known/excluded features, persists changes, and emits EBT telemetry. Accepts and returns accumulated state (discoveredFeatures, totalTokensUsed, successCount, iterationResults) so a caller can drive the iteration loop externally.

  • POST /internal/streams/{name}/features/_identify/computed — Generates computed KI features (e.g. from ES aggregations), reconciles UUIDs/metadata, and persists them.

These endpoints are designed for the declarative features identification workflow that will replace the Task Manager task once Workflows Management is available.
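
For illustration, a caller-driven loop over the inferred endpoint might look like the minimal sketch below. The `AccumulatedState` field types and the injected `post` helper are assumptions (the description names the fields but does not show the payloads). Review feedback later in this conversation slimmed the request body down to tuning parameters plus runId and iteration, so treat this as a picture of the design as first proposed.

```typescript
// Hypothetical payload: field names come from the PR description, types are assumed.
interface AccumulatedState {
  discoveredFeatures: unknown[];
  totalTokensUsed: number;
  successCount: number;
  iterationResults: unknown[];
}

// Injected transport so the sketch does not depend on a specific HTTP client.
type Post = (path: string, body: AccumulatedState) => Promise<AccumulatedState>;

function inferredPath(streamName: string): string {
  return `/internal/streams/${encodeURIComponent(streamName)}/features/_identify/inferred`;
}

async function driveInferredIterations(
  post: Post,
  streamName: string,
  iterations: number
): Promise<AccumulatedState> {
  let state: AccumulatedState = {
    discoveredFeatures: [],
    totalTokensUsed: 0,
    successCount: 0,
    iterationResults: [],
  };
  for (let i = 0; i < iterations; i++) {
    // Each call runs one iteration and returns the updated state to feed into the next.
    state = await post(inferredPath(streamName), state);
  }
  return state;
}
```

The loop count lives entirely in the caller, which is what lets an external orchestrator decide how many iterations to run.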

cesco-f added 2 commits April 15, 2026 16:47
Extract feature identification logic into a shared service
(features_identification_service.ts) with two top-level functions:
identifyInferredFeatures and identifyComputedFeatures. Add two
internal endpoints (_identify/inferred and _identify/computed) as
thin wrappers around the service, designed for use by the upcoming
Workflows Management orchestrator.

Made-with: Cursor
Replace FeatureAccumulator, identifyStreamFeatures, and duplicated
reconciliation logic with calls to identifyInferredFeatures and
identifyComputedFeatures from the shared service. The task now
mirrors the workflow YAML structure: init state, loop N iterations
calling the service, then run computed features sequentially.

Made-with: Cursor
@cesco-f cesco-f added the release_note:skip, backport:skip, Feature:SigEvents, and Team:SigEvents labels Apr 15, 2026
@github-actions github-actions Bot added the author:actionable-obs label Apr 15, 2026
- Slim AccumulatedIterationState to discoveredFeatures + iterationResults;
  derive successCount and totalTokensUsed via helpers
- Make runId optional (auto-generated), remove iteration/successCount/
  totalTokensUsed from route params
- Remove iteration/successCount from computed route (caller responsibility)
- Move Zod schemas (tokenCountSchema, iterationResultSchema) to
  @kbn/streams-schema alongside the TypeScript types
- Export MS_PER_DAY from service, remove duplicate in route
- Replace EMPTY_ACCUMULATED_STATE const with createEmptyAccumulatedState()
  factory
- Replace _/__ throwaway destructuring with explicit tuning construction
- Restore computed features parallelism in task (Promise started before
  iteration loop)
- Handle computedFeaturesPromise rejection on error path
- Distinguish logger namespaces: inferred vs computed
- Add error logging with context to both route handlers

Made-with: Cursor
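
The state and concurrency changes listed in this commit could be sketched roughly as follows. `createEmptyAccumulatedState` and the two derived values are named in the commit message; the helper names `getSuccessCount` and `getTotalTokensUsed`, the `IterationResult` shape, the `settle` wrapper, and the step signatures are assumptions for illustration, not the PR's code.

```typescript
// Hypothetical result shape; the real one is defined by iterationResultSchema.
interface IterationResult { state: "success" | "failure"; tokensUsed: number }
interface AccumulatedIterationState {
  discoveredFeatures: string[];
  iterationResults: IterationResult[];
}

// A factory returns fresh arrays on every call, so runs cannot share mutable state.
function createEmptyAccumulatedState(): AccumulatedIterationState {
  return { discoveredFeatures: [], iterationResults: [] };
}

// Derived values are computed on demand rather than stored alongside the results.
const getSuccessCount = (s: AccumulatedIterationState): number =>
  s.iterationResults.filter((r) => r.state === "success").length;
const getTotalTokensUsed = (s: AccumulatedIterationState): number =>
  s.iterationResults.reduce((sum, r) => sum + r.tokensUsed, 0);

type Settled<T> = { status: "ok"; value: T } | { status: "error"; error: unknown };

// Attaches handlers immediately so a rejection is never left unobserved.
function settle<T>(promise: Promise<T>): Promise<Settled<T>> {
  return promise.then(
    (value): Settled<T> => ({ status: "ok", value }),
    (error): Settled<T> => ({ status: "error", error })
  );
}

async function runTask(
  runInferred: (iteration: number) => Promise<IterationResult>,
  runComputed: () => Promise<string[]>,
  iterations: number
): Promise<{ state: AccumulatedIterationState; computed: string[] }> {
  const state = createEmptyAccumulatedState();
  const computed = settle(runComputed()); // started before the loop, runs concurrently
  for (let i = 1; i <= iterations; i++) {
    state.iterationResults.push(await runInferred(i));
  }
  const result = await computed;
  if (result.status === "error") throw result.error;
  return { state, computed: result.value };
}
```

Because `settle` never rejects, a failure inside the loop surfaces the loop's own error, and the computed promise cannot turn into an unhandled rejection on that path.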

cesco-f commented Apr 15, 2026

/ci

cesco-f commented Apr 15, 2026

@coderabbitai full review

coderabbitai Bot commented Apr 15, 2026

✅ Actions performed

Full review triggered.

coderabbitai Bot commented Apr 15, 2026

Caution

Review failed

The head commit changed during the review from e024b20 to f182a2d.

📝 Walkthrough

Adds runtime Zod schemas (tokenCountSchema, iterationResultSchema) and exposes them from the shared kbn-streams-schema package. Introduces a new feature identification service that implements inferred (LLM-based) and computed feature generation, state accumulation, telemetry derivation, and bulk write reconciliation. Refactors the existing task to delegate iterations to the new service, tightens sampling ratio validation, and surfaces two new internal HTTP routes to trigger inferred and computed feature identification.

coderabbitai Bot commented Apr 15, 2026

No actionable comments were generated in the recent review. 🎉

📥 Commits

Reviewing files that changed from the base of the PR and between 1d84e53 and 4857b58.

📒 Files selected for processing (7)
  • x-pack/platform/packages/shared/kbn-streams-schema/index.ts
  • x-pack/platform/packages/shared/kbn-streams-schema/src/api/features/index.ts
  • x-pack/platform/plugins/shared/streams/server/lib/sig_events/features/features_identification_service.ts
  • x-pack/platform/plugins/shared/streams/server/lib/tasks/task_definitions/features_identification/fetch_sample_documents.ts
  • x-pack/platform/plugins/shared/streams/server/lib/tasks/task_definitions/features_identification/index.ts
  • x-pack/platform/plugins/shared/streams/server/routes/index.ts
  • x-pack/platform/plugins/shared/streams/server/routes/internal/sig_events/features/identify_route.ts

📝 Walkthrough

This change introduces a feature identification system for SIG Events. It adds Zod validation schemas (tokenCountSchema, iterationResultSchema) to the shared schema package. A new feature identification service module provides helpers for generating and persisting computed and inferred features, including TTL metadata, UUID assignment, and reconciliation logic. The features task definition is refactored to use this new shared service. Parameter validation is added to sample document fetching. Two new internal POST endpoints are exposed for triggering inferred and computed feature identification.
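
To make the reconciliation step in this walkthrough more concrete, here is a minimal sketch under stated assumptions: the `SketchFeature` shape, the matching on `id`, and the injected `newUuid` generator are hypothetical, and only `featureTtlDays` and `MS_PER_DAY` are names that appear in the PR.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical shapes for illustration only.
interface SketchFeature { id: string; uuid: string; expiresAt: number }
interface ReconcileOptions { now: number; featureTtlDays: number; newUuid: () => string }

function reconcileFeatures(
  candidateIds: readonly string[],
  known: readonly SketchFeature[],
  excludedIds: ReadonlySet<string>,
  { now, featureTtlDays, newUuid }: ReconcileOptions
): SketchFeature[] {
  const knownUuidById = new Map<string, string>();
  for (const feature of known) knownUuidById.set(feature.id, feature.uuid);

  const expiresAt = now + featureTtlDays * MS_PER_DAY;
  return candidateIds
    .filter((id) => !excludedIds.has(id)) // excluded features are not persisted again
    .map((id) => ({
      id,
      uuid: knownUuidById.get(id) ?? newUuid(), // known features keep a stable UUID
      expiresAt, // TTL is refreshed on every reconciliation
    }));
}
```

Injecting `now` and `newUuid` keeps the function deterministic, which makes the reconciliation rules easy to unit test.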

cesco-f added 4 commits April 15, 2026 20:31
Break the 732-line monolith into cohesive files by responsibility:
- iteration_state.ts: accumulated state types and helpers
- reconcile_features.ts: inferred/computed feature reconciliation
- identify_inferred_features.ts: LLM iteration, telemetry, top-level handler
- identify_computed_features.ts: computed features handler
- index.ts: barrel re-exports preserving the public API

Made-with: Cursor
…inline durationMs

- Replace manual IterationResult interface with z.infer<typeof iterationResultSchema>
- Track failure telemetry in inferred features route catch block
- Compute durationMs inline instead of via thunk

Made-with: Cursor
cesco-f added 3 commits April 16, 2026 10:48
…try iteration count

- Drop iterationResult field from IdentifyInferredFeaturesResult (always
  last element of state.iterationResults)
- Fix off-by-one in route failure telemetry: use iterationResults.length + 1

Made-with: Cursor
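
The off-by-one is easy to see in a small sketch. It assumes, as this commit implies, that a failing iteration has not yet appended its entry to iterationResults when the route's catch block runs, so the array length counts only completed iterations. The function name is hypothetical.

```typescript
// 1-based number of the iteration in flight, given the results completed so far.
function failedIterationNumber(iterationResults: readonly unknown[]): number {
  return iterationResults.length + 1;
}

// No iteration has completed yet, so the failure belongs to iteration 1, not 0.
console.log(failedIterationNumber([])); // 1
// Two iterations completed and the third one failed.
console.log(failedIterationNumber([{}, {}])); // 3
```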
@cesco-f cesco-f marked this pull request as ready for review April 16, 2026 10:28
@cesco-f cesco-f requested review from a team as code owners April 16, 2026 10:28

cesco-f commented Apr 16, 2026

@coderabbitai full review

coderabbitai Bot commented Apr 16, 2026

✅ Actions performed

Full review triggered.

macroscopeapp Bot commented Apr 16, 2026

Approvability

Verdict: Needs human review

This PR adds two new internal API endpoints for KI feature identification and refactors the background task to use shared services. The new endpoints expose previously task-only functionality via HTTP, representing new capability. Open review comments question the API design approach (leaking internal state to clients). All changed files are owned by teams other than the author.

Comment on lines +58 to +59
discoveredFeatures: z.array(featureSchema).optional().default([]),
iterationResults: z.array(iterationResultSchema).optional().default([]),

To avoid that, we'd need to persist that state somehow. Any chance the workflow already stores and exposes the results of the previous steps so we can lean on that?
Otherwise, leaking all that internal state to the client feels off, so I'm wondering if we should instead have a convenience API that does the looping internally.

What would we lose if we called a route that handles the looping with an iterations parameter, instead of a single step?

@cesco-f cesco-f Apr 20, 2026

The endpoint is designed to be workflow-agnostic: it shouldn't be coupled to the workflow engine's state management, so I'd prefer not to have it read previous step results from the workflow context.

That said, I'd like to understand what specifically feels off about passing discoveredFeatures and iterationResults back. These are internal endpoints; the caller is always Kibana itself.

What is your concern?

@klacabane klacabane Apr 20, 2026

My main concern is pushing too much state to the client:

  • for discoveredFeatures, it looks like a gap in storage that we attempt to patch
  • for iterationResults, it looks like the server only reads its length (for the iteration counter in telemetry and the response), so perhaps we don't need it at all

Agreed that coupling to Workflow isn't a good choice, but how about storing run_id on the features? The endpoint could then determine what was discovered in this run without relying on the client, and it gives us an audit trail for cheap.

For iterationResults, we could drop it from the request entirely, or replace the array with an integer if we still want to stamp the iteration number server-side.

@cesco-f cesco-f Apr 21, 2026

Good call on the run_id approach, I've implemented it:

  • Features now get tagged with run_id when persisted. The endpoint derives discoveredFeatures by filtering stored features on run_id, so the caller no longer passes them back.
  • iterationResults removed from the request body, replaced with an optional iteration: number (the only thing the server actually needed was the counter).

The request body is now just tuning parameters + runId + iteration. No internal state roundtrips.

915e83d
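
As a rough sketch of the resulting contract: the request body type below keeps only what this comment lists (tuning parameters are omitted), and `discoveredInRun` shows the filtering idea in memory. The `StoredFeature` shape, the function names, and the `term` filter on a `run_id` keyword field are assumptions; per the commit message the PR performs this lookup against ES, not in memory.

```typescript
// Request body after the change: no accumulated state, only identifiers (tuning omitted).
interface InferredRequestBody { runId?: string; iteration?: number }

// Hypothetical stored document shape.
interface StoredFeature { id: string; run_id?: string }

// Builds the body for a follow-up iteration of the same run.
function nextRequest(runId: string, previousIteration: number): InferredRequestBody {
  return { runId, iteration: previousIteration + 1 };
}

// In-memory equivalent of "filter stored features on run_id".
function discoveredInRun(stored: readonly StoredFeature[], runId: string): StoredFeature[] {
  return stored.filter((feature) => feature.run_id === runId);
}

// One possible Elasticsearch query body, assuming run_id is mapped as a keyword.
function runIdQuery(runId: string) {
  return { query: { bool: { filter: [{ term: { run_id: runId } }] } } };
}
```

Features persisted before the change carry no run_id, so they simply never match a run filter.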

@cesco-f cesco-f requested a review from klacabane April 20, 2026 13:15
cesco-f and others added 4 commits April 21, 2026 12:28
Features are now tagged with run_id when persisted. The inferred
identification endpoint derives discoveredFeatures from ES by
filtering on run_id instead of requiring them from the caller.
iterationResults removed from request body, replaced with an
optional iteration number parameter.

Made-with: Cursor
streamName: string;
featureTtlDays?: number;
}): Feature[] {
const metadata = createFeatureMetadata({ featureTtlDays });

Is there any downside to adding the run_id to computed features? I know for now we only keep one so it's not that useful, but if we move to a data stream we'll likely keep a history of it, and a single query with a given run_id would give you the complete snapshot of discovered features for a run.

It makes sense to add the run_id to computed features as well, done here: 144c5af

@klacabane klacabane left a comment

LGTM - tested by manually calling the endpoint. Features from previous iterations of the same run are correctly fed back to next iterations.

Left a comment about persisting the run_id for computed features

cesco-f and others added 2 commits April 22, 2026 07:53
Make runId required across all feature reconciliation paths.
The computed features route now generates a runId when not provided,
matching the inferred route behavior. This enables querying a
complete feature snapshot by run_id.

Made-with: Cursor
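
A minimal sketch of that rule, with hypothetical names: the route layer resolves an optional runId into a required one, and everything downstream can then stamp run_id unconditionally. The generator is injected here; `randomUUID` from `node:crypto` would be one real option.

```typescript
// Route boundary: optional in the request, always defined afterwards.
function resolveRunId(provided: string | undefined, generate: () => string): string {
  return provided ?? generate();
}

// Downstream code takes a required runId, so every persisted feature carries run_id.
function tagWithRunId<T extends object>(
  features: readonly T[],
  runId: string
): Array<T & { run_id: string }> {
  return features.map((feature) => ({ ...feature, run_id: runId }));
}
```

With both inferred and computed features tagged this way, a single filter on run_id returns the full snapshot for a run.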
@cesco-f cesco-f enabled auto-merge (squash) April 22, 2026 05:59
@cesco-f cesco-f merged commit d26065e into elastic:main Apr 22, 2026
20 checks passed
@cesco-f cesco-f deleted the features-identification-endpoints branch April 22, 2026 07:13
@elasticmachine

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
datasetQuality 1073 1074 +1
streams 235 236 +1
streamsApp 1820 1821 +1
total +3

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
@kbn/streams-schema 408 404 -4

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
datasetQuality 523.2KB 523.6KB +417.0B
streams 225.9KB 226.3KB +471.0B
streamsApp 2.0MB 2.0MB +417.0B
total +1.3KB

Public APIs missing exports

Total count of every type that is part of your API that should be exported but is not. This will cause broken links in the API documentation system. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats exports for more detailed information.

id before after diff
streams 36 37 +1

Unknown metric groups

API count

id before after diff
@kbn/streams-schema 476 472 -4

History

mbondyra added a commit to mbondyra/kibana that referenced this pull request Apr 22, 2026
…sationChanges23

SoniaSanzV pushed a commit to SoniaSanzV/kibana that referenced this pull request Apr 27, 2026
… to use shared service (elastic#263528)

Labels

  • author:actionable-obs (PRs authored by the actionable obs team)
  • backport:skip (This PR does not require backporting)
  • Feature:SigEvents (Significant events feature, related to streams and rules/alerts (RnA))
  • release_note:skip (Skip the PR/issue when compiling release notes)
  • Team:SigEvents (Project team working on Significant Events)
  • v9.5.0

6 participants