[Security Solution] Batched Attack Discovery with hierarchical merge #257831

Closed
patrykkopycinski wants to merge 2 commits into elastic:main from patrykkopycinski:batched-attack-discovery-16182


Conversation

@patrykkopycinski
Contributor

Summary

Removes the alert count ceiling from Attack Discovery by implementing batch processing with LLM-based hierarchical merge. This enables Attack Discovery to process arbitrarily large alert sets by splitting them into manageable batches, running the existing AD graph on each batch in parallel, then consolidating discoveries across batches using a dedicated LLM merge pass.

Ref: elastic/security-team#16339 (Task 0B — Remove alert count ceiling)

Architecture

Alerts (N) → [Adaptive split into K batches] → [Parallel AD graph on each batch]
                                                         ↓
                                                  [Collect batch results]
                                                         ↓
                                                  [LLM Merge Pass: consolidate related discoveries]
                                                         ↓
                                                  [Return merged discoveries + quality metrics]

Key Components

| Module | Purpose |
| --- | --- |
| batch/split.ts | Adaptive batch sizing (context window → optimal batch size), alert splitting |
| batch/merge.ts | Hierarchical merge with LLM consolidation pass, quality metrics |
| batch/orchestrator.ts | Batch orchestration with configurable concurrency control |
| batch/types.ts | Interfaces, constants, known context window map |
| invoke_attack_discovery_graph | Routing: batched path when alert count > adaptive batch size |

Adaptive Batch Sizing

Batch size is computed from the LLM connector's context window:

available_tokens = context_window × 0.7 − 8000 (reserved for prompt/output)
batch_size = floor(available_tokens / 800 tokens per alert)
clamped to [10, 500]

Supports known model lookups (GPT-4o, Claude 3.x, Gemini) with partial matching, explicit context window override, and graceful fallback to default (50).
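
As a sketch, the sizing rule above can be expressed directly. The function and constant names here are illustrative, not the PR's actual exports from `batch/split.ts`:

```typescript
// Illustrative constants matching the formula in the description.
const MIN_BATCH_SIZE = 10;
const MAX_BATCH_SIZE = 500;
const DEFAULT_BATCH_SIZE = 50; // graceful fallback for unknown models
const RESERVED_TOKENS = 8000; // reserved for prompt/output
const TOKENS_PER_ALERT = 800;
const USABLE_FRACTION = 0.7;

// Hypothetical helper: derive a batch size from the connector's context window.
function computeBatchSize(contextWindow?: number): number {
  if (!contextWindow || contextWindow <= 0) {
    return DEFAULT_BATCH_SIZE;
  }
  const availableTokens = contextWindow * USABLE_FRACTION - RESERVED_TOKENS;
  const batchSize = Math.floor(availableTokens / TOKENS_PER_ALERT);
  // Clamp to [10, 500] so tiny or huge windows stay manageable.
  return Math.min(MAX_BATCH_SIZE, Math.max(MIN_BATCH_SIZE, batchSize));
}
```

For example, a 128k-token window (GPT-4o) yields `floor((128000 × 0.7 − 8000) / 800) = 102` alerts per batch, while a window too small to leave usable tokens clamps up to the minimum of 10.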

Hierarchical Merge Strategy

  • Single batch: No merge pass needed — direct passthrough
  • Multiple batches: LLM consolidation pass that:
    • Identifies discoveries describing the same attack across batches
    • Merges related discoveries (combines alert IDs, MITRE tactics, details)
    • Preserves genuinely distinct attacks unchanged
    • Guarantees no alert ID loss (every input alert ID appears in output)
  • Merge failure: Graceful degradation — returns unmerged discoveries with warning
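
A minimal sketch of that decision flow, assuming a simplified `Discovery` shape and a hypothetical `llmConsolidate` callback standing in for the PR's merge pass:

```typescript
// Hypothetical, simplified discovery shape for illustration.
interface Discovery {
  title: string;
  alertIds: string[];
}

// Sketch of the merge strategy: passthrough for a single batch, LLM
// consolidation for multiple batches, graceful degradation on failure.
async function mergeBatchResults(
  batches: Discovery[][],
  llmConsolidate: (all: Discovery[]) => Promise<Discovery[]>
): Promise<{ discoveries: Discovery[]; warning?: string }> {
  // Single batch: no merge pass needed, direct passthrough.
  if (batches.length <= 1) {
    return { discoveries: batches[0] ?? [] };
  }
  const flat = batches.flat();
  try {
    const merged = await llmConsolidate(flat);
    // One way to enforce the no-alert-ID-loss guarantee: if the LLM output
    // drops any input alert ID, fall back to the unmerged discoveries.
    const outputIds = new Set(merged.flatMap((d) => d.alertIds));
    for (const id of flat.flatMap((d) => d.alertIds)) {
      if (!outputIds.has(id)) {
        return { discoveries: flat, warning: `merge dropped alert ${id}; returning unmerged results` };
      }
    }
    return { discoveries: merged };
  } catch (e) {
    // Merge failure: degrade to unmerged discoveries with a warning.
    return { discoveries: flat, warning: `merge pass failed: ${e}` };
  }
}
```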

Quality Metrics

Every batched run produces MergeQualityMetrics:

  • consolidationRatio — ratio of post-merge to pre-merge discovery counts (1.0 = nothing merged, lower = more consolidation)
  • alertCoverage — ratio of alert IDs preserved after merge (should be 1.0)
  • batchesProcessed / batchesFailed — batch success tracking
  • totalDurationMs / mergeDurationMs — performance tracking
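
A plausible shape for these metrics, with the two ratio metrics computed from pre/post-merge counts; the interface mirrors the field names above, but the helper functions are illustrative assumptions:

```typescript
// Metrics emitted by every batched run (field names from the PR description).
interface MergeQualityMetrics {
  consolidationRatio: number; // 1.0 = nothing merged; lower = more consolidation
  alertCoverage: number; // ratio of alert IDs preserved after merge (should be 1.0)
  batchesProcessed: number;
  batchesFailed: number;
  totalDurationMs: number;
  mergeDurationMs: number;
}

// Hypothetical helper: post-merge discovery count over pre-merge count.
function consolidationRatio(before: number, after: number): number {
  return before === 0 ? 1 : after / before;
}

// Hypothetical helper: fraction of input alert IDs still present after merge.
function alertCoverage(inputIds: string[], outputIds: string[]): number {
  if (inputIds.length === 0) return 1;
  const out = new Set(outputIds);
  return inputIds.filter((id) => out.has(id)).length / inputIds.length;
}
```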

Error Handling

  • Individual batch failures don't block other batches (Promise.allSettled)
  • Failed batches recorded with empty discoveries and error details
  • LLM merge pass failure returns unmerged results (no data loss)
  • Empty alert retrieval returns early
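
The failure-isolation pattern above can be sketched with `Promise.allSettled` (concurrency limiting omitted for brevity); `runBatch` and the result shape are stand-ins, not the PR's actual API:

```typescript
// Per-batch result: failed batches carry empty discoveries plus error details.
interface BatchResult<T> {
  discoveries: T[];
  error?: string;
}

// Run every batch; Promise.allSettled ensures one batch's failure
// does not block or discard the results of the others.
async function runBatches<A, T>(
  batches: A[][],
  runBatch: (batch: A[]) => Promise<T[]>
): Promise<Array<BatchResult<T>>> {
  const settled = await Promise.allSettled(batches.map((b) => runBatch(b)));
  return settled.map((r) =>
    r.status === 'fulfilled'
      ? { discoveries: r.value }
      : { discoveries: [], error: String(r.reason) }
  );
}
```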

Configuration

| Parameter | Default | Description |
| --- | --- | --- |
| batchSize | Adaptive | Max alerts per batch (auto-calculated from context window) |
| maxBatches | 20 | Max batches to process (0 = unlimited) |
| concurrency | 2 | Max parallel batch executions |
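
One way these parameters could interact, as a hedged sketch (the interface and helper names are hypothetical): `maxBatches` caps how many batches are actually processed, with `0` meaning no cap.

```typescript
// Hypothetical config shape mirroring the table above.
interface BatchConfig {
  batchSize?: number; // adaptive (computed from context window) when omitted
  maxBatches: number; // 0 = unlimited
  concurrency: number; // max parallel batch executions
}

const DEFAULT_BATCH_CONFIG: BatchConfig = { maxBatches: 20, concurrency: 2 };

// Illustrative helper: how many batches a run will actually execute.
function effectiveBatchCount(
  totalAlerts: number,
  batchSize: number,
  maxBatches: number
): number {
  const needed = Math.ceil(totalAlerts / batchSize);
  return maxBatches === 0 ? needed : Math.min(needed, maxBatches);
}
```

For example, 5000 alerts at a batch size of 50 would need 100 batches, but the default `maxBatches: 20` caps the run at 20.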

Testing

  • 29 unit tests across 3 test suites:
    • split.test.ts — batch splitting, adaptive sizing, model lookup, edge cases
    • merge.test.ts — single/multi-batch merge, metrics, error handling, replacement combining
    • orchestrator.test.ts — single/multi-batch orchestration, concurrency, failure resilience

Test plan

  • All 29 unit tests pass (yarn test:jest ...batch/)
  • ESLint passes on all changed files
  • No lint errors (ReadLints)
  • Type check passes (CI)
  • Existing AD tests still pass (CI)
  • Manual test with connector: < batch size alerts → single pass (no merge)
  • Manual test with connector: > batch size alerts → batched with merge
  • Verify merge metrics logged correctly
  • Verify partial batch failure doesn't lose other batches' results

Made with Cursor

Removes the alert count ceiling from Attack Discovery by implementing
batch processing with LLM-based hierarchical merge. Large alert sets
are split into batches, processed in parallel through the existing AD
graph, then consolidated via a dedicated merge LLM pass that identifies
and combines related attacks across batches.

Key changes:
- batch/split.ts: adaptive batch sizing from LLM context window, alert splitting
- batch/merge.ts: hierarchical merge with LLM consolidation pass and quality metrics
- batch/orchestrator.ts: batch orchestration with concurrency control
- batch/types.ts: interfaces, constants, known context windows
- invoke_attack_discovery_graph: routing to batched path when alerts exceed batch size

Ref: elastic/security-team#16339
@elasticmachine
Contributor

🤖 Jobs for this PR can be triggered through checkboxes. 🚧

ℹ️ To trigger the CI, please tick the checkbox below 👇

  • Click to trigger kibana-pull-request for this PR!
  • Click to trigger kibana-deploy-project-from-pr for this PR!
  • Click to trigger kibana-deploy-cloud-from-pr for this PR!
  • Click to trigger kibana-entity-store-performance-from-pr for this PR!
  • Click to trigger kibana-storybooks-from-pr for this PR!

@patrykkopycinski
Contributor Author

/ci

1 similar comment
@patrykkopycinski
Contributor Author

/ci

…known[]

Fixes a TS2322 error caused by spreading unknown[] tracers into a callbacks
parameter that expects (BaseCallbackHandler | BaseCallbackHandlerMethodsClass)[].
@patrykkopycinski
Contributor Author

/ci

@elasticmachine
Contributor

💛 Build succeeded, but was flaky

Failed CI Steps

Metrics [docs]

✅ unchanged

