[Spike][Security Solution] AI SOC: Default Agent skills, tools, and workflow playbooks#259559

Draft
patrykkopycinski wants to merge 9 commits into elastic:main from patrykkopycinski:elastic-ai-soc-plan

@patrykkopycinski
Contributor

@patrykkopycinski patrykkopycinski commented Mar 25, 2026

Summary

Spike implementing the Elastic AI SOC feature using the Default Elastic Agent + Skills pattern — zero custom agents. All SOC capabilities are registered as enhanced Agent Builder skills that activate contextually within the default agent, orchestrated through One Workflow playbooks.

All new registrations are gated behind the aiSocAgents experimental feature flag + experimental: true on each skill.

Architecture: Default Agent + Skills (No Custom Agents)

User asks "triage this alert" → Default Elastic Agent
  → Alert Triage skill activates (matched by description)
  → Skill exposes security tools (alerts, entity risk, TI enrich, etc.)
  → Agent follows skill methodology → structured verdict output

This PR also migrates the existing Threat Hunting Agent to a skill, completing the transition to zero custom agents in Security Solution. The registerAgents() call is removed from plugin.ts entirely.

Why Skills Instead of Custom Agents

| Aspect | Custom Agents (before) | Skills (after) |
| --- | --- | --- |
| User experience | Must manually select the right agent | Default agent auto-activates the right skill |
| Tool access | Separate tool list per agent | Skills expose tools within default agent context |
| Workflow integration | agent-id: security.triage | Default agent + message activates skill |
| Registration | N BuiltInAgentDefinition objects | N enhanced SkillDefinition objects |
| Feature gating | Feature flag in registration code | experimental: true on skill + feature flag |

New Components

7 Agent Builder tools:

| Tool | Description | API Used |
| --- | --- | --- |
| response_actions | Endpoint isolation/release/kill/suspend | EndpointAppContextService.getInternalResponseActionsClient() |
| mitre_mapping | LLM-powered MITRE ATT&CK technique mapping | Model provider structured output |
| threat_intel_enrich | IOC enrichment against TI indicator indices | ES query on logs-ti_* indices |
| timeline_create | Investigation timeline creation with pinned events | createTimeline() + savePinnedEvents() Timeline API |
| report_generate | Structured incident report generation (markdown/JSON) | Pure formatting (no external deps) |
| case_manage | Full case lifecycle (create/update/comment/attach/status) | cases.getCasesClientWithRequest() plugin contract |
| entity_store_query | Entity Store v2 unified entity profile queries | ES query on .entities.v1.latest.* indices |
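
As an illustration of the threat_intel_enrich row, a lookup against the logs-ti_* indices could be built roughly as below. This is a sketch under assumptions, not the PR's code: `IocType`, `IOC_FIELDS`, and `buildIndicatorQuery` are hypothetical names, the field names are the ECS `threat.indicator.*` fields, and the actual query in the tool may differ.

```typescript
// Hypothetical IOC categories for this sketch.
type IocType = 'ip' | 'domain' | 'url' | 'sha256';

// ECS threat indicator fields assumed for each IOC type.
const IOC_FIELDS: Record<IocType, string> = {
  ip: 'threat.indicator.ip',
  domain: 'threat.indicator.url.domain',
  url: 'threat.indicator.url.full',
  sha256: 'threat.indicator.file.hash.sha256',
};

interface IndicatorSearch {
  index: string;
  size: number;
  query: { bool: { filter: Array<Record<string, unknown>> } };
}

// Builds a search request that matches one IOC value exactly.
function buildIndicatorQuery(type: IocType, value: string, size = 10): IndicatorSearch {
  return {
    index: 'logs-ti_*',
    size,
    query: {
      bool: {
        // `term` in filter context: exact match, no scoring.
        filter: [{ term: { [IOC_FIELDS[type]]: value } }],
      },
    },
  };
}

const ipLookup = buildIndicatorQuery('ip', '203.0.113.7');
console.log(JSON.stringify(ipLookup.query));
```

Keeping the field mapping in one table makes it easy to see which IOC types are covered and to extend the list without touching the query shape.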

6 Agent Builder skills (all with experimental: true except threat-hunting):

| Skill | Tools Exposed | Purpose |
| --- | --- | --- |
| Threat Hunting (migrated from agent) | alerts, entity_risk, attack_discovery, labs + 8 platform tools | Proactive threat hunting, alert analysis, entity investigation |
| Alert Triage | alerts, entity_risk, attack_discovery, labs, TI enrich + platform | Systematic verdict classification with confidence scoring |
| Investigation | alerts, entity_risk, attack_discovery, TI enrich, timeline, cases, labs, MITRE mapping, entity_store + platform | Timeline reconstruction, root cause analysis, case creation |
| MITRE Coverage | alerts, attack_discovery, MITRE mapping, create_detection_rule, labs + platform | Detection gap analysis against ATT&CK framework |
| Incident Reporting | alerts, attack_discovery, MITRE mapping, labs, report_generate, cases + platform | Executive/technical/compliance report generation |
| Response Recommendation | alerts, entity_risk, response_actions, cases + platform | Blast radius assessment, confidence-scored containment actions |

4 One Workflow playbooks (YAML with structured output schemas):

  • Incident Response — alert-triggered: triage → data.map → investigate → respond → approval gate → report
  • Full Investigation — manual: investigate → correlate → MITRE map → report
  • Proactive Threat Hunt — weekly scheduled: hunt → correlate → create rules
  • Detection Coverage Audit — monthly scheduled: audit → generate rules

Workflow trigger: security.alertCreated registered with workflowsExtensions

Existing Code Changes (Threat Hunting Agent Migration)

  • use_agent_builder_attachment.ts — removed agentId: THREAT_HUNTING_AGENT_ID (uses default agent)
  • use_agent_builder_stream.ts — removed agent_id: THREAT_HUNTING_AGENT_ID (uses default agent)
  • common/constants.ts — removed THREAT_HUNTING_AGENT_ID constant and unused internalNamespaces import
  • server/plugin.ts — removed registerAgents() call entirely
  • agents/index.ts — replaced with migration comment (file to be deleted)

Key Design Decisions

  • Default Elastic Agent + Skills — zero custom agents; all skills activate contextually
  • Structured output schemas on all ai.agent workflow steps for deterministic condition routing
  • data.map steps between workflow steps for structured field extraction
  • Confidence-gated human-in-loop — auto-execute ≥0.90, notify 0.70-0.89, require approval <0.70
  • Proper platform APIs: Response Actions via EndpointAppContextService, Timelines via Timeline API, Cases via plugin contract
  • Dogfooding: Entity Store v2, Workflows Extensions triggers, Knowledge Base (productDocumentation) in skills
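
The confidence-gated human-in-the-loop rule above can be sketched as a small pure function. This is an illustrative sketch rather than the playbook's implementation: `GateDecision` and `gateByConfidence` are hypothetical names, and treating scores between 0.89 and 0.90 as "notify" and out-of-range scores as "require approval" are assumptions.

```typescript
type GateDecision = 'auto_execute' | 'notify' | 'require_approval';

// Maps a confidence score in [0, 1] to the gate described above:
// >= 0.90 auto-execute, 0.70 up to (but not including) 0.90 notify,
// below 0.70 require approval.
function gateByConfidence(confidence: number): GateDecision {
  if (!Number.isFinite(confidence) || confidence < 0 || confidence > 1) {
    // Fail closed (assumption): an unusable score never auto-executes.
    return 'require_approval';
  }
  if (confidence >= 0.9) {
    return 'auto_execute';
  }
  if (confidence >= 0.7) {
    return 'notify';
  }
  return 'require_approval';
}

for (const score of [0.95, 0.8, 0.4]) {
  console.log(score, gateByConfidence(score));
}
```

Failing closed on malformed scores keeps a parsing error in an upstream step from silently turning into an automatic containment action.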

Testing

| Type | Count | Details |
| --- | --- | --- |
| Unit tests | 98 | 6 test files covering all new tools |
| @kbn/evals suites | 6 | 30 test cases across all SOC capabilities |
| Integration tests | 1 | Tool handler integration tests |
| Eval infrastructure | 3 | Framework files (evaluate, dataset, chat_client) |

Handoff Items

| Area | Team | What's Needed |
| --- | --- | --- |
| Trigger approval | @elastic/response-ops-workflow-eng | Approve security.alertCreated trigger |
| Response actions review | @elastic/security-defend-workflows | Review getInternalResponseActionsClient usage |
| Timeline API review | @elastic/security-threat-hunting-investigations | Review createTimeline + savePinnedEvents pattern |
| Eval execution | @elastic/security-generative-ai | Run eval suites against live LLM connectors |
| Workflow playbook QA | @elastic/response-ops-workflow-eng | Import and test playbooks in Workflows Management UI |
| Delete dead agent files | Developer | threat_hunting_agent.ts + 6 empty SOC agent files need manual deletion |

Test plan

  • Enable feature flag: xpack.securitySolution.enableExperimental: ['aiSocAgents']
  • Verify 6 new skills appear in Agent Builder skills list
  • Test each skill via default agent chat (ask triage/investigation/MITRE/hunting questions)
  • Verify Threat Hunting skill activates for existing security workflows (alert flyout → Agent Builder)
  • Import workflow playbooks via Workflows Management YAML editor
  • Test IR playbook end-to-end with a high-severity alert
  • Verify confidence-gated approval step pauses for low-confidence responses
  • Run unit tests: yarn test:jest on new test files
  • Run @kbn/evals suites against configured LLM connector

🤖 Generated with Claude Code

Production-Readiness Checklist — Agent Skills Ecosystem

Generated against [Epic] Creation of the Agent Skills Ecosystem for Elastic Security.

Narrative role: Flagship end-to-end proof of the epic's "Default Elastic Agent + Skills (no custom agents)" architecture. The single PR most directly executing on the vision's positioning statement.

Must-do before this can ship

  • Fix the 2 failing CI checks
  • Cross-solution coordination. Removing registerAgents() is architecturally correct but must be aligned with Observability (#255706) — land a cross-solution Decision Record first
  • HITL default on every high-blast-radius tool: response_actions, case_manage (full lifecycle), timeline_create, threat_intel_enrich (if it makes external calls). Each needs requiredPrivileges gating + an explicit user confirmation path
  • Concrete Workflows playbook templates rather than abstract mentions: isolation playbook, credential-compromise playbook, ransomware playbook. Ship at least two as reference content
  • Skill-activation telemetry: record which skill was activated for which user message — this IS the vision's "invocation frequency" KPI
  • entity_store_query should become a standalone Entity Analytics skill primitive (vision's "horizontal enrichment layer") — design the shared contract before baking it into this PR's tool
  • Dark by default behind aiSocAgents + per-skill experimental: true (already in scope — verify end-to-end)

Follow-ups (post-merge)

  • Publish all 7 tools against the Skill Authoring Standard (param-bound ES|QL, scope claims where relevant)
  • Build eval suites for each skill via the #255890 CLI
  • Once #258979 multi-agent Triage exists, re-host its logic as a skill composition inside this PR's default-agent model

@elasticmachine
Contributor

🤖 Jobs for this PR can be triggered through checkboxes. 🚧

ℹ️ To trigger the CI, please tick the checkbox below 👇

  • Click to trigger kibana-pull-request for this PR!
  • Click to trigger kibana-deploy-project-from-pr for this PR!
  • Click to trigger kibana-deploy-cloud-from-pr for this PR!
  • Click to trigger kibana-entity-store-performance-from-pr for this PR!
  • Click to trigger kibana-storybooks-from-pr for this PR!

@patrykkopycinski
Contributor Author

/ci

@patrykkopycinski changed the title from "[Security Solution] AI SOC: Agent Builder agents, tools, skills, and workflow playbooks" to "[Spike] AI SOC: Agent Builder agents, tools, skills, and workflow playbooks" on Mar 25, 2026
patrykkopycinski and others added 2 commits March 25, 2026 13:05
…aybooks

Implements the Elastic AI SOC feature as a spike to validate the end-to-end
value chain of Agent Builder agents orchestrated through One Workflow playbooks
for automated security operations.

All new registrations are gated behind the `aiSocAgents` experimental feature flag.

**7 new Agent Builder tools:**
- `response_actions` — endpoint isolation/release/kill/suspend via Response Actions API
- `mitre_mapping` — LLM-powered MITRE ATT&CK technique mapping with confidence scoring
- `threat_intel_enrich` — IOC enrichment against TI indicator indices
- `timeline_create` — investigation timeline creation via Timeline API service
- `report_generate` — structured incident report generation (markdown/JSON)
- `case_manage` — full Cases API integration (create/update/comment/attach/status)
- `entity_store_query` — Entity Store v2 unified entity profile queries

**5 new Agent Builder skills:**
- Alert Triage, Investigation, MITRE Coverage Analysis, Incident Reporting,
  Response Recommendation — each with comprehensive methodology guides

**6 new Agent Builder agents:**
- Triage, Investigator, Correlator, Responder, Reporter, MITRE Analyst —
  each with focused tool assignments and specialized system prompts

**4 pre-built One Workflow playbooks (YAML):**
- Incident Response (alert-triggered: triage → investigate → respond → report)
- Full Investigation (manual: investigate → correlate → MITRE map → report)
- Proactive Threat Hunt (weekly scheduled: hunt → correlate → create rules)
- Detection Coverage Audit (monthly scheduled: audit → generate rules)
- All playbooks use structured output schemas on ai.agent steps for
  deterministic condition routing instead of fragile string matching
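
To make the contrast between structured output routing and string matching concrete, here is a minimal sketch. The `TriageOutput` shape, its field values, and the step names are hypothetical and are not taken from the playbook YAML.

```typescript
// Hypothetical structured output of a triage step; names are illustrative.
interface TriageOutput {
  verdict: 'true_positive' | 'false_positive' | 'needs_investigation';
  confidence: number;
}

type NextStep = 'investigate' | 'close_alert';

// Fragile: depends on the exact wording of free-form LLM text.
function routeByText(text: string): NextStep {
  return text.toLowerCase().includes('false positive') ? 'close_alert' : 'investigate';
}

// Deterministic: branches on an enumerated field of the structured output.
function routeByVerdict(output: TriageOutput): NextStep {
  switch (output.verdict) {
    case 'false_positive':
      return 'close_alert';
    case 'true_positive':
    case 'needs_investigation':
      return 'investigate';
  }
}

// The phrase "false positive" appears inside a negation, so text routing misfires.
console.log(routeByText('This is not a false positive.')); // "close_alert"
console.log(routeByVerdict({ verdict: 'true_positive', confidence: 0.93 })); // "investigate"
```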

**Testing:**
- 98 unit tests across 6 tool test files
- 6 @kbn/evals agent evaluation suites (30 test cases total)
- Integration test suite for tool handlers
- Workflow trigger definition for `security.alertCreated` events

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The tool was rewritten to use `createTimeline()` + `savePinnedEvents()`
from the Timeline API service, but the test file still mocked raw
saved objects. Updated tests to mock the Timeline service functions
and verify the correct API calls are made.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@patrykkopycinski changed the title from "[Spike] AI SOC: Agent Builder agents, tools, skills, and workflow playbooks" to "[Security Solution] AI SOC: Agent Builder agents, tools, skills, and workflow playbooks" on Mar 25, 2026
@patrykkopycinski changed the title from "[Security Solution] AI SOC: Agent Builder agents, tools, skills, and workflow playbooks" to "[Spike] AI SOC: Agent Builder agents, tools, skills, and workflow playbooks" on Mar 25, 2026
Elastic promotes the Default Elastic Agent approach — no custom agents.
All SOC capabilities now live as enhanced skills that activate contextually
within the default agent based on message content.

Changes:
- Removed 6 custom agent ID constants from common/constants.ts
- Reverted agents/index.ts to only register the Threat Hunting Agent
- Enhanced 5 skills with full tool assignments (getRegistryTools) from agents
- Added `experimental: true` to all SOC skills
- Merged agent system prompt content (escalation, case creation, rollback
  procedures) into skill content
- Updated workflow playbooks to use default agent (removed agent-id configs)
- Playbook messages now explicitly reference skill names for activation

Note: The 6 emptied agent files still need to be deleted manually.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@patrykkopycinski changed the title from "[Spike] AI SOC: Agent Builder agents, tools, skills, and workflow playbooks" to "[Spike][Security Solution] AI SOC: Default Agent skills, tools, and workflow playbooks" on Mar 25, 2026
patrykkopycinski and others added 2 commits March 25, 2026 14:02
Completes the migration to the Default Elastic Agent + Skills pattern
by converting the last custom agent (Threat Hunting) to a skill.

Changes:
- Created threat_hunting skill with all platform + security tools
- Removed registerAgents() call from plugin.ts (no more agent registrations)
- Updated public hooks (use_agent_builder_attachment, use_agent_builder_stream)
  to use the default agent instead of THREAT_HUNTING_AGENT_ID
- Removed THREAT_HUNTING_AGENT_ID constant and internalNamespaces import
- Removed last agent-id reference from workflow playbooks
- agents/index.ts now contains only a migration comment

Note: threat_hunting_agent.ts file needs manual deletion (hook blocks it)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- entity_store_query: Use space-specific index pattern in availability check
- entity_store_query: Pass time_range parameter to query functions
- response_actions: Fix PID 0 validation (use === undefined instead of falsy)
- mitre_mapping: Handle content block arrays from LLM responses
- integration test: Add missing ES search mock for attach_alerts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
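
The PID 0 fix listed in the commit above comes down to the difference between a falsy check and an explicit `undefined` check. A minimal sketch follows; `KillProcessParams`, `hasPidFalsyCheck`, and `hasPid` are hypothetical names, not the tool's actual code.

```typescript
// Hypothetical parameter shape for a kill/suspend process action.
interface KillProcessParams {
  pid?: number;
}

// Buggy variant: 0 is falsy, so PID 0 is treated as "missing".
function hasPidFalsyCheck(params: KillProcessParams): boolean {
  return !!params.pid;
}

// Fixed variant: only an absent value counts as missing.
function hasPid(params: KillProcessParams): boolean {
  return params.pid !== undefined;
}

console.log(hasPidFalsyCheck({ pid: 0 })); // false
console.log(hasPid({ pid: 0 })); // true
console.log(hasPid({})); // false
```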
@patrykkopycinski
Contributor Author

/ci

The attach_alerts action now queries ES for alert rule metadata before
attaching. Tests were missing the esClient.asCurrentUser.search mock,
causing them to fail on .hits.hits access.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@patrykkopycinski
Contributor Author

/ci

Comment on lines +122 to +124
const providers = (
  result.results[0].data.matches as Array<Record<string, unknown>>
).map((m) => m.provider);
Contributor

🟢 Low __integration__/soc_tools.integration.test.ts:122

The assertion on line 124 reads m.provider but the match objects are spread from Elasticsearch _source where provider is nested at threat.indicator.provider. Since the tool spreads _source directly into matches, m.provider returns undefined for every item, causing the toContain assertions to incorrectly fail even when the data is present.

-      const providers = (
-        result.results[0].data.matches as Array<Record<string, unknown>>
-      ).map((m) => m.provider);
+      const providers = (
+        result.results[0].data.matches as Array<Record<string, unknown>>
+      ).map(
+        (m) => (m.threat as { indicator?: { provider?: unknown } } | undefined)?.indicator?.provider
+      );
🤖 Copy this AI Prompt to have your agent fix this:
In file x-pack/solutions/security/plugins/security_solution/server/agent_builder/tools/__integration__/soc_tools.integration.test.ts around lines 122-124:

The assertion on line 124 reads `m.provider` but the match objects are spread from Elasticsearch `_source` where `provider` is nested at `threat.indicator.provider`. Since the tool spreads `_source` directly into matches, `m.provider` returns `undefined` for every item, causing the `toContain` assertions to incorrectly fail even when the data is present.

Evidence trail:
x-pack/solutions/security/plugins/security_solution/server/agent_builder/tools/threat_intel_enrich_tool.ts lines 138-144 (matches spread `_source` directly)
x-pack/solutions/security/plugins/security_solution/server/agent_builder/tools/__integration__/soc_tools.integration.test.ts lines 50-95 (mock _source has provider at threat.indicator.provider)
x-pack/solutions/security/plugins/security_solution/server/agent_builder/tools/__integration__/soc_tools.integration.test.ts lines 122-127 (test asserts on m.provider which is undefined)
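
The nesting issue described in this comment can be reproduced in isolation. The sketch below uses a hypothetical `getPath` helper and a made-up match object shaped like a spread `_source`; it is not the test file's code.

```typescript
// Reads a dotted path such as "threat.indicator.provider" from an
// untyped document, returning undefined if any segment is missing.
function getPath(source: unknown, path: string): unknown {
  let current: unknown = source;
  for (const key of path.split('.')) {
    if (typeof current !== 'object' || current === null) {
      return undefined;
    }
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Made-up match object: `provider` exists only under threat.indicator.
const match: Record<string, unknown> = {
  threat: { indicator: { provider: 'example-feed', ip: '203.0.113.7' } },
};

console.log(match['provider']); // undefined: there is no top-level `provider`
console.log(getPath(match, 'threat.indicator.provider')); // "example-feed"
```

A path helper like this avoids chained type assertions in the test, at the cost of returning `unknown`, which the assertion then has to compare against a concrete value.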

@patrykkopycinski
Contributor Author

/ci

Comment on lines +59 to +143
async converse({ messages, conversationId, agentId }: ConverseParams): Promise<ConverseResponse> {
  const callConverseApi = async (): Promise<ConverseResponse> => {
    const response = await this.fetch('/api/agent_builder/converse', {
      method: 'POST',
      version: '2023-10-31',
      body: JSON.stringify({
        agent_id: agentId,
        connector_id: this.connectorId,
        conversation_id: conversationId,
        input: messages[messages.length - 1].message,
      }),
    });

    const chatResponse = response as {
      conversation_id: string;
      trace_id?: string;
      steps: Step[];
      response: { message: string };
      model_usage?: ModelUsageStats;
    };

    const {
      conversation_id: conversationIdFromResponse,
      response: latestResponse,
      steps,
      trace_id: traceId,
      model_usage: modelUsage,
    } = chatResponse;

    return {
      conversationId: conversationIdFromResponse,
      messages: [...messages, latestResponse],
      steps,
      traceId,
      modelUsage,
      errors: [],
    };
  };

  try {
    return await pRetry(callConverseApi, {
      retries: RETRIES,
      minTimeout: MIN_TIMEOUT,
      onFailedAttempt: (error) => {
        const isLastAttempt = error.retriesLeft === 0;

        if (isLastAttempt) {
          this.log.error(
            new Error(`Failed to call converse API after ${error.attemptNumber} attempts`, {
              cause: error,
            })
          );
        } else {
          this.log.warning(
            new Error(`Converse API call failed on attempt ${error.attemptNumber}; retrying...`, {
              cause: error,
            })
          );
        }
      },
    });
  } catch (error) {
    this.log.error('Error occurred while calling converse API');
    return {
      conversationId,
      steps: [],
      messages: [
        ...messages,
        {
          message:
            'This question could not be answered as an internal error occurred. Please try again.',
        },
      ],
      errors: [
        {
          error: {
            message: error instanceof Error ? error.message : 'Unknown error',
            stack: error instanceof Error ? error.stack : undefined,
          },
          type: 'error',
        },
      ],
    };
  }
}
Contributor

🟢 Low evals/chat_client.ts:59

Accessing messages[messages.length - 1].message throws TypeError: Cannot read property 'message' of undefined when messages is an empty array. The Messages type permits empty arrays, and there's no validation before accessing the last element. Consider adding a check for empty messages before the API call, or document the precondition if callers are expected to always provide at least one message.

   async converse({ messages, conversationId, agentId }: ConverseParams): Promise<ConverseResponse> {
+    if (messages.length === 0) {
+      throw new Error('messages array must not be empty');
+    }
+
     const callConverseApi = async (): Promise<ConverseResponse> => {
🤖 Copy this AI Prompt to have your agent fix this:
In file x-pack/solutions/security/plugins/security_solution/server/agent_builder/evals/chat_client.ts around lines 59-143:

Accessing `messages[messages.length - 1].message` throws `TypeError: Cannot read property 'message' of undefined` when `messages` is an empty array. The `Messages` type permits empty arrays, and there's no validation before accessing the last element. Consider adding a check for empty `messages` before the API call, or document the precondition if callers are expected to always provide at least one message.

Evidence trail:
x-pack/solutions/security/plugins/security_solution/server/agent_builder/evals/chat_client.ts at REVIEWED_COMMIT:
- Line 16: `export type Messages = { message: string }[];` (permits empty arrays)
- Line 56: `async converse({ messages, conversationId, agentId }: ConverseParams)` (no validation)
- Line 65: `input: messages[messages.length - 1].message` (unsafe access)
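
Besides the runtime guard suggested above, the precondition can also be expressed in the type. The following is a sketch with hypothetical names (`Message`, `NonEmptyMessages`, `lastMessageText`), not the eval client's code: a tuple type rejects empty array literals at compile time, and a runtime check covers arrays whose length is only known when the code runs.

```typescript
type Message = { message: string };

// Tuple type with at least one element: `[]` is a compile-time error here.
type NonEmptyMessages = [Message, ...Message[]];

// Returns the text of the last message, or throws if there is none.
function lastMessageText(messages: Message[]): string {
  const last = messages[messages.length - 1];
  if (last === undefined) {
    throw new Error('messages array must not be empty');
  }
  return last.message;
}

const conversation: NonEmptyMessages = [
  { message: 'triage this alert' },
  { message: 'summarize the verdict' },
];

console.log(lastMessageText(conversation)); // "summarize the verdict"
```

Throwing before the API call keeps the failure out of the retry loop, so an empty input is reported once instead of being retried as if it were a transient error.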

@elasticmachine
Contributor

elasticmachine commented Mar 25, 2026

💔 Build Failed

Failed CI Steps

Metrics [docs]

‼️ ERROR: metrics for 7a954bc were not reported

History
