
[Observability] Migrate Observability Agent to modular Agent Builder skills #255706

Closed
patrykkopycinski wants to merge 2 commits into elastic:main from patrykkopycinski:o11y-dashboard-skill


@patrykkopycinski
Contributor

Summary

Decomposes the monolithic observability.agent into three focused, modular skills for the Elastic AI Agent. This mirrors the pattern established in #255697 (Security skills migration) and enables on-demand capability loading with reduced token overhead.

New Skills

| Skill | Purpose | Registry Tools (7 each) |
| --- | --- | --- |
| service-investigation | APM service metrics, distributed traces, runtime metrics (JVM/Go/.NET), service topology | get_services, get_trace_metrics, get_traces, get_runtime_metrics, get_service_topology + 2 platform |
| log-analysis | Log pattern discovery, log rate analysis, change point detection, field discovery | get_log_groups, run_log_rate_analysis, get_log_change_points, get_index_info + 3 platform |
| infrastructure-alerting | Host metrics, alert triage, ML anomaly detection, metric/trace change points | get_hosts, get_alerts, get_anomaly_detection_jobs, get_metric_change_points, get_trace_change_points + 2 platform |
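
The tool counts in the table can be checked mechanically. The sketch below is illustrative only: the tool names are copied from the table, but the `SkillToolSet` shape and the helper functions are assumptions, not code from this PR. Platform tools are kept as a count because the description does not name them.

```typescript
// Sketch only: tool names come from the table above; the data shape is assumed.
interface SkillToolSet {
  o11yTools: string[];
  platformToolCount: number; // platform tools are not named in the PR description
}

const skillTools: Record<string, SkillToolSet> = {
  'service-investigation': {
    o11yTools: [
      'get_services',
      'get_trace_metrics',
      'get_traces',
      'get_runtime_metrics',
      'get_service_topology',
    ],
    platformToolCount: 2,
  },
  'log-analysis': {
    o11yTools: [
      'get_log_groups',
      'run_log_rate_analysis',
      'get_log_change_points',
      'get_index_info',
    ],
    platformToolCount: 3,
  },
  'infrastructure-alerting': {
    o11yTools: [
      'get_hosts',
      'get_alerts',
      'get_anomaly_detection_jobs',
      'get_metric_change_points',
      'get_trace_change_points',
    ],
    platformToolCount: 2,
  },
};

// Total tools exposed by one skill (observability-specific plus platform).
function totalToolCount(set: SkillToolSet): number {
  return set.o11yTools.length + set.platformToolCount;
}

// Distinct observability-specific tools across all skills.
function distinctO11yTools(sets: Record<string, SkillToolSet>): string[] {
  const seen: string[] = [];
  for (const name of Object.keys(sets)) {
    for (const tool of sets[name].o11yTools) {
      if (seen.indexOf(tool) === -1) seen.push(tool);
    }
  }
  return seen;
}

// Observability tools claimed by more than one skill (expected to be empty).
function toolsSharedAcrossSkills(sets: Record<string, SkillToolSet>): string[] {
  const counts: Record<string, number> = {};
  for (const name of Object.keys(sets)) {
    for (const tool of sets[name].o11yTools) {
      counts[tool] = (counts[tool] || 0) + 1;
    }
  }
  return Object.keys(counts).filter((tool) => counts[tool] > 1);
}
```

With these inputs, every skill totals 7 tools and the three skills together cover 14 distinct observability tools (5 + 4 + 5), with none shared between skills.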

Key Changes

  • Extended SkillsDirectoryStructure with observability sub-directories (services, logs, infrastructure)
  • Created 3 SkillDefinition registrations with curated tool sets and rich skill content
  • Each skill includes referenced content with investigation workflow templates
  • Wired skill registration into ObservabilityAgentBuilderPlugin.setup()
  • All 14 observability-specific tools are covered across the 3 skills
  • Added comprehensive unit tests for skill validation and cross-skill uniqueness
  • Added eval suite for skill activation and tool selection validation
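
For readers unfamiliar with the pattern, here is a rough sketch of what a registry-tools-only skill and its registration during setup could look like. Everything in it is hypothetical: `SkillDefinitionSketch`, `SkillRegistrySketch`, `setupSketch`, the `basePath` value, and the placeholder content are stand-ins, and the real `SkillDefinition` type and registration API in Agent Builder may differ.

```typescript
// Hypothetical shapes for illustration; not the actual Agent Builder types.
interface SkillDefinitionSketch {
  id: string;
  basePath: string; // assumed to correspond to a SkillsDirectoryStructure sub-directory
  description: string;
  content: string; // skill instructions shown to the agent
  getRegistryTools: () => string[]; // IDs of tools that already exist in the tool registry
}

// Minimal in-memory registry that rejects duplicate skill IDs.
class SkillRegistrySketch {
  private readonly skills: Record<string, SkillDefinitionSketch> = {};

  register(skill: SkillDefinitionSketch): void {
    if (Object.prototype.hasOwnProperty.call(this.skills, skill.id)) {
      throw new Error('Skill "' + skill.id + '" is already registered');
    }
    this.skills[skill.id] = skill;
  }

  listIds(): string[] {
    return Object.keys(this.skills);
  }
}

const logAnalysisSkill: SkillDefinitionSketch = {
  id: 'log-analysis',
  basePath: 'observability/logs', // assumed path, not taken from the PR diff
  description:
    'Log pattern discovery, log rate analysis, change point detection, field discovery',
  content: 'Placeholder for the investigation workflow template.',
  // Only the observability-specific tools are listed; the 3 platform tools
  // are omitted because the PR description does not name them.
  getRegistryTools: () => [
    'get_log_groups',
    'run_log_rate_analysis',
    'get_log_change_points',
    'get_index_info',
  ],
};

// Stand-in for the wiring done in the plugin's setup phase.
function setupSketch(registry: SkillRegistrySketch, skills: SkillDefinitionSketch[]): void {
  for (const skill of skills) {
    registry.register(skill);
  }
}

const registry = new SkillRegistrySketch();
setupSketch(registry, [logAnalysisSkill]);
```

The skill declares no inline tools; it only points at tool IDs that are assumed to be registered elsewhere, which is what keeps each definition small.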

Architecture Notes

  • Skills use only getRegistryTools() — no inline tools needed (all O11y tools are already registry tools)
  • The existing observability.agent continues to work alongside skills
  • Dashboard already has a dashboard-management skill — no migration needed there
  • Skills reuse the same rich instruction patterns from the O11y agent (investigation workflow, reasoning principles, metric formats)

Test Plan

  • Unit tests pass for all 3 skills (schema validation, content, tool counts, cross-skill uniqueness)
  • All 14 O11y tools covered across the 3 skills
  • Eval suite validates correct skill activation for O11y queries
  • Skills appear in Agent Builder skill listing in Observability spaces
  • Service performance queries activate the service-investigation skill
  • Log investigation queries activate the log-analysis skill
  • Infrastructure/alert queries activate the infrastructure-alerting skill
  • Non-O11y queries (security, dashboard) do NOT activate O11y skills
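
The activation expectations above can be written down as data. The following is a minimal sketch, not the eval suite from this PR: the queries are made-up examples, and `SkillSelector` is a hypothetical stand-in for whatever drives skill selection in the real agent.

```typescript
// Hypothetical eval-case shape; queries are illustrative, not from the PR's eval suite.
type SkillId = 'service-investigation' | 'log-analysis' | 'infrastructure-alerting';

interface ActivationCase {
  query: string;
  expectedSkill: SkillId | null; // null means no O11y skill should activate
}

const activationCases: ActivationCase[] = [
  { query: 'Why is latency high on the checkout service?', expectedSkill: 'service-investigation' },
  { query: 'Find unusual patterns in the application logs', expectedSkill: 'log-analysis' },
  { query: 'Which hosts have active alerts right now?', expectedSkill: 'infrastructure-alerting' },
  { query: 'Investigate this suspicious login attempt', expectedSkill: null },
  { query: 'Create a dashboard for sales data', expectedSkill: null },
];

// Stand-in for the system under test: maps a query to the skill it activates.
type SkillSelector = (query: string) => SkillId | null;

// Returns the cases where the selected skill differs from the expected one.
function findActivationMismatches(
  cases: ActivationCase[],
  select: SkillSelector
): ActivationCase[] {
  return cases.filter((c) => select(c.query) !== c.expectedSkill);
}
```

Passing the selector in as a function keeps the check independent of how activation is implemented, so the same cases could be run against a stub in unit tests or against the agent in an eval run.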

…skills

Decomposes the monolithic observability.agent into three focused skills
for the Elastic AI Agent, enabling on-demand capability loading and
reducing token overhead per conversation turn.

New skills:
- **service-investigation**: APM service metrics, distributed traces,
  runtime metrics (JVM/Go/.NET), and service topology mapping
- **log-analysis**: Log pattern discovery, log rate analysis, change
  point detection, and index/field schema discovery
- **infrastructure-alerting**: Host metrics (CPU/memory/disk), alert
  triage, ML anomaly detection jobs, and metric/trace change points

Key changes:
- Extend SkillsDirectoryStructure with observability sub-directories
  (services, logs, infrastructure)
- Create three SkillDefinition registrations with 7 tools each,
  covering all 14 observability-specific tools
- Wire skill registration into ObservabilityAgentBuilderPlugin.setup()
- Add comprehensive unit tests for skill validation and cross-skill
  uniqueness
- Add eval suite for skill activation and tool selection validation
@patrykkopycinski
Contributor Author

/ci

@elasticmachine
Contributor

🤖 Jobs for this PR can be triggered through checkboxes. 🚧

ℹ️ To trigger the CI, please tick the checkbox below 👇

  • Click to trigger kibana-pull-request for this PR!
  • Click to trigger kibana-deploy-project-from-pr for this PR!
  • Click to trigger kibana-deploy-cloud-from-pr for this PR!
  • Click to trigger kibana-entity-store-performance-from-pr for this PR!
  • Click to trigger kibana-storybooks-from-pr for this PR!

@elasticmachine
Contributor

elasticmachine commented Mar 3, 2026

💔 Build Failed

Failed CI Steps

Metrics [docs]

✅ unchanged

History


3 participants