Skip to content

[SLO]: register alerts schema embeddable#256570

Merged
mgiota merged 55 commits intoelastic:mainfrom
mgiota:slo_alerts_schema_registration
Apr 2, 2026
Merged

[SLO]: register alerts schema embeddable#256570
mgiota merged 55 commits intoelastic:mainfrom
mgiota:slo_alerts_schema_registration

Conversation

@mgiota
Copy link
Copy Markdown
Contributor

@mgiota mgiota commented Mar 7, 2026

Fixes #252816
Fixes #257868

Legacy alerts embeddable saved state contained showAllGroupByInstances and slos (id, instanceId, name, groupBy).

As part of this PR we do following changes (More context on this issue):

  • remove showAllGroupByInstances and keep slo_instance_id as the sole source of truth
  • remove name and groupBy` from schema. An SLO with updated name for example can lead to stale UI representation in the dashboard app.

Acceptance criteria

  • schemas are registered server side
  • transforms are registered, where legacy stored shape is converted to REST API shape
  • REST API should be snake_case
  • public embeddable code should be updated to use new snake_case shape
  • types should be derived from schemas, duplicative type declarations should be removed
  • ensure backwards compatibility: confirm previously stored SLO error alert embeddables continue to work as expected
  • drilldown support
  • register SLO embeddable transform out in embeddablePlugin.registerLegacyURLTransform: Dashboard allows users to share unsaved dashboard changes. This feature stores embeddable state in URLs.

Testing OAS documentation

Testing validation

Try to create a few dashboards with alerts embeddable through API calls in Kibana dev tools and verify validation and creation work as expected.

POST kbn:/api/dashboards?apiVersion=1
{
  "title": "Testing empty alerts embeddable through API call",
  "panels": [
    {
      "config": {},
      "grid": {
        "x": 1,
        "y": 0
      },
      "type": "slo_alerts"
    }
  ]
}
POST kbn:/api/dashboards?apiVersion=1
{
  "title": "Test a group by SLO with all instances",
  "panels": [
    {
      "config": {
        "slos": [
            {
                "slo_id": "6b523190-9da5-4bbc-8c37-fa3f3a3f5046",
                "slo_instance_id": "*"
            }
        ]
      },
      "grid": {
        "x": 1,
        "y": 0
      },
      "type": "slo_alerts"
    }
  ]
}
POST kbn:/api/dashboards?apiVersion=1
{
  "title": "Test a group by SLO with a specific instance",
  "panels": [
    {
      "config": {
        "slos": [
            {
                "slo_id": "6b523190-9da5-4bbc-8c37-fa3f3a3f5046",
                "slo_instance_id": "something"
            }
        ]
      },
      "grid": {
        "x": 1,
        "y": 0
      },
      "type": "slo_alerts"
    }
  ]
}

Reference

Regarding removal of showAllGroupByInstances we explored two approaches, 1) keep the flag and normalize on load and 2) remove the flag entirely.

Screenshot 2026-03-18 at 11 18 47

I proceeded with Option 2: remove the flag entirely plus the simplest required UI change, where SLO selector support picking all instances (*) for group-by SLOs.

Screenshot 2026-03-18 at 11 44 49

I decided not to proceed with a full UI refactoring implemented in this PR. This change can be done as a follow up after the schema registration PR is merged, if we decide we want to change the way we select SLOs in the alerts embeddable.

Screenshot 2026-03-18 at 11 41 33

@mgiota mgiota self-assigned this Mar 7, 2026
@github-actions github-actions bot added the author:actionable-obs PRs authored by the actionable obs team label Mar 7, 2026
@mgiota mgiota added backport:skip This PR does not require backporting release_note:skip Skip the PR/issue when compiling release notes and removed author:actionable-obs PRs authored by the actionable obs team labels Mar 7, 2026
@mgiota mgiota marked this pull request as ready for review March 13, 2026 15:44
@mgiota mgiota requested a review from a team as a code owner March 13, 2026 15:44
@mgiota mgiota requested a review from a team as a code owner March 13, 2026 15:45
@mgiota mgiota added the v9.4.0 label Mar 13, 2026
@macroscopeapp
Copy link
Copy Markdown
Contributor

macroscopeapp bot commented Mar 27, 2026

Approvability

Verdict: Needs human review

This PR introduces new schema registration infrastructure for SLO alerts embeddables with drilldown support and legacy state migration. Unresolved review comments identify a potential DoS vulnerability (unbounded array in schema) and a bug in kuery generation that would cause incorrect filtering. These substantive issues require human review and resolution.

You can customize Macroscope's approvability policy. Learn more.

@fkanout fkanout self-requested a review March 31, 2026 09:32
});

const AlertsCustomSchema = schema.object({
slos: schema.arrayOf(sloItemSchema, {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodeQL flagged this already but worth calling out explicitly: this array has no maxSize, so a crafty API caller can stuff an arbitrarily large payload into a dashboard panel and blow up memory on the server. Something like maxSize: 100 should be more than enough — I can't imagine a real dashboard panel with 100+ SLOs in a single alerts widget.

Suggested change
slos: schema.arrayOf(sloItemSchema, {
slos: schema.arrayOf(sloItemSchema, {
defaultValue: [],
maxSize: 100,
meta: { description: 'List of SLOs to display alerts for' },

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep maxSize:100 is good, applied in this commit

const handleGoToAlertsClick = () => {
const kuery = slos
.map((slo) => `(slo.id:"${slo.id}" and slo.instanceId:"${slo.instanceId}")`)
.map((slo) => `(slo.id:"${slo.slo_id}" and slo.instanceId:"${slo.slo_instance_id}")`)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that slo_instance_id can be '*' for "all instances", this kuery produces slo.instanceId:"*" which matches the literal string rather than all instances. For grouped SLOs the "Go to Alerts" link will filter to nothing useful. Might be worth skipping the instanceId clause when it's *:

const kuery = slos
  .map((slo) => {
    const idFilter = `slo.id:"${slo.slo_id}"`;
    return slo.slo_instance_id === ALL_VALUE
      ? idFilter
      : `(${idFilter} and slo.instanceId:"${slo.slo_instance_id}")`;
  })
  .join(' or ');

This was probably broken before too (the old toggle overrode instance filtering in the table query but the drilldown link always used the specific instanceId), but now's a good time to clean it up since the semantics are clearer.

Copy link
Copy Markdown
Contributor Author

@mgiota mgiota Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True that was broken before, but yep good time to clean it up. Here's the commit that fixes it

Copy link
Copy Markdown
Contributor

@fkanout fkanout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally and LGTM, I left some comments that need to be addressed

Image

@mgiota mgiota force-pushed the slo_alerts_schema_registration branch from 503f96d to 2756899 Compare April 1, 2026 09:28
mgiota and others added 2 commits April 1, 2026 14:07
…alerts/transforms/transform_alerts_out.ts

Co-authored-by: Faisal Kanout <faisal@kanout.com>
@mgiota mgiota requested a review from fkanout April 1, 2026 12:07
@mgiota
Copy link
Copy Markdown
Contributor Author

mgiota commented Apr 1, 2026

@fkanout Thanks for the review, I applied the feedback. PR is ready for review again!

@elasticmachine
Copy link
Copy Markdown
Contributor

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
slo 1315 1317 +2

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
slo 1.1MB 1.1MB -236.0B

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
slo 36.5KB 36.7KB +234.0B
Unknown metric groups

async chunk count

id before after diff
slo 37 39 +2

History

cc @mgiota

Copy link
Copy Markdown
Contributor

@nreese nreese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kibana-presentation changes LGTM
code review only

@mgiota mgiota merged commit 69b9590 into elastic:main Apr 2, 2026
18 checks passed
mbondyra added a commit to mbondyra/kibana that referenced this pull request Apr 2, 2026
…heck

* commit 'af66aadafa7470ca8ba3e3edd3793bde81fa4596': (31 commits)
  [Scout] Update test config manifests (elastic#260850)
  [SLO]: register alerts schema embeddable (elastic#256570)
  [Discover][Flyout] Update overview fields table with new prop headerVisibility set to false (elastic#260692)
  [AiButton/Security] Migrate ai-related buttons to use custom styles (elastic#259847)
  [One Workflow] Fix connector step icons falling back to generic plugs in YAML editor (elastic#260785)
  [Agent Builder] Dashboard skill: Guard against editing non-ESQL based panels (elastic#260714)
  Security quality gate Cypress cleanup - Periodic Pipeline (elastic#260820)
  [Search] Deprecate search indices in favour of index management (elastic#260210)
  Upgrade dependency @elastic/charts to v71.4.0 (elastic#260593)
  [Security Solution] [HDQ]: integration-based targeting and descriptor versioning (elastic#258418)
  docs(saved-objects): consolidate docs and document scoped vs system client (elastic#260743)
  Fix observability UIAM config and add CPS observability variant (elastic#260485)
  [Security Solution] Add "matched_indices_count" rule execution metric (elastic#259938)
  [SigEvents] Add callout with working promote action. (elastic#260433)
  [Alerting V2] Episode table actions (elastic#260195)
  [Automatic Migration] Add ability to skip Reference Set step in QRadar upload workflow (elastic#259959)
  [Rules] KQL-to-DSL conversion without data view produces incorrect queries for keyword fields for Metric threshold rule (elastic#260046)
  Update dependency lightningcss to v1.32.0 (main) (elastic#259017)
  Update postcss (main) (elastic#255420)
  Migrate server-side apm.addLabels to OTel dual-write helpers (elastic#259619)
  ...
paulinashakirova pushed a commit to paulinashakirova/kibana that referenced this pull request Apr 2, 2026
Fixes elastic#252816
Fixes elastic#257868

Legacy alerts embeddable saved state contained `showAllGroupByInstances`
and `slos` (`id`, `instanceId`, `name`, `groupBy`).

As part of this PR we do following changes (More context on [this
issue](elastic#257868)):
- remove `showAllGroupByInstances` and keep slo_instance_id as the sole
source of truth
- remove `name` and groupBy` from schema. An SLO with updated name for
example can lead to stale UI representation in the dashboard app.


## Acceptance criteria
- schemas are registered server side
- transforms are registered, where legacy stored shape is converted to
REST API shape
- REST API should be snake_case
- public embeddable code should be updated to use new snake_case shape
- types should be derived from schemas, duplicative type declarations
should be removed
- ensure backwards compatibility: confirm previously stored SLO error
alert embeddables continue to work as expected
- drilldown support
- register SLO embeddable transform out in
embeddablePlugin.registerLegacyURLTransform: Dashboard allows users to
share unsaved dashboard changes. This feature stores embeddable state in
URLs.

## Testing OAS documentation
- add `server.oas.enabled: true` to config/kibana.dev.yml
- start kibana with `yarn start --no-base-path`
- Copy paste URL
http://localhost:5601/api/oas?pathStartsWith=/api/dashboard&access=internal&version=1
into your browser.
- View panels schema and verify slo alerts embeddable schema appears in
the list.

## Testing validation
Try to create a few dashboards with alerts embeddable through API calls
in Kibana dev tools and verify validation and creation work as expected.

```
POST kbn:/api/dashboards?apiVersion=1
{
  "title": "Testing empty alerts embeddable through API call",
  "panels": [
    {
      "config": {},
      "grid": {
        "x": 1,
        "y": 0
      },
      "type": "slo_alerts"
    }
  ]
}
```

```
POST kbn:/api/dashboards?apiVersion=1
{
  "title": "Test a group by SLO with all instances",
  "panels": [
    {
      "config": {
        "slos": [
            {
                "slo_id": "6b523190-9da5-4bbc-8c37-fa3f3a3f5046",
                "slo_instance_id": "*"
            }
        ]
      },
      "grid": {
        "x": 1,
        "y": 0
      },
      "type": "slo_alerts"
    }
  ]
}
```

```
POST kbn:/api/dashboards?apiVersion=1
{
  "title": "Test a group by SLO with a specific instance",
  "panels": [
    {
      "config": {
        "slos": [
            {
                "slo_id": "6b523190-9da5-4bbc-8c37-fa3f3a3f5046",
                "slo_instance_id": "something"
            }
        ]
      },
      "grid": {
        "x": 1,
        "y": 0
      },
      "type": "slo_alerts"
    }
  ]
}
```


## Reference

Regarding removal of `showAllGroupByInstances` we explored two
approaches, **1)** keep the flag and normalize on load and **2)** remove
the flag entirely.

<img width="400" height="676" alt="Screenshot 2026-03-18 at 11 18 47"
src="https://github.com/user-attachments/assets/ea6909f6-e98c-4bad-ac66-05c738e54126"
/>

I proceeded with `Option 2: remove the flag entirely` plus the simplest
required UI change, where SLO selector support picking all instances (*)
for group-by SLOs.

<img width="400" height="301" alt="Screenshot 2026-03-18 at 11 44 49"
src="https://github.com/user-attachments/assets/6c0a966b-6481-4381-8608-a8e1afdff97f"
/>

I decided not to proceed with a full UI refactoring implemented in this
[PR](mgiota#8). This change can be done
as a follow up after the schema registration PR is merged, if we decide
we want to change the way we select SLOs in the alerts embeddable.

<img width="400" height="474" alt="Screenshot 2026-03-18 at 11 41 33"
src="https://github.com/user-attachments/assets/710a816f-3c40-42ae-ac65-6f13aac846e7"
/>

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Faisal Kanout <faisal@kanout.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

author:actionable-obs PRs authored by the actionable obs team backport:skip This PR does not require backporting release_note:skip Skip the PR/issue when compiling release notes v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[SLO] Alerts embeddable schema validation [SLO] Register Alerts embeddable schema

6 participants