Skip to content

Add kibana.alert.grouping field to SLO burn rate rule#228754

Merged
mgiota merged 19 commits intoelastic:mainfrom
mgiota:slo_burn_rate_rule_grouping
Jul 29, 2025
Merged

Add kibana.alert.grouping field to SLO burn rate rule#228754
mgiota merged 19 commits intoelastic:mainfrom
mgiota:slo_burn_rate_rule_grouping

Conversation

@mgiota
Copy link
Copy Markdown
Contributor

@mgiota mgiota commented Jul 21, 2025

Fixes #224887
Fixes #219996

Acceptance criteria

  • Add a dynamic kibana.alert.grouping field
  • Add a dynamic template
dynamicTemplates: [
      {
        strings_as_keywords: {
          path_match: 'kibana.alert.grouping.*',
          match_mapping_type: 'string',
          mapping: {
            type: 'keyword',
            ignore_above: 1024,
          },
        },
      },
    ],
  • Remove grouping from alert state here (we had previously saved grouping in alert state to use it in alert recovery context due to absence of kibana.alertg.grouping field)
  • In the context use alert document instead of alert state. It should be grouping: recoveredAlert.hit?.[ALERT_GROUPING];

How to test

  • Generate some data node x-pack/scripts/data_forge.js --events-per-cycle 50 --lookback now --dataset fake_stack --install-kibana-assets --event-template bad
  • Create an SLO and group by one or more fields for example host.name
Screenshot 2025-07-22 at 17 46 18
  • In Dev Tools run GET .alerts-observability.slo.alerts-default/_search and verify that in the generated alert documents, kibana.alert.grouping field is present
Screenshot 2025-07-22 at 17 50 50
  • Recover the alert and verify that the kibana.alert.grouping field remains in the recovered alert document
  • Inspect the index mappings and ensure the dynamic mapping for kibana.alert.grouping has been applied correctly
  • Confirm that the alert context contains the grouping information.

@github-actions github-actions bot added the author:obs-ux-management PRs authored by the obs ux management team label Jul 21, 2025
@mgiota mgiota force-pushed the slo_burn_rate_rule_grouping branch 5 times, most recently from 6fd38f8 to 9ab1ffc Compare July 21, 2025 21:42
@mgiota mgiota changed the title temp Add kibana.alert.grouping field to SLO burn rate rule Jul 22, 2025
@mgiota mgiota added Team:actionable-obs Formerly "obs-ux-management", responsible for SLO, o11y alerting, significant events, & synthetics. v9.2.0 backport:version Backport to applied version labels release_note:skip Skip the PR/issue when compiling release notes labels Jul 22, 2025
dynamicTemplates: [
{
strings_as_keywords: {
path_match: 'kibana.alert.grouping.*',
Copy link
Copy Markdown
Contributor Author

@mgiota mgiota Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benakansara I took the code snippet for the dynamic template from the meta ticket. Wondering if it is OK to use the ALERT_GROUPING variable here ${ALERT_GROUPING}.*?

@mgiota mgiota force-pushed the slo_burn_rate_rule_grouping branch from 9ab1ffc to 9a48c93 Compare July 22, 2025 10:08
@mgiota mgiota force-pushed the slo_burn_rate_rule_grouping branch from 9a48c93 to a7b4beb Compare July 22, 2025 13:34
@mgiota
Copy link
Copy Markdown
Contributor Author

mgiota commented Jul 22, 2025

@elasticmachine merge upstream

@mgiota mgiota marked this pull request as ready for review July 22, 2025 14:53
@mgiota mgiota requested a review from a team as a code owner July 22, 2025 14:53
@elasticmachine
Copy link
Copy Markdown
Contributor

Pinging @elastic/obs-ux-management-team (Team:obs-ux-management)

@mgiota mgiota requested a review from benakansara July 22, 2025 15:12
@mgiota mgiota self-assigned this Jul 22, 2025
Copy link
Copy Markdown
Contributor

@kdelemme kdelemme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally, LGTM

Image

WindowSchema,
} from './types';

// should I add ALERT_GROUPING here?
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the alertClient does not complain i think it's fine to keep it as it is

Copy link
Copy Markdown
Contributor

@kdelemme kdelemme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm getting a bunch of errors in the console:


[2025-07-22T14:39:53.950-04:00][ERROR][plugins.alerting] Error writing alerts for slo.rules.burnRate:e70b82f5-a61c-4f12-b8e6-1248951060a7 'test Burn Rate rule': 19 successful, 0 conflicts, 1 errors: [eec9c343-1e02-4ded-9aa8-140d77ae0c3d]: version conflict, document already exists (current version [9])
[2025-07-22T14:39:53.952-04:00][ERROR][plugins.alerting] Error writing alerts for slo.rules.burnRate:95855ca9-65f6-4d65-b69a-0e65c958891d 'two groups Burn Rate rule': 2 successful, 0 conflicts, 35 errors: [8138a2d8-6ebd-4c39-a748-b16595ca287d]: version conflict, document already exists (current version [9]); [5f3bb6e9-b40f-4da1-953d-d9a4e37b746d]: version conflict, document already exists (current version [9]); [310265db-e87c-45bd-a85b-dbdc808f0181]: version conflict, document already exists (current version [9]); [f42a865c-b1e7-4cd7-acbf-c7d5f69727b8]: version conflict, document already exists (current version [9]); [6ac12141-2e97-4b81-8e54-8ebf3ed4f85b]: version conflict, document already exists (current version [9]); [909a5eb3-bbe6-45b1-b047-a0c26eced028]: version conflict, document already exists (current version [9]); [4ce69e8c-6ad1-4330-9319-0f3f9498d81b]: version conflict, document already exists (current version [9]); [1084c089-f8d1-43f1-a357-ae277eaac8d3]: version conflict, document already exists (current version [9]); [baa48408-18cf-4c7f-a512-627e0c181aa1]: version conflict, document already exists (current version [9]); [dee70d1a-5abf-4d8f-be2b-4e00d53b314f]: version conflict, document already exists (current version [9]); [7f1bb6a1-f942-456c-9972-558bece13b74]: version conflict, document already exists (current version [9]); [dba6bf8e-d0af-4348-ae13-937695045423]: version conflict, document already exists (current version [9]); [7b61f7e9-830f-4e73-8859-d453279eef67]: version conflict, document already exists (current version [9]); [5c067509-82ad-41bf-a6f9-9bc7cd1f5f8b]: version conflict, document already exists (current version [9]); [9bae022c-11bf-4331-9b33-8beb37bc5466]: version conflict, document already exists (current version [9]); [41fd5155-5c6a-4a4a-a3d7-8d99b3530c7f]: version conflict, document already exists (current version [9]); [95d158e7-eacd-4ae0-851d-418eaf59f2e1]: version conflict, document already exists (current version [9]); [05dd8f30-668d-41c6-84a4-db8550396211]: version conflict, document already exists (current version [9]); [d8d9733f-60f0-455b-8829-f35ec3bdd803]: version conflict, document already exists (current version [9]); [f1efd443-98d2-4f62-959b-4ac465c92e31]: version conflict, document already exists (current version [9]); [70ce1a36-ea63-46e1-97fe-fa5ccb07c2f6]: version conflict, document already exists (current version [9]); [32a8fed4-7cd1-44f5-8e9b-39f747a2d1b3]: version conflict, document already exists (current version [9]); [802f081e-9731-4fc8-ab25-1f76894b80aa]: version conflict, document already exists (current version [9]); [0c5aaea1-9423-4d53-8233-0392c78b506b]: version conflict, document already exists (current version [9]); [022d4b6c-a7c4-40bf-bf27-85f225bf7128]: version conflict, document already exists (current version [9]); [523fb96d-05a8-43f9-a970-e9e5eed0caba]: version conflict, document already exists (current version [9]); [9bdea984-4d7b-4aec-9343-367cfab43838]: version conflict, document already exists (current version [9]); [576b0d0c-b483-4a8f-b4b3-9e095c9c606c]: version conflict, document already exists (current version [9]); [f3cef280-c10b-4bb0-bcc2-5d8e75931c35]: version conflict, document already exists (current version [9]); [07ab7e36-3976-404a-aaca-c0f8c0a73baf]: version conflict, document already exists (current version [9]); [12ad4959-6047-45ed-80e2-c8ff09a599c0]: version conflict, document already exists (current version [9]); [ca68b3a9-88b5-4afc-83cd-d30f249c58c7]: version conflict, document already exists (current version [9]); [7801f1e6-818b-41d4-9eb9-d58f7acf43b6]: version conflict, document already exists (current version [9]); [3dae69e0-cf3b-4856-b25f-e1ada29af634]: version conflict, document already exists (current version [9]); [35196594-3dd1-4eeb-ac02-9683775b1eb0]: version conflict, document already exists (current version [9])

@mgiota mgiota force-pushed the slo_burn_rate_rule_grouping branch 3 times, most recently from 8db889b to 3d8d13b Compare July 23, 2025 09:36
@mgiota
Copy link
Copy Markdown
Contributor Author

mgiota commented Jul 24, 2025

@kdelemme I am now working on another branch and I got the same version conflict error you got:

[2025-07-24T13:23:16.028+03:00][ERROR][plugins.alerting] Error writing alerts for slo.rules.burnRate:af2f8c2b-bcd5-4024-81e4-6d8551651c7d 'SLO with 2 group by Burn Rate rule': 0 successful, 0 conflicts, 20 errors: [021823ce-cd95-469f-a3c3-32c6919a3d32]: version conflict, document already exists (current version [126]); [daf086af-6bca-4002-8a78-68bef25b1ab3]: version conflict, document already exists (current version [126]); [fd03f922-7f21-448e-a96f-fbe3996138da]: version conflict, document already exists (current version [126]); [cc58bc45-1dee-4e1d-a19d-b99d4d6d3ffa]: version conflict, document already exists (current version [126]); [690ca0f8-a91c-4abf-890b-5c2b953469d8]: version conflict, document already exists (current version [126]); [f027e0ea-af3d-45a4-a2c4-db3491412508]: version conflict

In praxis it shouldn't happen, right? Since users will update to the latest changes and not the other way around as I was able to reproduce switching from this branch to another one that doesn't contain the changes.

@benakansara
Copy link
Copy Markdown
Contributor

@kdelemme I am now working on another branch and I got the same version conflict error you got:

[2025-07-24T13:23:16.028+03:00][ERROR][plugins.alerting] Error writing alerts for slo.rules.burnRate:af2f8c2b-bcd5-4024-81e4-6d8551651c7d 'SLO with 2 group by Burn Rate rule': 0 successful, 0 conflicts, 20 errors: [021823ce-cd95-469f-a3c3-32c6919a3d32]: version conflict, document already exists (current version [126]); [daf086af-6bca-4002-8a78-68bef25b1ab3]: version conflict, document already exists (current version [126]); [fd03f922-7f21-448e-a96f-fbe3996138da]: version conflict, document already exists (current version [126]); [cc58bc45-1dee-4e1d-a19d-b99d4d6d3ffa]: version conflict, document already exists (current version [126]); [690ca0f8-a91c-4abf-890b-5c2b953469d8]: version conflict, document already exists (current version [126]); [f027e0ea-af3d-45a4-a2c4-db3491412508]: version conflict

In praxis it shouldn't happen, right? Since users will update to the latest changes and not the other way around as I was able to reproduce switching from this branch to another one that doesn't contain the changes.

this is a known issue: #190376

@mgiota mgiota requested a review from benakansara July 24, 2025 23:08
@mgiota
Copy link
Copy Markdown
Contributor Author

mgiota commented Jul 24, 2025

@elasticmachine merge upstream

@mgiota
Copy link
Copy Markdown
Contributor Author

mgiota commented Jul 25, 2025

@elasticmachine merge upstream

@mgiota
Copy link
Copy Markdown
Contributor Author

mgiota commented Jul 25, 2025

@elasticmachine merge upstream

@mgiota
Copy link
Copy Markdown
Contributor Author

mgiota commented Jul 28, 2025

@elasticmachine merge upstream

@elasticmachine
Copy link
Copy Markdown
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

cc @mgiota

Copy link
Copy Markdown
Contributor

@benakansara benakansara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Contributor

@JiaweiWu JiaweiWu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let a nitpick comment

};

export const getFormattedGroups = (grouping?: Record<string, string>): Group[] | undefined => {
export const getFormattedGroups = (grouping?: Record<string, unknown>): Group[] | undefined => {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: Should we give this a type instead of unknown? Especially since we're casting to a string.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JiaweiWu I did this refactoring in this commit to address this PR feedback. I will merge as is and in a follow up PR I can try to use a new getFlattenGrouping instead and revert this change.

@mgiota mgiota merged commit 9aed805 into elastic:main Jul 29, 2025
12 checks passed
@kibanamachine kibanamachine added the backport missing Added to PRs automatically when the are determined to be missing a backport. label Jul 31, 2025
@kibanamachine
Copy link
Copy Markdown
Contributor

Friendly reminder: Looks like this PR hasn’t been backported yet.
To create automatically backports add a backport:* label or prevent reminders by adding the backport:skip label.
You can also create backports manually by running node scripts/backport --pr 228754 locally
cc: @mgiota

1 similar comment
@kibanamachine
Copy link
Copy Markdown
Contributor

Friendly reminder: Looks like this PR hasn’t been backported yet.
To create automatically backports add a backport:* label or prevent reminders by adding the backport:skip label.
You can also create backports manually by running node scripts/backport --pr 228754 locally
cc: @mgiota

@cesco-f cesco-f added backport:skip This PR does not require backporting and removed backport missing Added to PRs automatically when the are determined to be missing a backport. backport:version Backport to applied version labels labels Aug 1, 2025
delanni pushed a commit to delanni/kibana that referenced this pull request Aug 5, 2025
Fixes elastic#224887
Fixes elastic#219996

## Acceptance criteria
- [x] Add a dynamic `kibana.alert.grouping` field
- [x] Add a dynamic template
```
dynamicTemplates: [
      {
        strings_as_keywords: {
          path_match: 'kibana.alert.grouping.*',
          match_mapping_type: 'string',
          mapping: {
            type: 'keyword',
            ignore_above: 1024,
          },
        },
      },
    ],
```
- [x] Remove `grouping` from alert state
[here](https://github.com/elastic/kibana/blob/19802227747ee5e7022af2f5f8b87cba9f7eff1c/x-pack/solutions/observability/plugins/slo/server/lib/rules/slo_burn_rate/executor.ts#L171)
(we had previously saved grouping in alert state to use it in alert
recovery context due to absence of kibana.alertg.grouping field)
- [x] In the
[context](https://github.com/elastic/kibana/blob/19802227747ee5e7022af2f5f8b87cba9f7eff1c/x-pack/solutions/observability/plugins/slo/server/lib/rules/slo_burn_rate/executor.ts#L238)
use alert document instead of alert state. It should be `grouping:
recoveredAlert.hit?.[ALERT_GROUPING];`

## How to test

- Generate some data `node x-pack/scripts/data_forge.js
--events-per-cycle 50 --lookback now --dataset fake_stack
--install-kibana-assets --event-template bad`
- Create an SLO and group by one or more fields for example `host.name`

<img width="600" height="467" alt="Screenshot 2025-07-22 at 17 46 18"
src="https://github.com/user-attachments/assets/2d589551-1d2b-4139-9cff-dbd4f18cc020"
/>

- In Dev Tools run `GET
.alerts-observability.slo.alerts-default/_search` and verify that in the
generated alert documents, `kibana.alert.grouping` field is present

<img width="355" height="125" alt="Screenshot 2025-07-22 at 17 50 50"
src="https://github.com/user-attachments/assets/deabb5e0-d1ad-4460-bf09-45b5e2af2a59"
/>

- Recover the alert and verify that the `kibana.alert.grouping` field
remains in the recovered alert document
- Inspect the index mappings and ensure the dynamic mapping for
kibana.alert.grouping has been applied correctly
- Confirm that the alert context contains the grouping information.

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

author:obs-ux-management PRs authored by the obs ux management team backport:skip This PR does not require backporting release_note:skip Skip the PR/issue when compiling release notes Team:actionable-obs Formerly "obs-ux-management", responsible for SLO, o11y alerting, significant events, & synthetics. v9.2.0

Projects

None yet

7 participants