Skip to content

[OBX-UX-MGMT] Store Alert Muted Status Directly in Alert Documents #240996

Merged
fkanout merged 51 commits intoelastic:mainfrom
fkanout:240514-store-alert-muted-alert-doc
Dec 3, 2025
Merged

[OBX-UX-MGMT] Store Alert Muted Status Directly in Alert Documents #240996
fkanout merged 51 commits intoelastic:mainfrom
fkanout:240514-store-alert-muted-alert-doc

Conversation

@fkanout
Copy link
Copy Markdown
Contributor

@fkanout fkanout commented Oct 28, 2025

Summary

It fixes #240514 by adding a mute field to the alert doc.
This addition happens on mute, mute all, unmute, and unmute all actions. Also, the related tests are updated.

Alert doc

Screenshot 2025-10-28 at 14 07 08

While the rule still have this info

Screenshot 2025-10-28 at 14 38 06

@github-actions github-actions bot added the author:obs-ux-management PRs authored by the obs ux management team label Oct 28, 2025
@fkanout fkanout self-assigned this Oct 30, 2025
@fkanout fkanout marked this pull request as ready for review October 30, 2025 12:08
@fkanout fkanout requested review from a team as code owners October 30, 2025 12:08
@fkanout fkanout added release_note:skip Skip the PR/issue when compiling release notes backport:skip This PR does not require backporting Team:actionable-obs Formerly "obs-ux-management", responsible for SLO, o11y alerting, significant events, & synthetics. v9.3.0 labels Oct 30, 2025
@elasticmachine
Copy link
Copy Markdown
Contributor

Pinging @elastic/obs-ux-management-team (Team:obs-ux-management)

@darnautov darnautov requested a review from ersin-erdal October 30, 2025 16:54
Comment thread x-pack/platform/plugins/shared/alerting/server/plugin.ts Outdated
Comment thread x-pack/platform/plugins/shared/alerting/server/alerts_service/alerts_service.ts Outdated
logger: context.logger,
});
} catch (error) {
context.logger.error(
Copy link
Copy Markdown
Contributor

@dominiqueclarke dominiqueclarke Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Failing to throw here can lead to a situation where one process is successful, while the other process is not. This can lead to out of sync statuses between the rule object and the alert document. I see this pattern in a few other paces as well.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Updated all other alert doc operations; if it fails now, it will throw an error.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Updated all other alert doc operations; if it fails now, it will throw an error.

I don't know if this was about some previous changes.

Doesn't this mean there should be a throw after the log?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this mean there should be a throw after the log?

@adcoelho It seems this the commit for throwing errors: 1cc2334
Do you see anything missing?

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds functionality to update the kibana.alert.muted field in Elasticsearch when muting/unmuting alert instances and rules. The changes introduce new methods in the AlertsService class to perform Elasticsearch updateByQuery operations and integrate these methods into the mute/unmute workflows.

Key changes:

  • Adds new AlertsService methods: muteAlertInstance, unmuteAlertInstance, muteAllAlerts, and unmuteAllAlerts
  • Introduces the ALERT_MUTED field constant and field mapping
  • Updates mute/unmute operations to call the new AlertsService methods to update Elasticsearch
  • Adds comprehensive unit tests covering the new functionality and error handling

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.

Show a summary per file
File Description
x-pack/platform/plugins/shared/alerting/server/alerts_service/alerts_service.ts Implements four new methods to update muted status in Elasticsearch via updateByQuery
x-pack/platform/plugins/shared/alerting/server/application/rule/methods/mute_alert/mute_instance.ts Integrates call to muteAlertInstance when muting individual alert instances
x-pack/platform/plugins/shared/alerting/server/application/rule/methods/unmute_alert/unmute_instance.ts Integrates call to unmuteAlertInstance when unmuting individual alert instances
x-pack/platform/plugins/shared/alerting/server/application/rule/methods/mute_all/mute_all.ts Integrates call to muteAllAlerts when muting all alerts for a rule
x-pack/platform/plugins/shared/alerting/server/application/rule/methods/unmute_all/unmute_all.ts Integrates call to unmuteAllAlerts when unmuting all alerts for a rule
x-pack/platform/plugins/shared/alerting/server/rules_client/tests/mute_instance.test.ts Adds unit tests verifying the mute instance functionality including Elasticsearch operations
x-pack/platform/plugins/shared/alerting/server/application/rule/methods/unmute_alert/unmute_instance.test.ts Adds unit tests verifying the unmute instance functionality including Elasticsearch operations
x-pack/platform/plugins/shared/alerting/server/application/rule/methods/mute_all/mute_all.test.ts Adds unit tests verifying the mute all functionality including Elasticsearch operations
x-pack/platform/plugins/shared/alerting/server/application/rule/methods/unmute_all/unmute_all.test.ts Adds unit tests verifying the unmute all functionality including Elasticsearch operations
src/platform/packages/shared/kbn-rule-data-utils/src/default_alerts_as_data.ts Adds and exports the ALERT_MUTED constant
src/platform/packages/shared/kbn-alerts-as-data-utils/src/field_maps/alert_field_map.ts Adds the ALERT_MUTED field mapping definition

@banderror banderror requested review from a team and rylnd December 1, 2025 11:25
[ALERT_SEVERITY_IMPROVING]: true,
[ALERT_PREVIOUS_ACTION_GROUP]: 'default',
[ALERT_MAINTENANCE_WINDOW_IDS]: [],
// @ts-ignore
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this ignored?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member

@cnasikas cnasikas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGMT! Thanks for addressing the comments.

Copy link
Copy Markdown
Contributor

@rylnd rylnd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good from the Detection Engine side 👍 . I believe this will also help our RBAC effort, since we can now retrieve muted alerts without needing rule access.

@rylnd
Copy link
Copy Markdown
Contributor

rylnd commented Dec 3, 2025

@fkanout once this is merged, can we update src/platform/packages/shared/response-ops/alerts-apis/hooks/use_get_muted_alerts_query.tsx to retrieve muted alerts via this field rather than by the rule? The current behavior is preventing users with alerts-only access from retrieving muted alerts, which I think may be a blocker as we develop alerts-specific RBAC; that seems like the quickest path forward to me, but please let me know what you think.

cc @umbopepato

@kibanamachine
Copy link
Copy Markdown
Contributor

Flaky Test Runner Stats

🟠 Some tests failed. - kibana-flaky-test-suite-runner#9965

[❌] x-pack/platform/test/alerting_api_integration/spaces_only/tests/alerting/group4/config.ts: 23/50 tests passed.

see run history

@kibanamachine
Copy link
Copy Markdown
Contributor

Flaky Test Runner Stats

🟠 Some tests failed. - kibana-flaky-test-suite-runner#9967

[❌] x-pack/platform/test/alerting_api_integration/spaces_only/tests/alerting/group4/config.ts: 21/50 tests passed.

see run history

@kibanamachine
Copy link
Copy Markdown
Contributor

Flaky Test Runner Stats

🟠 Some tests failed. - kibana-flaky-test-suite-runner#9970

[❌] x-pack/platform/test/alerting_api_integration/spaces_only/tests/alerting/group4/config.ts: 75/100 tests passed.

see run history

@fkanout
Copy link
Copy Markdown
Contributor Author

fkanout commented Dec 3, 2025

🟢 Flaky Test Runner Stats with 100/100 tests passed

https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/9970/steps/canvas?sid=019ae445-a831-484c-8565-94c94b406f85

@elasticmachine
Copy link
Copy Markdown
Contributor

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] x-pack/platform/test/alerting_api_integration/security_and_spaces/group8/config.ts / alerting api integration security and spaces enabled Alerts - Group 8 alerts rule gaps update gaps should mark intervals as in_progress immediately after scheduling backfill
  • [job] [logs] Scout Test Run Builder / serverless-security - EUI testing wrapper: EuiComboBox - with multiple selections (pills)
  • [job] [logs] Scout Test Run Builder / with multiple selections (pills)

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
@kbn/rule-data-utils 201 203 +2

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
observability 1.7MB 1.7MB +45.0B
slo 991.5KB 991.5KB +45.0B
synthetics 1.0MB 1.0MB +46.0B
total +136.0B

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
observability 95.7KB 95.7KB +24.0B
slo 33.2KB 33.2KB +24.0B
synthetics 28.8KB 28.8KB +24.0B
timelines 170.6KB 170.7KB +60.0B
total +132.0B
Unknown metric groups

API count

id before after diff
@kbn/rule-data-utils 214 216 +2

History

cc @fkanout @maryam-saeidi

@fkanout
Copy link
Copy Markdown
Contributor Author

fkanout commented Dec 3, 2025

@fkanout once this is merged, can we update src/platform/packages/shared/response-ops/alerts-apis/hooks/use_get_muted_alerts_query.tsx to retrieve muted alerts via this field rather than by the rule? The current behavior is preventing users with alerts-only access from retrieving muted alerts, which I think may be a blocker as we develop alerts-specific RBAC; that seems like the quickest path forward to me, but please let me know what you think.

cc @umbopepato

Hey @rylnd, I wanted to follow up with you on that, but I got pulled into fixing a couple of flaky tests.
What I wanted to clarify is whether the current behavior you’re referring to is something that existed before, or if it’s an outcome of the changes introduced in my PR. Once I understand that, I can confirm whether this should be handled here (the alert muting epic) or if it falls under response-ops. Either way, we should definitely track this. If there isn’t an existing issue for it, we can open one so we can make sure it’s properly handled.

@cnasikas wdyt?

@fkanout fkanout merged commit 1d5cc71 into elastic:main Dec 3, 2025
12 checks passed
@kibanamachine
Copy link
Copy Markdown
Contributor

Flaky Test Runner Stats

🟠 Some tests failed. - kibana-flaky-test-suite-runner#9974

[❌] x-pack/platform/test/alerting_api_integration/spaces_only/tests/alerting/group4/config.ts: 74/100 tests passed.
[✅] x-pack/platform/test/alerting_api_integration/security_and_spaces/group4/config.ts: 100/100 tests passed.

see run history

@rylnd
Copy link
Copy Markdown
Contributor

rylnd commented Dec 3, 2025

What I wanted to clarify is whether the current behavior you’re referring to is something that existed before,

Yep, sorry if I was unclear! The current behavior for retrieving muted alerts (on main, both before and after your PR) is to fetch them via the rules. On main, this isn't an issue, since any users with rules access implicitly have alerts access. However, on our development branch, where we introduce RBAC for Alerts separate from Rules, users with only the Alerts privileges are encountering errors with those "fetch muted alerts" calls as they cannot access the corresponding rules.

After this PR, it seems as though we can address this wrinkle (that reading muted alerts implicitly requires rule access) by retrieving them via the new alert fields (although behavior for historical alerts may be trickier, it's clear that this will at least work for future alerts).

So: I was asking for clarification on:

  1. Whether my understanding of the above is correct,
  2. Whether the subsequent use of these fields within the application had already been discussed/planned, and particularly
  3. Whether the retrieval of muted alerts was going to be updated as I explained

@cnasikas
Copy link
Copy Markdown
Member

cnasikas commented Dec 4, 2025

Hey @rylnd!

Whether my understanding of the above is correct,

Yes, your understanding is correct.

Whether the subsequent use of these fields within the application had already been discussed/planned, and particularly

Not at the moment. They are for user phasing, but I could see use cases to use the new field in the application. One of the gotchas we have is that when we do an API call to mute the alerts, we do not wait for the ES query to be successful (wait_for_completion: false). It is possible that if the call fails, some alert documents will not be updated. But at rule execution, the mute status will definitely be updated, but it may be somewhere in the future, depending on how the rule is scheduled. This is a deliberate decision to avoid slowing (or timeout) the API and, in turn, the UI.

Whether the retrieval of muted alerts was going to be updated as I explained

I checked the code of the useGetMutedAlertsQuery query, and I can definitely see improvements to accommodate your needs. I can see the following approaches:

  1. Use the new field and ignore the old alerts. Do we know how often it is for a user to have active muted alerts that are old?
  2. Create a dedicated API for fetching the muted information needed (which is part of the rule SO), but instead of requiring access for rules, the API requires access for alerts. Because, in the end, we need information about the alerts (which alerts are muted?). It makes sense to me to require only alert access.
  3. A mix of both of the above. We rely on 2) only if the muted field is not populated.

What are your thoughts? I opened this issue for ResponseOps to take care of it: #245228.

JordanSh pushed a commit to JordanSh/kibana that referenced this pull request Dec 9, 2025
…lastic#240996)

## Summary

It fixes elastic#240514 by adding a mute field to the alert doc. 
This addition happens on mute, mute all, unmute, and unmute all actions.
Also, the related tests are updated.
### Alert doc
<img width="646" height="578" alt="Screenshot 2025-10-28 at 14 07 08"
src="https://github.com/user-attachments/assets/1ea6528c-da54-4507-9515-ad82f4af8f2f"
/>

### While the rule still have this info
<img width="607" height="377" alt="Screenshot 2025-10-28 at 14 38 06"
src="https://github.com/user-attachments/assets/ee055af4-cd57-47b4-b9c9-10e511779549"
/>

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Dominique Belcher <dominique.clarke@elastic.co>
Co-authored-by: Maryam Saeidi <maryam.saeidi@elastic.co>
Co-authored-by: Christos Nasikas <xristosnasikas@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

author:obs-ux-management PRs authored by the obs ux management team backport:skip This PR does not require backporting release_note:skip Skip the PR/issue when compiling release notes Team:actionable-obs Formerly "obs-ux-management", responsible for SLO, o11y alerting, significant events, & synthetics. v9.3.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Alert muting] Store Alert Muted Status Directly in Alert Documents

10 participants