Skip to content

[RAM] Add bulk action to Untrack selected alerts#167579

Merged
XavierM merged 54 commits intoelastic:mainfrom
Zacqary:bulkaction-164059
Oct 4, 2023
Merged

[RAM] Add bulk action to Untrack selected alerts#167579
XavierM merged 54 commits intoelastic:mainfrom
Zacqary:bulkaction-164059

Conversation

@Zacqary
Copy link
Copy Markdown
Contributor

@Zacqary Zacqary commented Sep 28, 2023

Summary

Part of #164059

Screenshot 2023-09-28 at 5 38 45 PM Screenshot 2023-09-28 at 5 38 11 PM

This PR:

  • Moves the setAlertStatusToUntracked function from the AlertsClient into the AlertsService. This function doesn't actually need any Rule IDs to do what it's supposed to do, only indices and Alert UUIDs. Therefore, we want to make it possible to use outside of a created AlertsClient, which requires a Rule to initialize.
  • Creates a versioned internal API to bulk untrack a given set of alertUuids present on indices. Both of these pieces of information are readily available from the ECS fields sent to the alert table component, from where this bulk action will be called.
  • Switches the setAlertStatusToUntracked query to look for alert UUIDs instead of alert instance IDs. [RAM] Mark disabled alerts as Untracked in both Stack and o11y #164788 dealt with untracking alerts that were bound to a single rule at a time, but this PR could be untracking alerts generated by many different rules at once. Multiple rules may generate the same alert instance ID names with different UUIDs, so using UUID increases the specificity and prevents us from untracking alert instances that the user didn't intend.
  • Adds a bulkUpdateState method to the task scheduler. [RAM] Mark disabled alerts as Untracked in both Stack and o11y #164788 modified the bulkDisable method to clear untracked alerts from task states, but this new method allows us to untrack a given set of alert instances without disabling the task that generated them.

Why omit rule ID from this API?

The rule ID is technically readily available from the alert table, but it becomes redundant when we already have immediate access to the alert document's index. #164788 used the rule ID to get the ruleTypeId and turn this into a corresponding index, which we don't have to do anymore.

Furthermore, it helps to omit the rule ID from the updateByQuery request, because the user can easily select alerts that were generated by a wide variety of different rules, and untrack them all at once. We could include the rule ID in a separate should query, but this adds needless complexity to the query.

We do need to know the rule ID after performing updateByQuery, because it corresponds to the task state we want to modify, but it's easier to retrieve this using the same query params provided.

Checklist

Delete any items that are not applicable to this PR.

@Zacqary Zacqary added Team:ResponseOps Platform ResponseOps team (formerly the Cases and Alerting teams) t// release_note:feature Makes this part of the condensed release notes Feature:Alerting/RulesManagement Issues related to the Rules Management UX v8.11.0 labels Sep 28, 2023
@ghost
Copy link
Copy Markdown

ghost commented Sep 28, 2023

🤖 GitHub comments

Expand to view the GitHub comments

Just comment with:

  • /oblt-deploy : Deploy a Kibana instance using the Observability test environments.
  • /oblt-deploy-serverless : Deploy a serverless Kibana instance using the Observability test environments.
  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

}
}

public async setAlertsToUntracked(opts: SetAlertsToUntrackedOpts) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure that this method actually belongs in AlertsService rather than AlertsClient, because it interacts with and updates alert data. However, we want to make it available without being connected to a Rule (which is a prerequisite for creating an AlertsClient), and it also doesn't actually need a corresponding Rule to operate.

Not sure if there's a specific design goal re: Service vs. Client that I'm violating here with this, but it was the most expedient way I could think of to get this to work.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably fine for now. The initial intent was for the AlertService to handle all of the framework level resource installation that's required to write alerts for a rule and then AlertsClient to be the way for the alerting task runner to interact with those alerts. As you point out, there are use cases for users interacting directly with alerts. In the future, we may want to consolidate all of those interactions into a separate client but this is fine for now.

@Zacqary Zacqary marked this pull request as ready for review September 29, 2023 23:06
@Zacqary Zacqary requested a review from a team as a code owner September 29, 2023 23:06
@Zacqary Zacqary requested a review from a team September 29, 2023 23:06
@elasticmachine
Copy link
Copy Markdown
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

Copy link
Copy Markdown
Member

@maryam-saeidi maryam-saeidi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally and for happy path it worked fine.

  1. But the Mark as untracked option for recovered alerts:

When I applied it, it showed a successful message:

image

Can you please fix that?

  1. Another issue was related to AlertSummaryWidget component:

image

Number of active alerts includes the untracked alerts as well, can it be fixed in this PR? If it is out of scope, would you please create an issue?

Questions:

  1. What is the difference between AlertsClient and AlertsService?
  2. @maciejforcone @katrin-freihofner Should we count untracked alerts in the total number of alerts in the Alert Summary Widget? [Out of scope of this PR]
Alert Summary Widget Alerts
image image

</EuiContextMenuItem>
),
],
...(ruleSavePermissions
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly is this permission? Is this different from the permission for adding an alert to a case?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the permission for editing rules. I didn't see a permissions requirement spec'd out at all so I wasn't sure which one to use, but I can switch it to the case permission, that probably makes more sense.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maryam-saeidi How's tying this to the update case permission sound? Or do we need to create an entirely new permission for this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked to @XavierM, turns out we don't need permissions on the client side because the user can only see alerts from rules that they're authorized to interact with. I need to implement some auth checking on the server side though.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update 👍🏻

@Zacqary
Copy link
Copy Markdown
Contributor Author

Zacqary commented Oct 2, 2023

@maryam-saeidi

Questions

  1. What is the difference between AlertsClient and AlertsService?

I was wondering that myself. As far as I can tell the AlertsClient is a thing we create specifically to interact with the alerts generated by a particular rule, given that it needs rule properties to be initialized.

The AlertsService on the other hand is, uhhhh, the thing that has the right elasticsearch client stuff already built in and feature freeze is tomorrow.

EDIT Sorry missed @ymao1's explanation: #167579 (comment)

) => {
router.post(
{
path: `${INTERNAL_BASE_ALERTING_API_PATH}/rules/_bulk_untrack`,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not use rules but alerts here

Suggested change
path: `${INTERNAL_BASE_ALERTING_API_PATH}/rules/_bulk_untrack`,
path: `${INTERNAL_BASE_ALERTING_API_PATH}/alerts/_bulk_untrack`,

},
];

if (ensureAuthorized) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ensureAuthorized should always be there if we can, we need to enforce this check!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setAlertsToUntracked is sometimes called through untrackRuleAlerts, which is only ever called after a different ensureAuthorized method has run

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's go with the way it is and we can refactor it later on

Copy link
Copy Markdown
Contributor

@ymao1 ymao1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried out all the disable/delete/untrack scenarios. Left some comments about unit tests that I'm fine with being addressed as a follow up issue.

Would like to see two things addressed:

  • try/catch around the callback function that updates the task manager state
  • deleting a rule did not mark the associated active alerts as untracked when I tested it.

@XavierM XavierM requested a review from a team October 3, 2023 19:13
@botelastic botelastic bot added the Team:APM - DEPRECATED Use Team:obs-ux-infra_services. label Oct 3, 2023
@elasticmachine
Copy link
Copy Markdown
Contributor

Pinging @elastic/apm-ui (Team:APM)

Copy link
Copy Markdown
Contributor

@ymao1 ymao1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified with the latest commits that delete works & the requested try/catch has been updated. I see @XavierM left a comment about ensureAuthorized so I'll let him give the final 👍 but LGTM

@XavierM XavierM enabled auto-merge (squash) October 3, 2023 20:26
Copy link
Copy Markdown
Contributor

@achyutjhunjhunwala achyutjhunjhunwala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from APM side

@kibana-ci
Copy link
Copy Markdown

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #38 / Alerting alerts_as_data alerts as data "after all" hook for "should write alert docs during rule execution"
  • [job] [logs] Defend Workflows Cypress Tests #4 / Artifact pages Trusted applications should update Endpoint Policy on Endpoint when adding Trusted application name should update Endpoint Policy on Endpoint when adding Trusted application name
  • [job] [logs] Security Solution Cypress Tests #9 / Ransomware Detection Alerts Ransomware in Timelines Renders ransomware entries in timelines table Renders ransomware entries in timelines table

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
triggersActionsUi 586 587 +1

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
triggersActionsUi 552 553 +1

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
observability 1.0MB 1.0MB +438.0B
triggersActionsUi 1.4MB 1.4MB +1.2KB
total +1.6KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
triggersActionsUi 92.8KB 93.9KB +1.1KB
Unknown metric groups

API count

id before after diff
triggersActionsUi 578 579 +1

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport:skip This PR does not require backporting Feature:Alerting/RulesManagement Issues related to the Rules Management UX release_note:feature Makes this part of the condensed release notes Team:APM - DEPRECATED Use Team:obs-ux-infra_services. Team:ResponseOps Platform ResponseOps team (formerly the Cases and Alerting teams) t// v8.11.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants