[ResponseOps][Alerting] Create a task to regenerate maintenance window events#219261
Conversation
|
Pinging @elastic/response-ops (Team:ResponseOps) |
| if (rRule.interval || rRule.freq) { | ||
| expirationDate = moment().utc().add(1, 'year').toISOString(); | ||
| } else { | ||
| expirationDate = moment(rRule.dtstart).utc().add(duration, 'ms').toISOString(); |
There was a problem hiding this comment.
if maintenance window has non recurring schedule, we don't need to set expiration date for more +1 year.
...red/alerting/server/application/maintenance_window/lib/generate_maintenance_window_events.ts
Show resolved
Hide resolved
...d/alerting/server/application/maintenance_window/methods/update/update_maintenance_window.ts
Outdated
Show resolved
Hide resolved
x-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.ts
Outdated
Show resolved
Hide resolved
x-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.ts
Outdated
Show resolved
Hide resolved
x-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.ts
Outdated
Show resolved
Hide resolved
x-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.ts
Outdated
Show resolved
Hide resolved
x-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.ts
Outdated
Show resolved
Hide resolved
| return { | ||
| ...filteredMaintenanceWindow, | ||
| expirationDate: newEvents.length ? newExpirationDate : oldExpirationDate, | ||
| events: [...oldEvents, ...newEvents], |
There was a problem hiding this comment.
Do we need to keep the old events? I am wondering a) if the old events become so big that they cannot fit in memory, and b) if the events array becomes too big and the ES grows too much. The new events can contain some past events to be sure we do not miss anything. We can do that by setting the startDate of the generateMaintenanceWindowEvents to be one week in the past.
There was a problem hiding this comment.
startDateRange used in filter is 1 week before expiration date, hence used the same date to generate new events.
x-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.ts
Outdated
Show resolved
Hide resolved
…t --include-path /api/status --include-path /api/alerting/rule/ --include-path /api/alerting/rules --include-path /api/actions --include-path /api/security/role --include-path /api/spaces --include-path /api/streams --include-path /api/fleet --include-path /api/dashboards --include-path /api/saved_objects/_import --include-path /api/saved_objects/_export --include-path /api/maintenance_window --update'
ymao1
left a comment
There was a problem hiding this comment.
Left a few comments. It would be good to add a functional test for this as well.
x-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.ts
Show resolved
Hide resolved
x-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.ts
Show resolved
Hide resolved
| .map((filteredMaintenanceWindow) => { | ||
| const { rRule, duration, expirationDate: oldExpirationDate } = filteredMaintenanceWindow; | ||
|
|
||
| const newEvents = generateMaintenanceWindowEvents({ |
There was a problem hiding this comment.
seems like generateMaintenanceWindowEvents can throw an error. should we handle that in a try/catch here? otherwise an error in one maintenance window might cause all not to get updated?
x-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.ts
Show resolved
Hide resolved
| const fullQuery = [ | ||
| MaintenanceWindowStatus.Running, | ||
| MaintenanceWindowStatus.Upcoming, | ||
| MaintenanceWindowStatus.Finished, |
There was a problem hiding this comment.
Are we including finished maintenance windows in the query to check which ones need events? why?
There was a problem hiding this comment.
yes, suppose there was a maintenance window with 2 years of recurring schedule and runs once a year. So once this year's event is done it will be updated as finished. But it's end date is next year. That means it should generate event for next year even though it is finished as of now.
| for await (const findResults of soFinder.find()) { | ||
| mwsWithNewEvents = await generateEvents({ | ||
| maintenanceWindowsSO: findResults.saved_objects, |
There was a problem hiding this comment.
How does this work? soFinder.find() can return multiple findResults with their own saved_objects? why doesn't it return a single findResult with everything?
There was a problem hiding this comment.
It calls savedObjectsClient.find and iterates over multiple pages. It opens a new Point-In-Time (PIT), and continue paging until a set of results is received that's smaller than the designated perPage size. In our case if it has more than 1000 results, it will return in multiple batches. That's why we loop through findResults to find and update maximum 1000 SOs at a time.
Added a functional test |
Flaky Test Runner Stats🎉 All tests passed! - kibana-flaky-test-suite-runner#8358[✅] x-pack/test/alerting_api_integration/security_and_spaces/group3/config.ts: 25/25 tests passed. |
...ing_api_integration/security_and_spaces/group3/tests/maintenance_window/events_generation.ts
Outdated
Show resolved
Hide resolved
x-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.ts
Outdated
Show resolved
Hide resolved
x-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.ts
Show resolved
Hide resolved
x-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.ts
Outdated
Show resolved
Hide resolved
x-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.ts
Outdated
Show resolved
Hide resolved
| }; | ||
| }); | ||
|
|
||
| const result = await savedObjectsClient.bulkUpdate<MaintenanceWindowAttributes>( |
There was a problem hiding this comment.
Let's log the errors returned by the savedObjectsClient.bulkUpdate. bulkUpdate will not throw an error if any of the SOs failed to be updated. Instead, the error property of the SO in the response will not be empty and will have a message. We should iterate through the results and log the errors as Patrick suggested.
There was a problem hiding this comment.
Updated
Also subtracted MW SO with errors from totalUpdatedMaintenanceWindows
x-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.test.ts
Show resolved
Hide resolved
| references: [], | ||
| }; | ||
|
|
||
| const statusFilter = { |
There was a problem hiding this comment.
It is hard to understand the filter and verify if it is correct. What about writing it on KQL and then using the fromKueryExpression to convert it to the format you need?
There was a problem hiding this comment.
Done I still added a snapshot test for the whole object.
There was a problem hiding this comment.
Thanks for the great test coverage!
| // verify 2 maintenance windows are updated | ||
| await retry.try(async () => { | ||
| const maintenanceWindowsResult = await getUpdatedMaintenanceWindows(); | ||
| expect(maintenanceWindowsResult.length).to.eql(2); |
There was a problem hiding this comment.
Could you please help me understand why two? In getUpdatedMaintenanceWindows, the number of "should be updated as..." is three, and I got confused.
There was a problem hiding this comment.
Two is correct number. Sorry my bad with incorrect comment.
The running MW should not be updated as its expiration date is 10 days in future while our task has 1 week past and 1 week future window.
💚 Build Succeeded
Metrics [docs]Unknown metric groupsESLint disabled line counts
Total ESLint disabled count
History
|
| describe('getStatusFilter', () => { | ||
| test('should build status filter', () => { | ||
| expect(getStatusFilter()).toEqual(statusFilter); | ||
| expect(getStatusFilter()).toMatchInlineSnapshot(` |
There was a problem hiding this comment.
super nit: I was thinking something like expect(getStatusFilter()).toEqual(fromKueryExpression(statusFilter))
| } | ||
|
|
||
| logger.debug( | ||
| `Cancelling maintenance windows events generator task - execution error due to timeout.` |
There was a problem hiding this comment.
super nit: We are not sure why the task got cancelled.
|
|
||
| await soFinder?.close(); | ||
|
|
||
| return; |
There was a problem hiding this comment.
super nit:
| return; |
|
Starting backport for target branches: 8.19 https://github.com/elastic/kibana/actions/runs/15731950402 |
|
Friendly reminder: Looks like this PR hasn’t been backported yet. |
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI. Questions ?Please refer to the Backport tool documentation |
…e window events (#219261) (#224773) # Backport This will backport the following commits from `main` to `8.19`: - [[ResponseOps][Alerting] Create a task to regenerate maintenance window events (#219261)](#219261) <!--- Backport version: 10.0.1 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sorenlouv/backport) <!--BACKPORT [{"author":{"name":"Janki Salvi","email":"117571355+js-jankisalvi@users.noreply.github.com"},"sourceCommit":{"committedDate":"2025-06-18T11:46:12Z","message":"[ResponseOps][Alerting] Create a task to regenerate maintenance window events (#219261)\n\n## Summary\n\nResolves https://github.com/elastic/kibana/issues/211534\n\nThis PR adds a recurring task which will\n- run once every day \n- collect maintenance windows which have expiration date within 1 week\n- updates expiration date to +1 year if it is recurring\n- generate events for the next 1 year\n- adds new events to maintenance window\n\n### Checklist\n\n- [x] [Unit or functional\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\nwere updated or added to match the most common scenarios\n\n### How to test\n- Set expiration date to less than 1 week before creating maintenance\nwindows: update line 70 `expirationDate = moment().utc().add(1,\n'year').toISOString();` to `expirationDate = moment().utc().add(5,\n'days').toISOString();` in the file\n`x-pack/platform/plugins/shared/alerting/server/application/maintenance_window/methods/create/create_maintenance_window.ts`\n- Create maintenance windows with different scenarios (recurring, non\nrecurring, etc.)\n- Update task schedule to run every five minutes to test: set `{\ninterval: '1d' }` to `{ interval: '5m' }` in file\n`x-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.ts`\n- Verify the task ran successfully\n- Verify maintenance windows are updated properly with new expiration\ndate and new events\n\n\n### Flaky test runner:\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/8358\n\n---------\n\nCo-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>","sha":"eaa9c1c4cdfe2310d430a1f8c0a4fabf39475ad8","branchLabelMapping":{"^v9.1.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Team:ResponseOps","backport missing","Feature:Alerting/RulesFramework","backport:version","v9.1.0","v8.19.0"],"title":"[ResponseOps][Alerting] Create a task to regenerate maintenance window events","number":219261,"url":"https://github.com/elastic/kibana/pull/219261","mergeCommit":{"message":"[ResponseOps][Alerting] Create a task to regenerate maintenance window events (#219261)\n\n## Summary\n\nResolves https://github.com/elastic/kibana/issues/211534\n\nThis PR adds a recurring task which will\n- run once every day \n- collect maintenance windows which have expiration date within 1 week\n- updates expiration date to +1 year if it is recurring\n- generate events for the next 1 year\n- adds new events to maintenance window\n\n### Checklist\n\n- [x] [Unit or functional\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\nwere updated or added to match the most common scenarios\n\n### How to test\n- Set expiration date to less than 1 week before creating maintenance\nwindows: update line 70 `expirationDate = moment().utc().add(1,\n'year').toISOString();` to `expirationDate = moment().utc().add(5,\n'days').toISOString();` in the file\n`x-pack/platform/plugins/shared/alerting/server/application/maintenance_window/methods/create/create_maintenance_window.ts`\n- Create maintenance windows with different scenarios (recurring, non\nrecurring, etc.)\n- Update task schedule to run every five minutes to test: set `{\ninterval: '1d' }` to `{ interval: '5m' }` in file\n`x-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.ts`\n- Verify the task ran successfully\n- Verify maintenance windows are updated properly with new expiration\ndate and new events\n\n\n### Flaky test runner:\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/8358\n\n---------\n\nCo-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>","sha":"eaa9c1c4cdfe2310d430a1f8c0a4fabf39475ad8"}},"sourceBranch":"main","suggestedTargetBranches":["8.19"],"targetPullRequestStates":[{"branch":"main","label":"v9.1.0","branchLabelMappingKey":"^v9.1.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/219261","number":219261,"mergeCommit":{"message":"[ResponseOps][Alerting] Create a task to regenerate maintenance window events (#219261)\n\n## Summary\n\nResolves https://github.com/elastic/kibana/issues/211534\n\nThis PR adds a recurring task which will\n- run once every day \n- collect maintenance windows which have expiration date within 1 week\n- updates expiration date to +1 year if it is recurring\n- generate events for the next 1 year\n- adds new events to maintenance window\n\n### Checklist\n\n- [x] [Unit or functional\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\nwere updated or added to match the most common scenarios\n\n### How to test\n- Set expiration date to less than 1 week before creating maintenance\nwindows: update line 70 `expirationDate = moment().utc().add(1,\n'year').toISOString();` to `expirationDate = moment().utc().add(5,\n'days').toISOString();` in the file\n`x-pack/platform/plugins/shared/alerting/server/application/maintenance_window/methods/create/create_maintenance_window.ts`\n- Create maintenance windows with different scenarios (recurring, non\nrecurring, etc.)\n- Update task schedule to run every five minutes to test: set `{\ninterval: '1d' }` to `{ interval: '5m' }` in file\n`x-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.ts`\n- Verify the task ran successfully\n- Verify maintenance windows are updated properly with new expiration\ndate and new events\n\n\n### Flaky test runner:\nhttps://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/8358\n\n---------\n\nCo-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>","sha":"eaa9c1c4cdfe2310d430a1f8c0a4fabf39475ad8"}},{"branch":"8.19","label":"v8.19.0","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"}]}] BACKPORT--> Co-authored-by: Janki Salvi <117571355+js-jankisalvi@users.noreply.github.com>
Summary
Resolves #211534
This PR adds a recurring task which will
Checklist
How to test
expirationDate = moment().utc().add(1, 'year').toISOString();toexpirationDate = moment().utc().add(5, 'days').toISOString();in the filex-pack/platform/plugins/shared/alerting/server/application/maintenance_window/methods/create/create_maintenance_window.ts{ interval: '1d' }to{ interval: '5m' }in filex-pack/platform/plugins/shared/alerting/server/maintenance_window_events/task.tsFlaky test runner:
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/8358