Skip to content

Conversation

@mikecote
Copy link
Contributor

@mikecote mikecote commented Jun 17, 2019

In this PR, we're adding the initial documentation to allow solution teams to get started at validating the alerting framework that is developed so far.

@mikecote mikecote added Feature:Alerting release_note:skip Skip the PR/issue when compiling release notes Team:Stack Services labels Jun 17, 2019
@mikecote mikecote self-assigned this Jun 17, 2019
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-stack-services

@mikecote mikecote force-pushed the alerting/README-files branch from 569f6f3 to 02b078a Compare June 18, 2019 11:39
Copy link
Member

@pmuellr pmuellr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great!

I made a number of comments, but they're mostly about the design of the APIs, not the doc :-)

|---|---|---|
|id|Unique identifier for the action type.|string|
|name|A user friendly name for the action type.|string|
|unencryptedAttributes|A list of opt-out attributes that don't need to be encrypted, these attributes won't need to be re-entered on import / export when the feature becomes available.|array of strings|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

had some chit-chat about maybe indicating unencrypted config attributes (this is really just config, and not params, right?) could be specified instead via joi.meta({unencrypted: true}) or such (or possibly the reverse - specify which config attrs are encrypted). Seems more readable and writable. We'd need to use joi.describe() to get these

|id|The id of the action you want to fire.|string|
|params|The `params` value to give the action type executor|object|
|namespace|The namespace the action exists within|string|
|basePath|The request's basePath value, usually `request.getBasePath()`|string|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is basePath supposed to provide some audit/log-ish kind of value? Like, "this alert was associated with an http request here: x/y/z". Otherwise not sure what it's for.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a temporary thing until we do the security phase of alerting. Currently when we're detached from a request, we need to capture that value for the saved objects client to know what space it relates to, we've discussed decomposing it into just the namespace but figured it wasn't worth the effort for now.

|services.callCluster|Use this to do elasticsearch queries on the cluster Kibana connects to. NOTE: This currently authenticates as the Kibana internal user, this will change in a future PR.|
|services.savedObjectsClient|Use this to manipulate saved objects. NOTE: This currently only works when security is disabled. A future PR will add support for enabled security using Elasticsearch API tokens.|
|services.log|Use this to create server logs. (This is the same function as server.log)|
|scheduledRunAt|The date and time the alert type was supposed to be called.|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had talked about some more elaborate "meta" state on previous scheduled runs - time of last successful run, number of errors since last successful run, etc. Seems like at least previousScheduleRunAt and probably scheduledRunAt should be properties on some kind of "meta" object which is passed in.

Speaking of "meta" info about alerts, I wonder if this should include error information about running the actions. Wonder if there's enough info in the alert to provide interesting user-facing info on things like "we ran this alert but this action failed" kinda thing. I guess eventually, when we start adding audit/log info to ES on the alerts as they are "executed"? Seems right, so ... can defer, but just wondering ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I think that will come with the history / audit log phase where we could leverage that for meta functionality and such.

In regards to meta you're thinking wrapping those two into a meta object instead of being provided flat?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ya, that was the suggestion, sorry: move scheduledRunAt and previousScheduledRunAt into some new top-level property, maybe scheduleState.

});

// Returning updated alert type level state
return {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

brain fart: I wonder how well this will work in practice. I think I find that for functions that might be pretty complicated to implement (eg, an alert executor heh), it can get complicated to coordinate all the returns. Something we could "enhance" later with an API that let you explicitly set the state or something, perhaps. I think I'm thinking of how AWS lambda has morphed over the years from kind of a return-centric thing to explicit API, for responses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open for suggestions, this one here is the global state at the alert level regardless if an instance is created or not. We'll see if its used by the team.

|resetFire()|Reverts a `.fire(...)` call and makes the instance no longer fire.|
|getState()|Get the current state of the alert instance.|
|getMeta()|Get the internal set meta attributes for the alert instance.|
|fire(actionGroup, context)|Called to fire actions.|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I call fire() multiple times? Does resetFire() take options to remove a specific fire() that was previously called? Allowing fire() to be called multiple times seems like one answer to the "how do we do 'aggregated' actions on alerts" - the answer is that it's up to the alert. If an alert wants to support either or both, it could provide it's own toggle switches to deal with that, then either do a single fire() or multiple or ... both.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By design it can only be called once, we could enhance to throw an error if it does. In the scenario to do different actions, they would create different alert instances and fire each of them.

|shouldFire()|Function returns `true` or `false` whether the alert instance should fire or not. This value turns to true when `.fire(...)` is called.|
|getFireOptions()|Function returns an object of `actionGroup`, `context` and `state` to use for firing the actions.|
|resetFire()|Reverts a `.fire(...)` call and makes the instance no longer fire.|
|getState()|Get the current state of the alert instance.|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If state and meta are things that alert creators might make use of, needs more description. Or maybe it's a placeholder for future stuff we know we'll need but don't know the shape of yet?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'll try to rethink this one, its hard to explain what it will contain since the developer is in control of that. But I can mark where the values will come from / be set from.


```
alertInstanceFactory('server_1')
.replaceState({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really curious about this .replaceState() now! How is related to the state that is returned by the alert executor, which is labelled updated alert type level state in an example above. Perhaps this answers my question above re: having an explicit api to set state instead of via the return value. But ... I'm thinking it's probably something different. If different, I think we need a different name, or will need a simple "prefix" name for these "alert state" vss "alert type state" etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is different state than the updated alert type level state. One is global and one is specific to an alert instance. I'm open for new names. Just have to avoid confusion when you have two alerts for the same type, there's no cross state for that scenario.


Below is an example of an alert that takes advantage of templating:

```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would probably be good to have both a simple example like this one, as well as a more complicated one that shows how you might deal with multiple objects to "action".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++

|Property|Description|Type|
|---|---|---|
|alertTypeId|The id value of the alert type you want to call when the alert is scheduled to execute.|string|
|interval|The interval in milliseconds the alert should execute.|number|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking into the near future, we'll want some crontab like thing here. Here's how watcher does this: https://www.elastic.co/guide/en/elastic-stack-overview/7.1/trigger-schedule.html

I propose we eventually follow that, we can do just interval for now, unitless, and should probably be seconds and not millseconds.

That just means moving interval under a new top-level schedule attribute. We can then roll out the other schedule types as needed.

@mikecote mikecote marked this pull request as ready for review June 19, 2019 11:47
@mikecote
Copy link
Contributor Author

Jenkins, test this

@elasticmachine

This comment has been minimized.

Copy link
Contributor

@bmcconaghy bmcconaghy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able with a little bit of fiddling to get a full custom alert + custom action thing going locally following this guide. I've called out the confusions and issues I ran into.

// previous and persisted values
...
})
.fire('default', {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not immediately clear what the default string does here. Reading below I might be able to connect the dots and realize that this is the action group. Also, what if I want to fire multiple action groups? Seems like this wouldn't be supported with the current code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will make a change to the docs, in regards to fire multiple action groups, by design, you would have to duplicate the action for each group you would like it to fire for. Ex if you want email on warning and severe action groups, you would have to pass two objects to actions on the create alert API. In the future we could make group an array within the actions array passed into the create alert API.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fairly painful to have to recreate actions. I can easily imagine a scenario where I want a less important alert firing slack messages, but if the alert severity grows into a more important scenario, send an email (contrived example). But I'd still want the slack messages fired.

Don't have any thoughts on making it easier tho, ATM. Have thoughts on making it harder, so I'll keep those to myself for now :-)


Below is an example of an alert that takes advantage of templating:

```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++

...
actions: [
{
"group": "default",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so thinking about this more: group names are configured on alert instances, but the alert type code itself needs to know names for when it fires. Don't see how this will work in practice unless we only have predefined names that you can use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think when we come up with throttling we'll make the alert type the source of truth for these groups and define a priority or something so we know when to stop throttling when alert status gets worse.

@elasticmachine

This comment has been minimized.

@mikecote
Copy link
Contributor Author

Thank you so much for the feedback @gchaps! All the changes have been applied as suggested. They all made sense.

@elasticmachine

This comment has been minimized.

namespace: undefined, // The namespace the action exists within
basePath: undefined, // Usually `request.getBasePath();` or `undefined`
});
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading through this document I have a good understanding of the different pieces like action types, CRUD for actions, and firing. But an example that shows the full usage from creation of an action to the action being completed might help answer some questions. For example: after I create an action and fire it, how do I get its status - how do I know it worked? Do I get that through task manager, do I poll the action to get a status?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example: after I create an action and fire it, how do I get its status - how do I know it worked? Do I get that through task manager, do I poll the action to get a status?

TBD; seems like we'll have some kind of history &| audit log of actions run, so you'd need to poll that. We'll need to have the action fire return some kind of id for the action fire status, that would eventually show up in the history &| audit log.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another question related to action lifecycle, once an action has been fired and executed, should I delete it or keep it for history? Are there any consequences to not deleting them, do they get cleaned up periodically?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do they get cleaned up periodically?

I just did a test regarding this, if a task isn't re-scheduled, it gets removed from the task manager. The task manager periodically cleans up expired tasks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just had a conversation with @bmcconaghy about this. We're going to add a REST API to fire actions (PR coming soon). We will change the behaviour of this manual function and make it hold off until the execution is complete. This function will then be able to throw errors, if any occur, during the execution.

@elasticmachine
Copy link
Contributor

💔 Build Failed


|Property|Description|Type|
|---|---|---|
|id|The id of the alert you're trying to get.|string|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some information on what gets returned would be useful - I think we're just returning the alert configuration (i.e. what is in the saved object). Some people might be expecting instance data like alert state being returned here (say I want to get my alert and show its state in UI). But sounds like we'd need to get that via the alertInstanceFactory since it is part of the instance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I can see that. We can add new APIs or modify them as we build the alerting app to display valuable information. We may have to move things around or develop something new to make the data accessible.

Copy link
Contributor

@peterschretlen peterschretlen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good I left a few comments!

Copy link
Contributor

@bmcconaghy bmcconaghy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. We'll have to be extra careful to keep these in sync with the implementations.

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@mikecote mikecote merged commit c3ceb4d into elastic:feature/alerting Jun 20, 2019
@elasticmachine
Copy link
Contributor

💔 Build Failed

mikecote added a commit to mikecote/kibana that referenced this pull request Jun 20, 2019
* Create actions plugin (elastic#35679)

* Basic alerting plugin with actions

* Remove relative imports

* Code cleanup

* Split service into 3 parts, change connector structure

* Ability to disable plugin, ability to get actions

* Add slack connector

* Add email connector

* Ability to validate params and connector options

* Remove connectorOptionsSecrets for now

* Fix plugin config validation

* Add tests for slack connector

* Default connectors register on plugin init, console renamed to log, slack to message_slack

* Add remaining API endpoints for action CRUD

* Add list connectors API

* Change actions CRUD APIs to be closer with saved objects structure

* WIP

* Fix broken tests

* Add encrypted attribute support

* Add params and connectorOptions for email

* WIP

* Remove action's ability to have custom ids

* Remove ts-ignore

* Fix broken test

* Remove default connectors from this branch

* Fix API integration tests to use fixture connector

* Rename connector terminology to action type

* Rename actionTypeOptions to actionTypeConfig

* Code cleanup

* Fix broken tests

* Rename alerting plugin to actions

* Some code cleanup and add API unit tests

* Change signature of action type service execute function

* Add some plugin api integration tests

* Fix type check failure

* Code cleanup

* Create an actions client instead of an action service

* Apply Bill's PR feedback

* Fix broken test

* Find function to have destructured params

* Add tests to ensure encrypted attributes are not returned

* Fix broken test

* Add tests for validation

* Ensure actions can be updated without re-passing the config

* Remove dead code

* Test cleanup

* Fix eslint issue

* Apply Peter's PR feedback

* Code cleanup and fix broken tests

* Apply Brandon's PR feedback

* Add namespace support

* Fix broken test

* Pass services to action executors (elastic#37194)

* Pass services to action executors

* Fix tests

* Apply PR feedback

* Apply PR feedback pt2

* Cleanup actions plugin (elastic#37250)

* Cleanup actions, move code from alerting plugin PR

* Rename service terminology to registry

* Use static encryption key for encrypted attributes plugin inside of tests

* Empty data after create test is done running

* Fix type checks

* Fix inconsistent naming

* add server log action for alerting (elastic#37530)

adds the first "builtin" alertType for performing a `server.log()`

* Create alerting plugin (elastic#37043)

* WIP

* Rename fire function and remove @ts-ignore in all places

* Change naming in alerting service

* Remove alert instance class for now, support interval configuration

* Cleanup TS

* Split alerting between registry and client

* Use saved object alongside task manager instance

* Add remaining alerting APIs

* Change create structure

* Rename some variables, change actionGroups structure

* Use handlebars for templating strings at fire time

* Fix params given to alert type execute function

* Use alert instance class

* Alert instances support meta attributes

* Move alert instances deserialization

* Change interval to be ms

* Rename actions es archive

* Fix tests to use encrypted esArchive for action record

* Add create alert test to demo end to end flow

* Fix type check issue

* Alerts to use references to action objects

* Only update task manager tasks after saved objects are fully updated

* Use scope in task manager

* Fix type check

* Use task manager to execute actions

* Convert ids into references and back

* Apply PR feedback

* Fix broken test

* Fix some bugs

* Fix test errors

* Alert interval to be previous runAt + interval instead of now + interval

* Add range support

* Remove extra line

* Cleanup

* Add alert_instance.test.ts

* Add alert_type_registry.test.ts

* Move tests around

* Create generic task manager mock

* Add note about saved objects client mock

* Create alert_type_registry.mock.ts

* Add alerts_client.test.ts

* Add create_alert_instance_factory.test.ts

* Add create_fire_handler.test.ts

* WIP

* Fix get_create_task_runner_function.test.ts and make test pass

* Make get_create_task_runner_function.test.ts 100% coverage

* Add unit tests for routes

* Move files around

* Created transform_action_params.ts

* Add get_next_run_at.ts

* Add comment explaining why we copy nextRunAt

* Re-use state within alert instance

* Finalize code coverage in unit tests

* Create base api integration tests

* Add a test that ensures end to end functionality of an alert

* Fix ui capabilities test

* Fix broken plugin api integration test

* Fix jest tests with new saved objects client

* Fix broken integration tests

* Change api integration test fixture to make more sense, add functions for future tests

* Move alerts integration testing into own file, prep to add more tests

* Add tests to ensure failed task instances get retried

* Add get_create_task_runner_function.test.ts for actions, create encrypted saved objects mock

* Add action validation tests

* Ensure action type validation occurs on update

* Test 400 on unregistered alert types

* Ensure alertTypeId can't be updated

* Add validation test for alert create / update

* Fix broken checks / tests

* Skip failing test for now

* Cleanup jest tests

* Ensure action objects can be updated while keeping encrypted attributes readable

* Remove partial update sopport, remove ability to change actionTypeId, require config

* Ensure actionTypeConfig is validated on create and update

* Add alertTypeParams validation support

* Fix failing tests

* Ensure alert cleanup errors don't replace the original error

* Pass callCluster as a service to alerts and actions

* Only pass log to alerts client

* Pass savedObjectsClient as a service to alerting and actions

* Fix failing tests

* Remove range support, provide when current and previous task got scheduled

* Ensure Joi validation happens before every execute

* Remove skipped tests, to be done in future PR

* Apply self feedback pt1

* Apply self feedback pt2

* Fix broken tests

* Apply PR feedback

* PR feedback pt1

* Apply security team PR feedback

* PR feedback pt1

* PR feedback pt2

* PR feedback pt3

* Fix broken tests

* Fix callCluster to have signature

* Revert f11a6ae

* PR feedback pt4

* Remove __jest__ folders

* PR feedback pt5

* Fix Joi from leaking secrets

* Fire instance actions in parallel instead of series

* Fix failing jest tests

* Accept core api changes

* Fix saved objects client mock

* PR feedback pt1

* Fix eslint issues

* Throw error when alert instance already fired (elastic#39251)

* Throw error when alert instance already fired

* shouldFire doesn't need its own boolean value

* Actions & alerting getting started user guides (elastic#39093)

* Initial user guides

* Cleanup

* Typos, example changes

* Switch to tables, use ordered list for usage

* Start docs around alert instances and templating

* Documentation changes

* Some adjustments

* Apply PR feedback

* Apply suggestions from code review

Co-Authored-By: gchaps <[email protected]>

* PR feedback pt2

* Provide better examples for alert types

* Apply PR feedback

* Update README locations
mikecote added a commit to mikecote/kibana that referenced this pull request Jun 21, 2019
* Create actions plugin (elastic#35679)

* Basic alerting plugin with actions

* Remove relative imports

* Code cleanup

* Split service into 3 parts, change connector structure

* Ability to disable plugin, ability to get actions

* Add slack connector

* Add email connector

* Ability to validate params and connector options

* Remove connectorOptionsSecrets for now

* Fix plugin config validation

* Add tests for slack connector

* Default connectors register on plugin init, console renamed to log, slack to message_slack

* Add remaining API endpoints for action CRUD

* Add list connectors API

* Change actions CRUD APIs to be closer with saved objects structure

* WIP

* Fix broken tests

* Add encrypted attribute support

* Add params and connectorOptions for email

* WIP

* Remove action's ability to have custom ids

* Remove ts-ignore

* Fix broken test

* Remove default connectors from this branch

* Fix API integration tests to use fixture connector

* Rename connector terminology to action type

* Rename actionTypeOptions to actionTypeConfig

* Code cleanup

* Fix broken tests

* Rename alerting plugin to actions

* Some code cleanup and add API unit tests

* Change signature of action type service execute function

* Add some plugin api integration tests

* Fix type check failure

* Code cleanup

* Create an actions client instead of an action service

* Apply Bill's PR feedback

* Fix broken test

* Find function to have destructured params

* Add tests to ensure encrypted attributes are not returned

* Fix broken test

* Add tests for validation

* Ensure actions can be updated without re-passing the config

* Remove dead code

* Test cleanup

* Fix eslint issue

* Apply Peter's PR feedback

* Code cleanup and fix broken tests

* Apply Brandon's PR feedback

* Add namespace support

* Fix broken test

* Pass services to action executors (elastic#37194)

* Pass services to action executors

* Fix tests

* Apply PR feedback

* Apply PR feedback pt2

* Cleanup actions plugin (elastic#37250)

* Cleanup actions, move code from alerting plugin PR

* Rename service terminology to registry

* Use static encryption key for encrypted attributes plugin inside of tests

* Empty data after create test is done running

* Fix type checks

* Fix inconsistent naming

* add server log action for alerting (elastic#37530)

adds the first "builtin" alertType for performing a `server.log()`

* Create alerting plugin (elastic#37043)

* WIP

* Rename fire function and remove @ts-ignore in all places

* Change naming in alerting service

* Remove alert instance class for now, support interval configuration

* Cleanup TS

* Split alerting between registry and client

* Use saved object alongside task manager instance

* Add remaining alerting APIs

* Change create structure

* Rename some variables, change actionGroups structure

* Use handlebars for templating strings at fire time

* Fix params given to alert type execute function

* Use alert instance class

* Alert instances support meta attributes

* Move alert instances deserialization

* Change interval to be ms

* Rename actions es archive

* Fix tests to use encrypted esArchive for action record

* Add create alert test to demo end to end flow

* Fix type check issue

* Alerts to use references to action objects

* Only update task manager tasks after saved objects are fully updated

* Use scope in task manager

* Fix type check

* Use task manager to execute actions

* Convert ids into references and back

* Apply PR feedback

* Fix broken test

* Fix some bugs

* Fix test errors

* Alert interval to be previous runAt + interval instead of now + interval

* Add range support

* Remove extra line

* Cleanup

* Add alert_instance.test.ts

* Add alert_type_registry.test.ts

* Move tests around

* Create generic task manager mock

* Add note about saved objects client mock

* Create alert_type_registry.mock.ts

* Add alerts_client.test.ts

* Add create_alert_instance_factory.test.ts

* Add create_fire_handler.test.ts

* WIP

* Fix get_create_task_runner_function.test.ts and make test pass

* Make get_create_task_runner_function.test.ts 100% coverage

* Add unit tests for routes

* Move files around

* Created transform_action_params.ts

* Add get_next_run_at.ts

* Add comment explaining why we copy nextRunAt

* Re-use state within alert instance

* Finalize code coverage in unit tests

* Create base api integration tests

* Add a test that ensures end to end functionality of an alert

* Fix ui capabilities test

* Fix broken plugin api integration test

* Fix jest tests with new saved objects client

* Fix broken integration tests

* Change api integration test fixture to make more sense, add functions for future tests

* Move alerts integration testing into own file, prep to add more tests

* Add tests to ensure failed task instances get retried

* Add get_create_task_runner_function.test.ts for actions, create encrypted saved objects mock

* Add action validation tests

* Ensure action type validation occurs on update

* Test 400 on unregistered alert types

* Ensure alertTypeId can't be updated

* Add validation test for alert create / update

* Fix broken checks / tests

* Skip failing test for now

* Cleanup jest tests

* Ensure action objects can be updated while keeping encrypted attributes readable

* Remove partial update sopport, remove ability to change actionTypeId, require config

* Ensure actionTypeConfig is validated on create and update

* Add alertTypeParams validation support

* Fix failing tests

* Ensure alert cleanup errors don't replace the original error

* Pass callCluster as a service to alerts and actions

* Only pass log to alerts client

* Pass savedObjectsClient as a service to alerting and actions

* Fix failing tests

* Remove range support, provide when current and previous task got scheduled

* Ensure Joi validation happens before every execute

* Remove skipped tests, to be done in future PR

* Apply self feedback pt1

* Apply self feedback pt2

* Fix broken tests

* Apply PR feedback

* PR feedback pt1

* Apply security team PR feedback

* PR feedback pt1

* PR feedback pt2

* PR feedback pt3

* Fix broken tests

* Fix callCluster to have signature

* Revert f11a6ae

* PR feedback pt4

* Remove __jest__ folders

* PR feedback pt5

* Fix Joi from leaking secrets

* Fire instance actions in parallel instead of series

* Fix failing jest tests

* Accept core api changes

* Fix saved objects client mock

* PR feedback pt1

* Fix eslint issues

* Throw error when alert instance already fired (elastic#39251)

* Throw error when alert instance already fired

* shouldFire doesn't need its own boolean value

* Actions & alerting getting started user guides (elastic#39093)

* Initial user guides

* Cleanup

* Typos, example changes

* Switch to tables, use ordered list for usage

* Start docs around alert instances and templating

* Documentation changes

* Some adjustments

* Apply PR feedback

* Apply suggestions from code review

Co-Authored-By: gchaps <[email protected]>

* PR feedback pt2

* Provide better examples for alert types

* Apply PR feedback

* Update README locations
mikecote added a commit that referenced this pull request Jun 21, 2019
* Create actions plugin (#35679)

* Basic alerting plugin with actions

* Remove relative imports

* Code cleanup

* Split service into 3 parts, change connector structure

* Ability to disable plugin, ability to get actions

* Add slack connector

* Add email connector

* Ability to validate params and connector options

* Remove connectorOptionsSecrets for now

* Fix plugin config validation

* Add tests for slack connector

* Default connectors register on plugin init, console renamed to log, slack to message_slack

* Add remaining API endpoints for action CRUD

* Add list connectors API

* Change actions CRUD APIs to be closer with saved objects structure

* WIP

* Fix broken tests

* Add encrypted attribute support

* Add params and connectorOptions for email

* WIP

* Remove action's ability to have custom ids

* Remove ts-ignore

* Fix broken test

* Remove default connectors from this branch

* Fix API integration tests to use fixture connector

* Rename connector terminology to action type

* Rename actionTypeOptions to actionTypeConfig

* Code cleanup

* Fix broken tests

* Rename alerting plugin to actions

* Some code cleanup and add API unit tests

* Change signature of action type service execute function

* Add some plugin api integration tests

* Fix type check failure

* Code cleanup

* Create an actions client instead of an action service

* Apply Bill's PR feedback

* Fix broken test

* Find function to have destructured params

* Add tests to ensure encrypted attributes are not returned

* Fix broken test

* Add tests for validation

* Ensure actions can be updated without re-passing the config

* Remove dead code

* Test cleanup

* Fix eslint issue

* Apply Peter's PR feedback

* Code cleanup and fix broken tests

* Apply Brandon's PR feedback

* Add namespace support

* Fix broken test

* Pass services to action executors (#37194)

* Pass services to action executors

* Fix tests

* Apply PR feedback

* Apply PR feedback pt2

* Cleanup actions plugin (#37250)

* Cleanup actions, move code from alerting plugin PR

* Rename service terminology to registry

* Use static encryption key for encrypted attributes plugin inside of tests

* Empty data after create test is done running

* Fix type checks

* Fix inconsistent naming

* add server log action for alerting (#37530)

adds the first "builtin" alertType for performing a `server.log()`

* Create alerting plugin (#37043)

* WIP

* Rename fire function and remove @ts-ignore in all places

* Change naming in alerting service

* Remove alert instance class for now, support interval configuration

* Cleanup TS

* Split alerting between registry and client

* Use saved object alongside task manager instance

* Add remaining alerting APIs

* Change create structure

* Rename some variables, change actionGroups structure

* Use handlebars for templating strings at fire time

* Fix params given to alert type execute function

* Use alert instance class

* Alert instances support meta attributes

* Move alert instances deserialization

* Change interval to be ms

* Rename actions es archive

* Fix tests to use encrypted esArchive for action record

* Add create alert test to demo end to end flow

* Fix type check issue

* Alerts to use references to action objects

* Only update task manager tasks after saved objects are fully updated

* Use scope in task manager

* Fix type check

* Use task manager to execute actions

* Convert ids into references and back

* Apply PR feedback

* Fix broken test

* Fix some bugs

* Fix test errors

* Alert interval to be previous runAt + interval instead of now + interval

* Add range support

* Remove extra line

* Cleanup

* Add alert_instance.test.ts

* Add alert_type_registry.test.ts

* Move tests around

* Create generic task manager mock

* Add note about saved objects client mock

* Create alert_type_registry.mock.ts

* Add alerts_client.test.ts

* Add create_alert_instance_factory.test.ts

* Add create_fire_handler.test.ts

* WIP

* Fix get_create_task_runner_function.test.ts and make test pass

* Make get_create_task_runner_function.test.ts 100% coverage

* Add unit tests for routes

* Move files around

* Created transform_action_params.ts

* Add get_next_run_at.ts

* Add comment explaining why we copy nextRunAt

* Re-use state within alert instance

* Finalize code coverage in unit tests

* Create base api integration tests

* Add a test that ensures end to end functionality of an alert

* Fix ui capabilities test

* Fix broken plugin api integration test

* Fix jest tests with new saved objects client

* Fix broken integration tests

* Change api integration test fixture to make more sense, add functions for future tests

* Move alerts integration testing into own file, prep to add more tests

* Add tests to ensure failed task instances get retried

* Add get_create_task_runner_function.test.ts for actions, create encrypted saved objects mock

* Add action validation tests

* Ensure action type validation occurs on update

* Test 400 on unregistered alert types

* Ensure alertTypeId can't be updated

* Add validation test for alert create / update

* Fix broken checks / tests

* Skip failing test for now

* Cleanup jest tests

* Ensure action objects can be updated while keeping encrypted attributes readable

* Remove partial update sopport, remove ability to change actionTypeId, require config

* Ensure actionTypeConfig is validated on create and update

* Add alertTypeParams validation support

* Fix failing tests

* Ensure alert cleanup errors don't replace the original error

* Pass callCluster as a service to alerts and actions

* Only pass log to alerts client

* Pass savedObjectsClient as a service to alerting and actions

* Fix failing tests

* Remove range support, provide when current and previous task got scheduled

* Ensure Joi validation happens before every execute

* Remove skipped tests, to be done in future PR

* Apply self feedback pt1

* Apply self feedback pt2

* Fix broken tests

* Apply PR feedback

* PR feedback pt1

* Apply security team PR feedback

* PR feedback pt1

* PR feedback pt2

* PR feedback pt3

* Fix broken tests

* Fix callCluster to have signature

* Revert f11a6ae

* PR feedback pt4

* Remove __jest__ folders

* PR feedback pt5

* Fix Joi from leaking secrets

* Fire instance actions in parallel instead of series

* Fix failing jest tests

* Accept core api changes

* Fix saved objects client mock

* PR feedback pt1

* Fix eslint issues

* Throw error when alert instance already fired (#39251)

* Throw error when alert instance already fired

* shouldFire doesn't need its own boolean value

* Actions & alerting getting started user guides (#39093)

* Initial user guides

* Cleanup

* Typos, example changes

* Switch to tables, use ordered list for usage

* Start docs around alert instances and templating

* Documentation changes

* Some adjustments

* Apply PR feedback

* Apply suggestions from code review

Co-Authored-By: gchaps <[email protected]>

* PR feedback pt2

* Provide better examples for alert types

* Apply PR feedback

* Update README locations
@mikecote mikecote mentioned this pull request Jun 24, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Feature:Alerting release_note:skip Skip the PR/issue when compiling release notes review v7.3.0 v8.0.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants