[APM] Add kibana.alert.grouping to latency threshold alerts#254904
fkanout merged 4 commits into elastic:main
Pinging @elastic/actionable-obs-team (Team:actionable-obs)
Pinging @elastic/obs-presentation-team (Team:obs-presentation)
smith left a comment
Verified locally against OTel demo data. Created an APM latency threshold rule for frontend-proxy with group-by on service.name, service.environment, transaction.type, and transaction.name. Alert fired and kibana.alert.grouping is correctly populated:
{
  "service": {
    "name": "frontend-proxy",
    "environment": "ENVIRONMENT_NOT_DEFINED"
  },
  "transaction": {
    "type": "request",
    "name": "ingress"
  }
}
LGTM.
expect(alerts[0]).property('service.environment', 'production');
expect(alerts[0]).property('transaction.type', 'request');
expect(alerts[0]).property('transaction.name', 'tx-node');
expect(alerts[0])
You can use expect.objectContaining to have fewer assertions.
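As a rough illustration of the suggestion, the sketch below replaces one assertion per field with a single partial-match check. `containsAll` is a hypothetical stand-in written for this example; in a Jest suite the equivalent would be `expect(alert).toEqual(expect.objectContaining({...}))`, and whether that matcher is available in this particular test harness is an assumption. The `kibana.alert.status` field is illustrative only.

```typescript
// Hypothetical stand-in for a partial matcher: returns true when every
// expected key exists on `actual` with a strictly equal value.
function containsAll(
  actual: Record<string, unknown>,
  expected: Record<string, unknown>
): boolean {
  return Object.entries(expected).every(
    ([key, value]) => key in actual && actual[key] === value
  );
}

// Alert document shaped like the one asserted on in the diff above.
const alert: Record<string, unknown> = {
  'service.environment': 'production',
  'transaction.type': 'request',
  'transaction.name': 'tx-node',
  'kibana.alert.status': 'active', // illustrative extra field, ignored by the check
};

// One check covers all three fields.
const matches = containsAll(alert, {
  'service.environment': 'production',
  'transaction.type': 'request',
  'transaction.name': 'tx-node',
});

console.log(matches); // prints true
```

A single partial match also tends to produce one failure message showing every mismatched field at once, rather than stopping at the first failing assertion.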
|
@fkanout I was checking a similar ticket I worked on in the past for the SLO burn rate rule. I can see I had to add a dynamic template there. Here's the change I made in a draft PR a while ago. I would expect we need to do something similar for the APM rule types. I would ask @benakansara to weigh in here. |
|
I haven't tested it locally. Can you share the rule configuration you used? |
Yes, we need to add a dynamic template. Without it, the mapping can be incorrect, affecting query results, autocomplete, etc. Current mapping without dynamic template: |
|
@benakansara @mgiota, thanks for the review and the heads up! Updated 4ad1d81 |
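For readers unfamiliar with the dynamic template discussed above, here is a minimal sketch of what such a mapping fragment could look like, expressed as a TypeScript object. The `path_match`, `match_mapping_type`, and `mapping` keys are standard Elasticsearch dynamic template options; the template name, the choice of `keyword`, and the `DynamicTemplate` type are assumptions made for illustration and are not taken from commit 4ad1d81.

```typescript
// Minimal shape of an Elasticsearch dynamic template entry (illustrative type).
interface DynamicTemplate {
  [templateName: string]: {
    path_match: string;
    match_mapping_type: string;
    mapping: { type: string };
  };
}

// Assumption: every string value under kibana.alert.grouping.* should be
// mapped as a keyword so that exact-match filters and autocomplete behave
// predictably regardless of which group-by fields a rule uses.
const groupingDynamicTemplates: DynamicTemplate[] = [
  {
    // Hypothetical template name.
    alert_grouping_strings_as_keywords: {
      path_match: 'kibana.alert.grouping.*',
      match_mapping_type: 'string',
      mapping: { type: 'keyword' },
    },
  },
];

console.log(JSON.stringify(groupingDynamicTemplates, null, 2));
```

Because group-by fields are chosen by the user, the subfields under `kibana.alert.grouping` cannot all be listed in a static mapping, which is why a path-based template is a natural fit here.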
💛 Build succeeded, but was flaky
cc @fkanout |
const groupByActionVariables = getGroupByActionVariables(groupByFields);
const groupingObject = unflattenObject(groupByFields);
const groupingObjectFromRecoveredAlert =
  alertHits?.[ALERT_GROUPING] ?? unflattenObject(groupByFields);
I cross-referenced the infra and SLO rule types, and they don't use a fallback there; they read alertHits?.[ALERT_GROUPING] directly.
@fkanout I guess the fallback doesn't hurt here. That makes me wonder whether we need to add it to the other rule types. I am trying to understand in which cases the alert grouping won't be in the document.
Otherwise looks good to me. @benakansara can I hear your thoughts as well?
I think we don't need the fallback here.
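To make the fallback under discussion concrete, here is a minimal sketch of the two paths: reading the stored grouping from the recovered alert document, and rebuilding it from the flat group-by fields when the document has none. `unflatten` is a hypothetical stand-in for `unflattenObject`, `resolveGrouping` is a name invented for this example, and the value of `ALERT_GROUPING` is assumed to be `kibana.alert.grouping`.

```typescript
type Nested = { [key: string]: unknown };

// Hypothetical stand-in for unflattenObject: expands dotted keys such as
// 'service.name' into nested objects.
function unflatten(flat: Record<string, unknown>): Nested {
  const result: Nested = {};
  for (const [path, value] of Object.entries(flat)) {
    const keys = path.split('.');
    let cursor: Nested = result;
    for (const key of keys.slice(0, -1)) {
      if (typeof cursor[key] !== 'object' || cursor[key] === null) {
        cursor[key] = {};
      }
      cursor = cursor[key] as Nested;
    }
    cursor[keys[keys.length - 1]] = value;
  }
  return result;
}

// Assumed value of the ALERT_GROUPING constant.
const ALERT_GROUPING = 'kibana.alert.grouping';

// Prefer the grouping stored on the alert document; rebuild it from the
// group-by fields only when the document does not carry one.
function resolveGrouping(
  alertHit: Nested | undefined,
  groupByFields: Record<string, unknown>
): unknown {
  return alertHit?.[ALERT_GROUPING] ?? unflatten(groupByFields);
}

const groupByFields = {
  'service.name': 'frontend-proxy',
  'transaction.type': 'request',
};

// Alert document without a stored grouping: the fallback path is taken.
const rebuilt = resolveGrouping({}, groupByFields);
console.log(JSON.stringify(rebuilt));
// prints {"service":{"name":"frontend-proxy"},"transaction":{"type":"request"}}

// Alert document with a stored grouping: the stored value wins.
const stored = resolveGrouping(
  { [ALERT_GROUPING]: { service: { name: 'stored-service' } } },
  groupByFields
);
console.log(JSON.stringify(stored)); // prints {"service":{"name":"stored-service"}}
```

Dropping the fallback, as suggested in the reply above, would reduce `resolveGrouping` to the plain lookup, at the cost of an undefined grouping for any alert document written without the field.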
pmuellr left a comment
ResponseOps changes LGTM
20095f4 to 55b32cb
Approvability
Verdict: Needs human review
This PR adds a new field to alert documents, constituting a feature addition with runtime behavior changes. The author doesn't own any of the modified files (all owned by @elastic/obs-presentation-team), and there's an unresolved design question about consistency with other rule types.
|
Starting backport for target branches: 9.2, 9.3 https://github.com/elastic/kibana/actions/runs/23543444881
💔 All backports failed
Manual backport
To create the backport manually run:
Questions?
Please refer to the Backport tool documentation
…254904) ## Summary Fixes elastic#224898 Implements `kibana.alert.grouping` for the APM Latency threshold rule (`apm.transaction_duration`) in line with the Observability grouping initiative. ### What changed - Added `kibana.alert.grouping` to active alert payloads in the transaction duration executor. - Updated recovered alert context to prefer `kibana.alert.grouping` from the recovered alert document, with a backward-compatible fallback to reconstructed grouping for older alerts. - Updated latency rule unit tests to validate payload/context behavior. - Updated deployment-agnostic APM API integration tests to assert `kibana.alert.grouping` is indexed with the expected nested structure. <img width="1550" height="1504" alt="Screenshot 2026-02-25 at 12 57 53" src="https://github.com/user-attachments/assets/d7fd2fc6-788b-4d4a-8c02-6b3a25d1d635" /> ## Why This ensures grouping is first-class in alert documents for filtering/searching and keeps recovered `context.grouping` aligned with the alert document source of truth. ## Test plan - `yarn test:jest x-pack/solutions/observability/plugins/apm/server/routes/alerts/rule_types/transaction_duration/register_transaction_duration_rule_type.test.ts` - `yarn test:ftr --config x-pack/solutions/observability/test/api_integration_deployment_agnostic/configs/stateful/oblt.apm.stateful.config.ts --grep "transaction duration alert"` ## Validation results - Unit tests: **PASS** (7/7) - Deployment-agnostic stateful APM suite (filtered to transaction duration alert): **PASS** (22 passing, exit code 0) ## Notes - No saved object migration is required. - Shared APM mapping support for `kibana.alert.grouping.*` is additive. --------- Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
|
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Summary
Fixes #224898
Implements kibana.alert.grouping for the APM Latency threshold rule (apm.transaction_duration) in line with the Observability grouping initiative.
What changed
- Added kibana.alert.grouping to active alert payloads in the transaction duration executor.
- Updated recovered alert context to prefer kibana.alert.grouping from the recovered alert document, with a backward-compatible fallback to reconstructed grouping for older alerts.
- Updated latency rule unit tests to validate payload/context behavior.
- Updated deployment-agnostic APM API integration tests to assert kibana.alert.grouping is indexed with the expected nested structure.
Why
This ensures grouping is first-class in alert documents for filtering/searching and keeps recovered context.grouping aligned with the alert document source of truth.
Test plan
- yarn test:jest x-pack/solutions/observability/plugins/apm/server/routes/alerts/rule_types/transaction_duration/register_transaction_duration_rule_type.test.ts
- yarn test:ftr --config x-pack/solutions/observability/test/api_integration_deployment_agnostic/configs/stateful/oblt.apm.stateful.config.ts --grep "transaction duration alert"
Validation results
- Unit tests: PASS (7/7)
- Deployment-agnostic stateful APM suite (filtered to transaction duration alert): PASS (22 passing, exit code 0)
Notes
- No saved object migration is required.
- Shared APM mapping support for kibana.alert.grouping.* is additive.
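The filtering use case mentioned under Why can be sketched as an Elasticsearch query body built from group-by values. This is only an illustration: `buildGroupingFilter` is a hypothetical helper, and using `term` clauses assumes the `kibana.alert.grouping.*` subfields are mapped as keywords.

```typescript
// Hypothetical helper: turns flat group-by values into an Elasticsearch
// bool filter over the corresponding kibana.alert.grouping.* subfields.
function buildGroupingFilter(grouping: Record<string, string>) {
  return {
    bool: {
      filter: Object.entries(grouping).map(([field, value]) => ({
        // Assumes the subfield is a keyword, so `term` matches exactly.
        term: { [`kibana.alert.grouping.${field}`]: value },
      })),
    },
  };
}

// Values taken from the local verification earlier in the thread.
const query = buildGroupingFilter({
  'service.name': 'frontend-proxy',
  'transaction.type': 'request',
});

console.log(JSON.stringify(query, null, 2));
```

The resulting object could be passed as the `query` of a search request against the alerts index; the index name is deliberately left out because it is not stated in this thread.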