[eem] remove history transforms#193999
Conversation
… src/core/server/integration_tests/ci_checks'
| ]) | ||
| ), | ||
| installStartedAt: z.optional(z.string()), | ||
| installedComponents: z.optional( |
this allows cleaning up versions of definitions that contain history components
| export const ENTITY_DEFAULT_METADATA_LIMIT = 1000; | ||
| export const ENTITY_DEFAULT_LATEST_FREQUENCY = '1m'; | ||
| export const ENTITY_DEFAULT_LATEST_SYNC_DELAY = '60s'; | ||
| export const ENTITY_DEFAULT_METADATA_LIMIT = 10; |
took the opportunity to reduce the default limit
💛 Build succeeded, but was flaky
| export const SO_ENTITY_DEFINITION_TYPE = 'entity-definition'; | ||
|
|
||
| export const backfillInstalledComponents: SavedObjectModelDataBackfillFn< |
|
@jaredburgettelastic @hop-dev I'll be merging that change soon, could you please verify that it doesn't break anything that would not be covered by tests?
|
@klacabane please go ahead and merge when you're ready, we have a big PR incoming and will fix any issues in that |
simianhacker left a comment:
I added some comments about changing the entity.id generation with this change. Since this PR makes a significant change, I'd advocate for changing that too.
| { | ||
| script: { | ||
| description: 'Generated the entity.id field', | ||
| source: cleanScript(` | ||
| // This function will recursively collect all the values of a HashMap of HashMaps | ||
| Collection collectValues(HashMap subject) { | ||
| Collection values = new ArrayList(); | ||
| // Iterate through the values | ||
| for(Object value: subject.values()) { | ||
| // If the value is a HashMap, recurse | ||
| if (value instanceof HashMap) { | ||
| values.addAll(collectValues((HashMap) value)); | ||
| } else { | ||
| values.add(String.valueOf(value)); | ||
| } | ||
| } | ||
| return values; | ||
| } | ||
|
|
||
| // Create the string builder | ||
| StringBuilder entityId = new StringBuilder(); | ||
|
|
||
| if (ctx["entity"]["identity"] != null) { | ||
| // Get the values as a collection | ||
| Collection values = collectValues(ctx["entity"]["identity"]); | ||
|
|
||
| // Convert to a list and sort | ||
| List sortedValues = new ArrayList(values); | ||
| Collections.sort(sortedValues); | ||
|
|
||
| // Create colon-delimited string | ||
| for(String instanceValue: sortedValues) { | ||
| entityId.append(instanceValue); | ||
| entityId.append(":"); | ||
| } | ||
|
|
||
| // Assign the entity.id | ||
| ctx["entity"]["id"] = entityId.length() > 0 ? entityId.substring(0, entityId.length() - 1) : "unknown"; | ||
| } | ||
| `), | ||
| }, | ||
| }, | ||
| { | ||
| fingerprint: { | ||
| fields: ['entity.id'], | ||
| target_field: 'entity.id', | ||
| method: 'MurmurHash3', | ||
| }, | ||
| }, |
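For readers less familiar with Painless, the script quoted above can be approximated in TypeScript. This is only a sketch of the same logic (collect every leaf value under `entity.identity`, sort, join with `:`); the type and function names are hypothetical, and the MurmurHash3 fingerprint step that follows in the pipeline is deliberately left out.

```typescript
// Hypothetical TypeScript rendering of the Painless script; names are illustrative only.
type IdentityTree = {
  [key: string]: IdentityTree | string | number | boolean | null;
};

// Recursively collect every leaf value of a nested object as a string.
function collectValues(subject: IdentityTree): string[] {
  const values: string[] = [];
  for (const value of Object.values(subject)) {
    if (value !== null && typeof value === 'object') {
      // Nested object: recurse, mirroring the HashMap branch of the script.
      values.push(...collectValues(value));
    } else {
      values.push(String(value));
    }
  }
  return values;
}

// Sort the collected values and join them with ":"; fall back to "unknown" when empty.
function buildEntityId(identity: IdentityTree): string {
  const sorted = collectValues(identity).sort();
  return sorted.length > 0 ? sorted.join(':') : 'unknown';
}
```

Because the values themselves are sorted, the order of the parts depends on the data rather than on the identity fields: `{ service: { name: 'web', environment: 'prod' } }` yields `prod:web` in this sketch.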
@tommyers-elastic I kind of wonder if we shouldn't take this opportunity to simplify the entity.id generation? I propose we remove the hashing AND change the entity.id to be:
{
set: {
field: 'entity.id',
value: definition.identityFields.map((identityField) => `{{entity.identity.${identityField.field}}}`).join(':'),
}
}
For something like ['service.name', 'service.environment'] it would look like {{entity.identity.service.name}}:{{entity.identity.service.environment}}
Thoughts @miltonhultgren @klacabane
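A minimal sketch of the proposal above, assuming each identity field exposes a `field` string as in the snippet. The `IdentityField` interface and both function names are hypothetical and not taken from the Kibana codebase.

```typescript
// Hypothetical shape; only the `field` property comes from the comment above.
interface IdentityField {
  field: string;
}

// Build the Mustache template proposed as the `set` processor's value.
function buildEntityIdTemplate(identityFields: IdentityField[]): string {
  return identityFields
    .map((identityField) => `{{entity.identity.${identityField.field}}}`)
    .join(':');
}

// Assemble the `set` processor object in the shape shown in the comment.
function buildEntityIdProcessor(identityFields: IdentityField[]) {
  return {
    set: {
      field: 'entity.id',
      value: buildEntityIdTemplate(identityFields),
    },
  };
}
```

With this approach the parts of the id follow the order of the identity fields on the definition, and the resulting id stays human-readable because no hashing is applied.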
This makes sense and it is implemented in #193652. I'd love to keep these separate, as this PR has been open for a while now and all the lights are finally green.
Starting backport for target branches: 8.x https://github.com/elastic/kibana/actions/runs/11263190736 |
### Summary

Remove history and backfill transforms, leaving latest transform in place.

Notable changes to latest transform:
- it does not read from history output anymore but source indices defined on the definition
- it defines a `latest.lookbackPeriod` to limit the amount of data ingested, which defaults to 24h
- each metadata aggregation now accepts a `metadata.aggregation.lookbackPeriod` which defaults to the `latest.lookbackPeriod`
- `entity.firstSeenTimestamp` is removed. this should be temporary until we have a solution for elastic/elastic-entity-model#174
- latest metrics used to get the latest pre-computed value from history data, but it is now aggregating over the `lookbackPeriod` in the source indices (which can be filtered down with `metrics.filter`)
- `latest` block on the entity definition is now mandatory

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Mark Hopkin <mark.hopkin@elastic.co>

(cherry picked from commit 8f8e988)
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI. Questions? Please refer to the Backport tool documentation
# Backport

This will backport the following commits from `main` to `8.x`:
- [[eem] remove history transforms (#193999)](#193999)

### Questions?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Kevin Lacabane <kevin.lacabane@elastic.co>

Summary
Remove history and backfill transforms, leaving latest transform in place.
Notable changes to latest transform:
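- it does not read from history output anymore but source indices defined on the definition
- it defines a `latest.lookbackPeriod` to limit the amount of data ingested, which defaults to 24h
- each metadata aggregation now accepts a `metadata.aggregation.lookbackPeriod` which defaults to the `latest.lookbackPeriod`
- `entity.firstSeenTimestamp` is removed. this should be temporary until we have a solution for https://github.com/elastic/elastic-entity-model/issues/174
- latest metrics used to get the latest pre-computed value from history data, but it is now aggregating over the `lookbackPeriod` in the source indices (which can be filtered down with `metrics.filter`)
- `latest` block on the entity definition is now mandatory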
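
The lookback defaults described in the summary could be resolved along these lines. This is a hedged sketch, not the code from the PR: the interfaces, the constant name, and both functions are hypothetical, and only the `lookbackPeriod` option names and the 24h default come from the description.

```typescript
// Hypothetical shapes; only the `lookbackPeriod` names and the 24h default
// are taken from the PR description.
interface LatestSettings {
  lookbackPeriod?: string;
}

interface MetadataAggregation {
  lookbackPeriod?: string;
}

// Assumed constant name for the documented 24h default.
const DEFAULT_LATEST_LOOKBACK_PERIOD = '24h';

// `latest.lookbackPeriod` falls back to 24h when not set.
function resolveLatestLookback(latest: LatestSettings): string {
  return latest.lookbackPeriod ?? DEFAULT_LATEST_LOOKBACK_PERIOD;
}

// A metadata aggregation's lookback falls back to the latest lookback.
function resolveMetadataLookback(
  aggregation: MetadataAggregation,
  latest: LatestSettings
): string {
  return aggregation.lookbackPeriod ?? resolveLatestLookback(latest);
}
```

The fallback chain is therefore: the aggregation's own value, then `latest.lookbackPeriod`, then 24h.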