
Update entities schema to support integrations data #262242

Merged
uri-weisman merged 20 commits into elastic:main from uri-weisman:feat/entity-store-ecs-relationship-bags
Apr 13, 2026

Conversation

@uri-weisman
Contributor

@uri-weisman uri-weisman commented Apr 9, 2026

Summary

Update entities schema to align with the populated integrations data.

related:

Summary
Refines how relationship fields from logs are materialized on entity documents in the entity store v2 index:
each relationship besides resolution (e.g. owns, supervises) is now an object of { raw_identifiers, ids }.

We introduce a data structure that lets integrations and entity maintainers populate as many identifiers as possible. In the future, this will let us run a maintainer over the entity store indices and correlate those identifiers with entities.

What changed:

For each supported relationship, identifier fields are collected into entity.relationships.<relationship>.raw_identifiers.*. The entity.relationships.<relationship>.ids field should hold EUIDs and will be updated by entity maintainers.

Integration data (source) example:

{
  "@timestamp": "2024-09-01T12:00:00.000Z",
  "event": { "module": "endpoint", "dataset": "endpoint.events.network" },
  "user": {
    "id": "S-1-5-21-…-1001",
    "name": "jdoe",
    "entity": {
      "id": "user:jdoe@corp",
      "type": "user",
      "relationships": {
        "owns": {
          "host.name": ["prod-db-01", "prod-db-02"],
          "host.id": ["h-aaa", "h-bbb"]
        },
        "supervises": {
          "user.email": ["john123@corp"]
        }
      }
    }
  }
}

Stored shape in .entities.v2 without post-processing:

"entity": {
  "namespace": "workday",
  "relationships": {
    "owns": {
      "raw_identifiers": {
        "host.name": ["prod-db-01", "prod-db-02"],
        "host.id": ["h-aaa", "h-bbb"]
      },
      "ids": []
    },
    "supervises": {
      "raw_identifiers": { "user.email": ["john123@corp"] },
      "ids": []
    }
  }
}
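
As a rough illustration of the mapping from the integration document to the stored shape, here is a minimal TypeScript sketch. The type and function names are hypothetical and do not come from the Kibana codebase; it only wraps each relationship's identifier bag as { raw_identifiers, ids } and does not handle the resolution relationship, which keeps its own format.

```typescript
// Hypothetical types for illustration; not the actual Kibana definitions.
type RawIdentifiers = Record<string, string[]>;

interface StoredRelationship {
  raw_identifiers: RawIdentifiers;
  ids: string[]; // EUIDs; empty until something resolves them
}

// Wrap each relationship's identifier bag as { raw_identifiers, ids: [] }.
function toStoredRelationships(
  source: Record<string, Record<string, string | string[]>>
): Record<string, StoredRelationship> {
  const stored: Record<string, StoredRelationship> = {};
  for (const relationship of Object.keys(source)) {
    const raw: RawIdentifiers = {};
    for (const field of Object.keys(source[relationship])) {
      const value = source[relationship][field];
      // Accept a scalar or an array, and always store an array.
      raw[field] = Array.isArray(value) ? value.slice() : [value];
    }
    stored[relationship] = { raw_identifiers: raw, ids: [] };
  }
  return stored;
}

const stored = toStoredRelationships({
  owns: {
    "host.name": ["prod-db-01", "prod-db-02"],
    "host.id": ["h-aaa", "h-bbb"],
  },
  supervises: { "user.email": ["john123@corp"] },
});
console.log(JSON.stringify(stored, null, 2));
```

Starting ids as an empty array mirrors the "without post-processing" example: nothing has been resolved yet at ingest time.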

Another entity in the store that the relationship's attributes refer to:

"entity": {
  "id": "user:john123@corp@workday",
  "namespace": "workday",
  "user": {
    "email": "john123@corp"
  }
}

Stored shape in .entities.v2 after post-processing:

"entity": {
  "relationships": {
    "owns": {
      "raw_identifiers": {
        "host.name": ["prod-db-01", "prod-db-02"],
        "host.id": ["h-aaa", "h-bbb"]
      },
      "ids": ["host:prod-db-01@corp"]
    },
    "supervises": {
      "raw_identifiers": { "user.email": ["john123@corp"] },
      "ids": ["user:john123@corp@workday"]
    }
  }
}
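
The PR does not describe how a maintainer correlates raw identifiers with entities, so the following TypeScript sketch is only one possible reading: it assumes an exact match on any single identifier field is enough. The KnownEntity shape and the resolveIds function are hypothetical names introduced for illustration.

```typescript
// Hypothetical shape: an entity reduced to its EUID and flattened identifier fields.
interface KnownEntity {
  id: string; // EUID, e.g. "user:john123@corp@workday"
  fields: Record<string, string>; // e.g. { "user.email": "john123@corp" }
}

// Assumption: an entity matches when any one raw identifier value
// equals the entity's value for the same field.
function resolveIds(
  rawIdentifiers: Record<string, string[]>,
  known: KnownEntity[]
): string[] {
  const ids = new Set<string>();
  for (const entity of known) {
    for (const field of Object.keys(rawIdentifiers)) {
      const candidate = entity.fields[field];
      if (candidate !== undefined && rawIdentifiers[field].includes(candidate)) {
        ids.add(entity.id);
        break; // one matching field is enough under this assumption
      }
    }
  }
  return Array.from(ids);
}

const resolved = resolveIds(
  { "user.email": ["john123@corp"] },
  [
    { id: "user:john123@corp@workday", fields: { "user.email": "john123@corp" } },
    { id: "user:other@corp@workday", fields: { "user.email": "other@corp" } },
  ]
);
console.log(resolved);
```

A real maintainer would likely need stricter rules, for example taking the namespace into account, before writing into ids.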

Values in raw_identifiers and ids are aggregated over time per entity-store rules (collect / dedupe), so a stored document may hold more values than any single event contributed.
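
To make the collect / dedupe behavior concrete, here is a minimal TypeScript sketch that treats it as a per-field set union. This is an interpretation rather than the entity store's actual aggregation logic, and RelationshipBag, union, and mergeBags are hypothetical names.

```typescript
// Hypothetical shape mirroring { raw_identifiers, ids } from the PR description.
interface RelationshipBag {
  raw_identifiers: Record<string, string[]>;
  ids: string[];
}

// Union two string arrays, keeping first-seen order and dropping duplicates.
function union(a: string[], b: string[]): string[] {
  return Array.from(new Set(a.concat(b)));
}

// Merge an incoming bag into the existing one, field by field.
function mergeBags(existing: RelationshipBag, incoming: RelationshipBag): RelationshipBag {
  const raw: Record<string, string[]> = {};
  const fields = union(
    Object.keys(existing.raw_identifiers),
    Object.keys(incoming.raw_identifiers)
  );
  for (const field of fields) {
    raw[field] = union(
      existing.raw_identifiers[field] ?? [],
      incoming.raw_identifiers[field] ?? []
    );
  }
  return { raw_identifiers: raw, ids: union(existing.ids, incoming.ids) };
}

const merged = mergeBags(
  { raw_identifiers: { "host.name": ["prod-db-01"] }, ids: [] },
  {
    raw_identifiers: { "host.name": ["prod-db-01", "prod-db-02"], "host.id": ["h-aaa"] },
    ids: ["host:prod-db-01@corp"],
  }
);
console.log(JSON.stringify(merged));
```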

@uri-weisman uri-weisman marked this pull request as ready for review April 9, 2026 18:12
@uri-weisman uri-weisman requested a review from a team as a code owner April 9, 2026 18:12
@uri-weisman uri-weisman requested a review from a team as a code owner April 12, 2026 08:45
@uri-weisman uri-weisman added the release_note:skip, backport:version, and v9.4.0 labels Apr 12, 2026
@niros1
Contributor

niros1 commented Apr 12, 2026

Thanks Uri
Just want to point out that ids can store "resolved" IDs from the raw data or from any other maintainer or source.

@uri-weisman
Contributor Author

@coderabbitai full review

@coderabbitai
Contributor

coderabbitai Bot commented Apr 12, 2026

✅ Actions performed

Full review triggered.

@uri-weisman
Contributor Author

uri-weisman commented Apr 12, 2026

Just want to point out that ids can store "resolved" IDs from the raw data or from any other maintainer or source.

@niros1
An entity maintainer will emit an API request as follows:

{
  "user": {
    "id": "uri@elastic",
    "relationships": {
      "owns": {
        "ids": ["host:prod-db-01@corp"]
      },
      "supervises": {
        "ids": ["user:john123@corp@workday"]
      }
    }
  }
}

The integration log will look as follows (integrations are less likely to populate entity.id).
(I don't think we should expose implementation details in an ECS-based schema; all identifiers, including entity.id, should be published under the same structure.)

{
  "@timestamp": "2024-09-01T12:00:00.000Z",
  "event": { "module": "endpoint", "dataset": "endpoint.events.network" },
  "user": {
    "id": "uri@elastic",
    "name": "jdoe",
    "entity": {
      "id": "user:jdoe@corp",
      "type": "user",
      "relationships": {
        "owns": {
          "host.name": "blabla",
          "entity.id": ["host:prod-db-01@corp"]
        },
        "supervises": {
          "entity.id": ["user:john123@corp@workday"]
        }
      }
    }
  }
}

@coderabbitai
Contributor

coderabbitai Bot commented Apr 12, 2026

✅ Actions performed

Full review triggered.

return `${col} = COALESCE(\`entity.relationships.resolution.resolved_to\`, [""])`;
}

return `${col} = COALESCE(\`entity.relationships.${field}.ids\`, [""])`;
Member


Is the graph OK with having correlations for IDs only?

Contributor


As I understand it, that was the format before the change, except that the IDs sat directly under the {field}.
@kfirpeled ?

Contributor


The graph works with IDs only.
To correlate entities with events, we pull the relevant data per node and, based on the enginemetadata, build the filters using the DSL functions.

It is now in review: #261420
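
For readers following this thread, the hunk under review can be read as a function that picks one ES|QL column per relationship. The TypeScript sketch below is a reconstruction: the two return statements come from the diff, while the function name and the `field === "resolution"` condition are assumptions, since the hunk does not show them.

```typescript
// Build the ES|QL assignment for one relationship column.
// Assumption: the branch that is not visible in the hunk tests for "resolution".
function relationshipIdsExpression(col: string, field: string): string {
  if (field === "resolution") {
    // Resolution keeps its own field rather than the { raw_identifiers, ids } object.
    return `${col} = COALESCE(\`entity.relationships.resolution.resolved_to\`, [""])`;
  }
  // Every other relationship reads EUIDs from the new .ids subfield.
  return `${col} = COALESCE(\`entity.relationships.${field}.ids\`, [""])`;
}

console.log(relationshipIdsExpression("owns_ids", "owns"));
console.log(relationshipIdsExpression("resolved_ids", "resolution"));
```

COALESCE supplies a fallback of [""] when the column is null, so relationships without resolved IDs still yield a value for the graph query.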

@uri-weisman
Contributor Author

/ci

Contributor

@kfirpeled kfirpeled left a comment


Checked on an existing environment: the graph fails to load due to this change.

I tried to "clear all data" from the entity store, but now the entity store cannot recover due to another issue.

Here is the error message from the API:

Error log 📁
"message": "verification_exception: Found 7 problems
line 3:49: Unknown column [entity.relationships.accesses_frequently.ids], did you mean any of [entity.relationships.accesses_frequently, entity.relationships.accesses_infrequently, entity.relationships.owns_inferred, entity.relationships.supervises, entity.relationships.depends_on, entity.relationships.communicates_with, entity.relationships.resolution.resolved_to, entity.relationships.owns]?
line 4:51: Unknown column [entity.relationships.accesses_infrequently.ids], did you mean any of [entity.relationships.accesses_infrequently, entity.relationships.accesses_frequently, entity.relationships.owns_inferred, entity.relationships.resolution.resolved_to, entity.relationships.communicates_with, entity.relationships.supervises, entity.relationships.depends_on, entity.relationships.resolution.risk.calculated_score, entity.relationships.owns]?
line 5:47: Unknown column [entity.relationships.communicates_with.ids], did you mean any of [entity.relationships.communicates_with, entity.relationships.owns_inferred, entity.relationships.supervises, entity.relationships.owns, entity.relationships.depends_on, entity.relationships.resolution.resolved_to, entity.relationships.accesses_frequently, entity.relationships.accesses_infrequently, entity.relationships.resolution.risk.calculated_score_norm]?
line 6:40: Unknown column [entity.relationships.depends_on.ids], did you mean any of [entity.relationships.depends_on, entity.relationships.supervises, entity.relationships.owns, entity.relationships.owns_inferred, entity.relationships.communicates_with, entity.relationships.resolution.resolved_to, entity.relationships.accesses_frequently, entity.relationships.accesses_infrequently, entity.relationships.resolution.risk.calculated_score, entity.relationships.resolution.risk.calculated_level]?
line 7:34: Unknown column [entity.relationships.owns.ids], did you mean any of [entity.relationships.owns, entity.relationships.owns_inferred, entity.relationships.supervises, entity.relationships.depends_on, entity.relationships.communicates_with, entity.relationships.resolution.resolved_to, entity.relationships.accesses_frequently, entity.relationships.accesses_infrequently, entity.relationships.resolution.risk.calculated_score]?
line 8:43: Unknown column [entity.relationships.owns_inferred.ids], did you mean any of [entity.relationships.owns_inferred, entity.relationships.owns, entity.relationships.supervises, entity.relationships.accesses_infrequently, entity.relationships.communicates_with, entity.relationships.depends_on, entity.relationships.accesses_frequently, entity.relationships.resolution.resolved_to]?
line 10:40: Unknown column [entity.relationships.supervises.ids], did you mean any of [entity.relationships.supervises, entity.relationships.depends_on, entity.relationships.communicates_with, entity.relationships.owns_inferred, entity.relationships.owns, entity.relationships.resolution.resolved_to, entity.relationships.accesses_frequently, entity.relationships.accesses_infrequently, entity.relationships.resolution.risk.calculated_score]?"
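
When triaging a message like this, it can help to pull out just the offending column names. The TypeScript helper below is a small sketch for that purpose; the function name and the regular expression are illustrative and are not part of Kibana or Elasticsearch.

```typescript
// Extract every column named in "Unknown column [...]" from an error message.
function unknownColumns(message: string): string[] {
  const pattern = /Unknown column \[([^\]]+)\]/g;
  const columns: string[] = [];
  let match: RegExpExecArray | null;
  // exec() with the g flag advances through the string one match at a time.
  while ((match = pattern.exec(message)) !== null) {
    columns.push(match[1]);
  }
  return columns;
}

// Shortened sample in the same format as the log above.
const sample =
  "verification_exception: Found 2 problems\n" +
  "line 7:34: Unknown column [entity.relationships.owns.ids], did you mean any of [entity.relationships.owns]?\n" +
  "line 10:40: Unknown column [entity.relationships.supervises.ids], did you mean any of [entity.relationships.supervises]?";
console.log(unknownColumns(sample));
```

In the log above, every reported column ends in .ids while the suggested alternatives are the parent relationship fields.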

@uri-weisman
Contributor Author

/ci

Contributor

@kfirpeled kfirpeled left a comment


Tested on an existing serverless environment with the owns and resolution relationships, which together cover all the edge cases.

Both worked.


@uri-weisman
Contributor Author

/ci

@uri-weisman uri-weisman enabled auto-merge (squash) April 13, 2026 16:08
@elasticmachine
Contributor

elasticmachine commented Apr 13, 2026

💛 Build succeeded, but was flaky

Failed CI Steps

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| entityStore | 128.0KB | 128.1KB | +103.0B |

History

@uri-weisman uri-weisman merged commit 1270467 into elastic:main Apr 13, 2026
18 checks passed
@uri-weisman uri-weisman deleted the feat/entity-store-ecs-relationship-bags branch April 13, 2026 16:33
@kibanamachine
Contributor

Starting backport for target branches: 9.4

https://github.com/elastic/kibana/actions/runs/24354816960

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Apr 13, 2026
@kibanamachine
Contributor

💚 All backports created successfully

Status Branch Result
9.4

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Apr 13, 2026
…262856)

# Backport

This will backport the following commits from `main` to `9.4`:
- [Update entities schema to support integrations data
(#262242)](#262242)

<!--- Backport version: 9.6.6 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)


Co-authored-by: Uri Weisman <68195305+uri-weisman@users.noreply.github.com>
seanrathier added a commit to seanrathier/kibana that referenced this pull request Apr 13, 2026
Tracks the changes needed in communicates_with and accesses maintainers
to align with the EntityRelationship schema introduced in elastic#262242.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
seanrathier added a commit to seanrathier/kibana that referenced this pull request Apr 28, 2026
Tracks the changes needed in communicates_with and accesses maintainers
to align with the EntityRelationship schema introduced in elastic#262242.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
seanrathier added a commit to seanrathier/kibana that referenced this pull request Apr 29, 2026
Tracks the changes needed in communicates_with and accesses maintainers
to align with the EntityRelationship schema introduced in elastic#262242.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Labels

backport:version, ci:build-serverless-image, release_note:skip, v9.4.0, v9.5.0


6 participants