Entity store/maintainer framework poc #253173
Pull request overview
This PR introduces a maintainer task framework for the entity_store plugin, enabling other plugins to register custom recurring tasks that execute within the entity store context (e.g., per-space maintenance operations). The framework includes a public API for registration, persistence via saved objects, and Task Manager integration for scheduling.
Changes:
- Added public API `registerEntityMaintainer` to plugin setup contract for task registration
- Created `EntityMaintainersTasksClient` and saved object type for persistent task storage
- Implemented task scheduling in `scheduleEntityMaintainerTasks` triggered during entity store initialization
- Modified install flow to consolidate initialization and maintainer scheduling in `AssetManager.init`
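For readers who want a concrete picture of the registration API, here is a minimal sketch. The field names (`id`, `interval`, `run`, optional `setup`, `initialState`, `description`) come from the PR description; the exact types, the `MaintainerRegistry` class, and the duplicate-id check are assumptions made for illustration and are not the plugin's actual implementation.

```typescript
// Hypothetical shapes: field names follow the PR description, but the real
// types in the entity_store plugin may differ.
interface MaintainerStatus {
  state: Record<string, unknown>;
  metadata: { runs: number };
}

interface EntityMaintainerConfig {
  id: string;
  interval: string; // e.g. '20s'
  description?: string;
  initialState?: Record<string, unknown>;
  setup?: (ctx: { status: MaintainerStatus }) => Promise<Record<string, unknown>>;
  run: (ctx: { status: MaintainerStatus }) => Promise<Record<string, unknown>>;
}

// Stand-in for the plugin setup contract; keeps registrations in memory only.
class MaintainerRegistry {
  private readonly maintainers = new Map<string, EntityMaintainerConfig>();

  registerEntityMaintainer(config: EntityMaintainerConfig): void {
    // Assumption: duplicate ids are rejected. The PR does not state this.
    if (this.maintainers.has(config.id)) {
      throw new Error(`Maintainer "${config.id}" is already registered`);
    }
    this.maintainers.set(config.id, config);
  }

  // Returns only what the scheduling phase needs: id and interval.
  list(): Array<{ id: string; interval: string }> {
    return Array.from(this.maintainers.values()).map(({ id, interval }) => ({ id, interval }));
  }
}

const registry = new MaintainerRegistry();
registry.registerEntityMaintainer({
  id: 'entity-maintainer-task-test',
  interval: '20s',
  initialState: { counter: 0 },
  // Each run returns the next state; here it just bumps a counter.
  run: async ({ status }) => ({
    ...status.state,
    counter: Number(status.state.counter ?? 0) + 1,
  }),
});
```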
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| types.ts | Exports new setup contract with registerEntityMaintainer method |
| extract_entity_task.ts | Added cleanup for test maintainer task on stop |
| entity_maintainer_task/types.ts | Defines task configuration interface and status metadata types |
| entity_maintainer_task/index.ts | Implements task registration, scheduling, and runner logic |
| constants.ts | Added entityMaintainer to task type enum |
| config.ts | Added configuration entry for maintainer tasks with optional interval |
| install.ts | Simplified to call unified assetManager.init method |
| request_context_factory.ts | Instantiates EntityMaintainersTasksClient for request context |
| plugin.ts | Returns setup contract with registerEntityMaintainer and registers new saved object type |
| saved_objects/index.ts | Exports maintainer tasks type and client |
| entity_maintainers_tasks_type.ts | Defines saved object type for storing registered maintainer tasks |
| entity_maintainers_tasks_client.ts | Client for CRUD operations on maintainer tasks saved object |
| constants.ts | Added schema and type for maintainer task entries |
| asset_manager.ts | Refactored to add init method that schedules maintainer tasks after entity initialization |
    EntityMaintainersTasksTypeName,
    ]);
    const maintainerTasksClient = new EntityMaintainersTasksClient(internalRepo, logger);
    maintainerTasksClient.addOrUpdate({ id, interval }).catch((err) => {
Not exactly, the id and interval are saved asynchronously to the saved object repository.
The maintainer task registration does not depend on that persistence step.
But then we don't have the guarantee that the promise has finished? Seems interesting to have it hanging
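One way to address the hanging-promise concern is to keep registration synchronous for callers while remembering the pending writes, so the scheduling phase can wait for them. The sketch below is only an illustration: `InMemoryTasksClient` is a hypothetical stand-in for `EntityMaintainersTasksClient`, and `TrackedRegistration` and `whenPersisted` are names invented here.

```typescript
interface MaintainerEntry {
  id: string;
  interval: string;
}

// Hypothetical stand-in for EntityMaintainersTasksClient; stores entries in memory.
class InMemoryTasksClient {
  readonly saved: MaintainerEntry[] = [];

  async addOrUpdate(entry: MaintainerEntry): Promise<void> {
    const index = this.saved.findIndex((e) => e.id === entry.id);
    if (index === -1) {
      this.saved.push(entry);
    } else {
      this.saved[index] = entry;
    }
  }
}

class TrackedRegistration {
  private readonly pending: Array<Promise<void>> = [];
  readonly errors: unknown[] = [];

  constructor(private readonly client: InMemoryTasksClient) {}

  // Synchronous for callers, but the write is tracked instead of left hanging.
  register(entry: MaintainerEntry): void {
    const write = this.client.addOrUpdate(entry).catch((err: unknown) => {
      // Failures are collected so the caller can decide what to do with them.
      this.errors.push(err);
    });
    this.pending.push(write);
  }

  // Intended to be awaited before scheduling, so every id/interval pair is stored.
  async whenPersisted(): Promise<void> {
    await Promise.all(this.pending);
  }
}
```

With this shape, the install flow could call `whenPersisted()` before reading the registered ids, which gives the guarantee asked about without blocking plugin setup.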
    dynamic: false,
    properties: {
    'entity-maintainers-tasks': {
    type: 'nested',
Instead of nested, I think it would be simpler and more in line with other saved object types to have one document per maintainer.
@hop-dev do you have examples of how others behave?
Tbh, I wonder if we even need to have saved objects for this. Isn't this something that we have all in memory always? Why do we need to store in ES? If we have a kibana restart, the maintainer will be registered again, won't it?
Yeah, if you go to packages/kbn-check-saved-objects-cli/current_mappings.json, which is part of this PR's diff, we dump all of the saved object mappings in there. This would be the 11th use of nested in a saved object type in Kibana.
The other ones have lots of other properties on the object, whereas here we have one single property which is nested, which to me is like saying "these things are separate documents" because we have no shared maintainer properties.
@romulets The reason we are using a saved object is that we need the id and interval for the scheduling phase.
This is the order:
- On the setup phase: registering the maintainers with all the configuration they need.
- On the entity-store install phase: scheduling the maintainers; here we need to get all the ids of the registered maintainers.
@hop-dev I will look into whether there is a better option than nested.
type: 'nested': in Elasticsearch, an array of objects should be mapped as nested so each element is a single "document" and the id/interval pair stays together.
@hop-dev that is exactly what I want
My point is if that's what you want, and you have no other properties on this saved object, why not just have them as separate objects? I won't keep pushing the point, the core team will review too so they might give us some helpful guidance.
I want them to be together because, on install, when fetching all the registered ids, I need their related intervals.
If I keep them separately, I will lose the connection between a maintainer id and its interval.
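To make the two options in this thread concrete, here is a small sketch of both storage shapes and how each would be read back into id/interval pairs. Only the `entity-maintainers-tasks` property name comes from the PR; the `PerMaintainerDoc` shape is a hypothetical illustration of the "one document per maintainer" suggestion, under the assumption that each such document carries its own interval.

```typescript
interface MaintainerEntry {
  id: string;
  interval: string;
}

// Shape A: a single saved object holding a nested array (the approach in this PR).
interface NestedDoc {
  'entity-maintainers-tasks': MaintainerEntry[];
}

// Shape B: one saved object per maintainer (hypothetical alternative).
// The saved object id doubles as the maintainer id.
interface PerMaintainerDoc {
  id: string;
  attributes: { interval: string };
}

function fromNested(doc: NestedDoc): MaintainerEntry[] {
  return doc['entity-maintainers-tasks'].map(({ id, interval }) => ({ id, interval }));
}

function fromPerMaintainer(docs: PerMaintainerDoc[]): MaintainerEntry[] {
  return docs.map((doc) => ({ id: doc.id, interval: doc.attributes.interval }));
}

const nested: NestedDoc = {
  'entity-maintainers-tasks': [
    { id: 'maintainer-a', interval: '20s' },
    { id: 'maintainer-b', interval: '5m' },
  ],
};

const perMaintainer: PerMaintainerDoc[] = [
  { id: 'maintainer-a', attributes: { interval: '20s' } },
  { id: 'maintainer-b', attributes: { interval: '5m' } },
];
```

In this sketch both shapes yield the same list of pairs, so the difference is mostly about how many documents are read and written rather than about keeping an id with its interval.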
@romulets The reason we are using saved object is because we need the id and interval for the
Don't we have this in memory always?
romulets left a comment
Overall looks good, two comments:
- Do we even need saved objects?
- I'd add a README so we can point teams that use it to how to use it and to the lifecycle of a maintainer (when it installs, when setup runs, and things like that)
    const isFirstRun = currentStatus.metadata.runs === 0;
    if (isFirstRun && setup) {
    logger.debug(`First run, executing setup`);
    currentStatus.state = await setup({
This is an important lifecycle note: setup runs on the first task run, not in any Kibana lifecycle phase or when a security page is first opened.
Personally, I think this setup could be hooked into the install process of the entity store. But I'm fine with this decision if @hop-dev doesn't see any problem.
It's just something that needs to be documented.
I think that on install might be better for us. E.g. risk scoring setup will create the risk scoring index. The UI queries this index, so it seems nicer to have the logic "if the entity store is set up, the risk score index will be there".
If all maintainers run straight away, then maybe there isn't much difference between the approaches?
I think that on install might be better for us
@hop-dev @romulets This is effectively what will happen: once the installation is complete, all maintainers are scheduled, and their first execution will be triggered immediately (just not within the endpoint request thread).
I also see this as beneficial from a performance perspective: we don't want to block the installation process while waiting for the initial execution of all maintainers to finish.
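The lifecycle discussed in this thread (setup only on the first task run, then `run`, with status metadata updated each time) can be sketched roughly as follows. The metadata field names (`runs`, `lastSuccessTimestamp`, `lastErrorTimestamp`) come from the PR description; `runMaintainerOnce`, the hook signatures, and the choice to count failed runs are assumptions for illustration, not the actual runner.

```typescript
interface MaintainerStatus {
  state: Record<string, unknown>;
  metadata: {
    runs: number;
    lastSuccessTimestamp: string | null;
    lastErrorTimestamp: string | null;
  };
}

interface MaintainerHooks {
  setup?: (ctx: { status: MaintainerStatus }) => Promise<Record<string, unknown>>;
  run: (ctx: { status: MaintainerStatus }) => Promise<Record<string, unknown>>;
}

// Executes a single scheduled run and returns the status to persist for the next one.
async function runMaintainerOnce(
  hooks: MaintainerHooks,
  status: MaintainerStatus
): Promise<MaintainerStatus> {
  const next: MaintainerStatus = {
    state: { ...status.state },
    metadata: { ...status.metadata },
  };
  try {
    // Setup runs only when no run has happened yet, i.e. on the first task run.
    if (next.metadata.runs === 0 && hooks.setup) {
      next.state = await hooks.setup({ status: next });
    }
    next.state = await hooks.run({ status: next });
    next.metadata.lastSuccessTimestamp = new Date().toISOString();
  } catch {
    next.metadata.lastErrorTimestamp = new Date().toISOString();
  }
  // Assumption: a failed run still counts as a run.
  next.metadata.runs += 1;
  return next;
}
```

Because the first run is triggered right after scheduling, setup in this model happens shortly after install but outside the install request itself.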
## Summary

Adds a **maintainer task framework** to the entity_store plugin so other plugins can register custom recurring tasks that run in the context of the entity store (e.g. per-space maintenance). Registration is persisted in a new saved object; scheduling uses Task Manager and is triggered when an entity store is started (e.g. on install).

## What's in scope

- **Public API:** Plugin setup exposes `registerEntityMaintainer(config)` so consumers can register a maintainer with `id`, `interval`, `run`, optional `setup`, `initialState`, and `description`.
- **Persistence:** A new hidden saved object type `entity-maintainers-tasks` stores the list of registered tasks (id + interval). A dedicated **EntityMaintainersTasksClient** owns all reads/writes so registration and scheduling stay free of direct SO usage.
- **Scheduling:** When the entity store is started (e.g. via install API), all registered maintainer tasks are loaded from the client and scheduled with Task Manager (`ensureScheduled`) for the current space. Task runner supports optional first-run `setup`, status metadata (runs, lastSuccessTimestamp, lastErrorTimestamp), and debug logging.
- **Install flow:** The install route now calls `assetManager.init(req, entityTypes, logExtraction)` so init and maintainer scheduling live in one place (with try/catch and error handling).

## How to try it

1. A sample maintainer is registered in plugin setup (id: `entity-maintainer-task-test`, interval: `20s`) for POC.
2. Call the install API with one or more entity types; after init, maintainer tasks for that space are scheduled.
3. Check Task Manager and logs; task runner logs include task id and run number at debug level.

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
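The scheduling step described in the summary (load the registered id/interval pairs, then schedule each one for the current space) could look roughly like the sketch below. The `TaskScheduler` interface is a reduced, hypothetical subset modeled on Task Manager's `ensureScheduled`; the task type string, the `id:namespace` task id format, and the function name `scheduleMaintainersForSpace` are assumptions made for this example.

```typescript
interface MaintainerEntry {
  id: string;
  interval: string;
}

interface ScheduledTask {
  id: string;
  taskType: string;
  schedule: { interval: string };
  params: Record<string, unknown>;
  state: Record<string, unknown>;
}

// Reduced, hypothetical subset of a Task Manager-like start contract.
interface TaskScheduler {
  ensureScheduled(task: ScheduledTask): Promise<ScheduledTask>;
}

// Hypothetical task type name, used only in this sketch.
const MAINTAINER_TASK_TYPE = 'entity_store:entity_maintainer';

async function scheduleMaintainersForSpace(
  entries: MaintainerEntry[],
  scheduler: TaskScheduler,
  namespace: string
): Promise<string[]> {
  const scheduledIds: string[] = [];
  for (const { id, interval } of entries) {
    const task: ScheduledTask = {
      // Assumption: one task per maintainer per space, keyed by both.
      id: `${id}:${namespace}`,
      taskType: MAINTAINER_TASK_TYPE,
      schedule: { interval },
      params: { maintainerId: id, namespace },
      state: {},
    };
    // Scheduling with a stable id is what makes repeated installs safe in this sketch.
    await scheduler.ensureScheduled(task);
    scheduledIds.push(task.id);
  }
  return scheduledIds;
}
```

Keying the task id by maintainer and space means a second install call for the same space targets the same task ids instead of creating new ones.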