[workflows_management] Lazy-load Zod connector schemas to cut idle memory #264283
semd
left a comment
Nice work 💯
The heap analysis and the targeted fix look correct to me.
Before merging, I'd like to suggest an alternative shape that I think gets the same memory win with substantially less surface area. Curious what you think.
The observation
The six lazy getters in connector_action_schema.ts always end up firing together: opening the YAML editor calls getWorkflowZodSchema() → getAllConnectorsInternal() → all six getters on the same call stack. Monaco needs the full union schema for autocomplete, so per-Map (and per-connector) deferral collapses to a single effective deferral in practice. That means we're paying a fair amount of refactoring cost (six getters, six caches, a test-only reset, three require()-in-arrow workarounds, ~500 lines of churn in connector_action_schema.ts) to defer six things that always defer together.
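To make the observation concrete, here is a minimal sketch of the per-Map cached-getter pattern. All names (`cachedGetter`, `inputSchemas`, `outputSchemas`, `getAllConnectors`) are hypothetical stand-ins, and plain strings replace the real Zod schemas. It only illustrates that when one caller needs every Map, the separate caches all fill on the same call.

```typescript
// Hypothetical stand-ins: the real Maps hold Zod schemas, here plain strings.
type SchemaMap = Map<string, string>;

// Records which caches were built, in order.
const buildLog: string[] = [];

// Generic cached getter: builds on first call, then returns the cached Map.
function cachedGetter(name: string, build: () => SchemaMap) {
  let cache: SchemaMap | undefined;
  const get = (): SchemaMap => {
    if (cache === undefined) {
      buildLog.push(name);
      cache = build();
    }
    return cache;
  };
  // Test-only reset, the kind of extra API surface the per-Map approach needs.
  const reset = (): void => {
    cache = undefined;
  };
  return { get, reset };
}

const inputSchemas = cachedGetter('input', () => new Map([['.slack', 'input schema']]));
const outputSchemas = cachedGetter('output', () => new Map([['.slack', 'output schema']]));

// Stand-in for a caller that needs the full union: it touches every getter.
function getAllConnectors(): string[] {
  const inputKeys = Array.from(inputSchemas.get().keys());
  const outputKeys = Array.from(outputSchemas.get().keys());
  return inputKeys.concat(outputKeys);
}

getAllConnectors();
getAllConnectors();
console.log(buildLog); // ['input', 'output']: both caches fill on the first call
```

Each getter is lazy on its own, but the first real caller forces all of them, so the effective granularity is one deferral.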
Proposed alternative: single boundary in schema.ts
schema.ts is the only consumer of connector_action_schema.ts, so it's the natural single point of control. We already have memoize-one available, which is a clean fit here:
// connector_action_schema.ts: REVERT to the original eager Map exports.
// No getters, no cached vars, no __resetForTesting.
// Add a warning comment not to import this file statically, just in case.
export const ConnectorInputSchemas = new Map([...]);
export const ConnectorActionInputSchemas = new Map([...]);
// ...etc

// schema.ts: the only consumer becomes the single lazy boundary.
import memoizeOne from 'memoize-one';

const getConnectorSchemas = memoizeOne(
  // eslint-disable-next-line @typescript-eslint/no-var-requires -- defers ~16MB zod heap, see #264175
  (): typeof import('./connector_action_schema') => require('./connector_action_schema')
);

function getSubActionParamsSchema(actionTypeId: string, subActionName: string) {
  const { ConnectorInputSchemas, ConnectorActionInputSchemas, ConnectorSpecsInputSchemas } =
    getConnectorSchemas();
  // ...original lookup logic, unchanged
}

Comparison
| | Current PR | Single-boundary |
|---|---|---|
| connector_action_schema.ts diff | +500 / −500 | 0 |
| Lazy boundaries | 6 | 1 |
| require() workarounds | 3 (arrow-fn trick) | 1 (explicit, justified disable) |
| Test-only API surface | __resetConnectorSchemaCachesForTesting | none (jest.resetModules()) |
| Idle-startup memory savings | ~16 MB | ~16 MB (same) |
Same memory outcome, much less code, no API change to connector_action_schema.ts, and the lazy boundary lives in the file that actually consumes it.
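As a self-contained illustration of the single-boundary idea, the sketch below swaps `require('./connector_action_schema')` for a hypothetical counting loader (`loadConnectorSchemasModule`) and uses a hand-rolled zero-argument `once` helper in place of memoize-one, which should behave equivalently as long as the memoized function is always called with no arguments. The lookup logic and the sample keys are assumptions made for the example, not the plugin's real lookup.

```typescript
// Hypothetical shape of the expensive module; strings stand in for Zod schemas.
interface ConnectorSchemasModule {
  ConnectorInputSchemas: Map<string, string>;
  ConnectorActionInputSchemas: Map<string, string>;
}

let loadCount = 0;

// Stand-in for require('./connector_action_schema'): counts how often it runs.
function loadConnectorSchemasModule(): ConnectorSchemasModule {
  loadCount += 1;
  return {
    ConnectorInputSchemas: new Map([['.slack', 'slack params']]),
    ConnectorActionInputSchemas: new Map([['.jira:pushToService', 'jira params']]),
  };
}

// Minimal zero-argument memoizer: runs the loader on first call only.
function once<T>(load: () => T): () => T {
  let loaded = false;
  let value: T | undefined;
  return () => {
    if (!loaded) {
      value = load();
      loaded = true;
    }
    return value as T;
  };
}

// The single lazy boundary: nothing is loaded until this is first called.
const getConnectorSchemas = once(loadConnectorSchemasModule);

// Assumed lookup order for the sketch: sub-action schema first, then connector schema.
function getSubActionParamsSchema(actionTypeId: string, subActionName: string): string | undefined {
  const { ConnectorInputSchemas, ConnectorActionInputSchemas } = getConnectorSchemas();
  return (
    ConnectorActionInputSchemas.get(`${actionTypeId}:${subActionName}`) ??
    ConnectorInputSchemas.get(actionTypeId)
  );
}

const loadsBeforeFirstLookup = loadCount; // 0: idle startup pays nothing
const first = getSubActionParamsSchema('.jira', 'pushToService'); // 'jira params'
const second = getSubActionParamsSchema('.slack', 'postMessage'); // 'slack params'
console.log(loadsBeforeFirstLookup, loadCount, first, second);
```

The same counter-style check is one way a regression test could assert laziness at the boundary: zero loads after import, exactly one after the first lookup.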
Caveat to verify
The PR description says common/index.ts doesn't re-export the Map constants and no consumer outside schema.ts references them. If that still holds on the latest base, the single-boundary version is strictly better. The lazy regression test added is still valuable; just point it at the schema.ts boundary instead of the six getters.
Happy to be wrong about any of this, wanted to suggest it before the structural choice is made.
In general, I agree with @semd's suggestion. A couple of other observations/questions:
Heap snapshot analysis (built Kibana, allocation tracking)
Compared against Rudolf's baseline (
Total idle heap: 822.8 → 811.7 MB (-11.1 MB)
Allocated by Plugin (allocation site), before vs after
Remaining 4.5 MB breakdown
Dug into what's still allocated eagerly on the server path:
Lazy-loading
…mory

Single lazy boundary in schema.ts (the sole consumer of connector_action_schema.ts) defers ~16 MB of zod-schema heap until the first workflow edit/execute call. connector_action_schema.ts is left untouched: no getter wrappers, no test-only reset API.

Removes the unused WORKFLOW_ZOD_SCHEMA / WORKFLOW_ZOD_SCHEMA_LOOSE module-level constants whose eager generateYamlSchemaFromConnectors() calls contributed to the startup heap.

Heap snapshot (built Kibana, allocation tracking, idle): @kbn/workflows-management-plugin alloc site: 17.2 → 4.2 MB (-13 MB)

Closes elastic#264175

Made-with: Cursor
0e55f4b to 829f896
💛 Build succeeded, but was flaky
cc @Kiryous
Starting backport for target branches: 9.4 https://github.com/elastic/kibana/actions/runs/24761518230
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI.
Questions? Please refer to the Backport tool documentation
…dle memory (#264283) (#264885) # Backport This will backport the following commits from `main` to `9.4`: - [[workflows_management] Lazy-load Zod connector schemas to cut idle memory (#264283)](#264283) <!--- Backport version: 9.6.6 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sorenlouv/backport) <!--BACKPORT [{"author":{"name":"Tal","email":"tal.borenstein@elastic.co"},"sourceCommit":{"committedDate":"2026-04-22T05:15:56Z","message":"[workflows_management] Lazy-load Zod connector schemas to cut idle memory (#264283)","sha":"bf4e1b09650ac7d9b0f90ee073e84b17d0eb58df","branchLabelMapping":{"^v9.5.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","backport:version","Team:One Workflow","v9.4.0","v9.5.0"],"title":"[workflows_management] Lazy-load Zod connector schemas to cut idle memory","number":264283,"url":"https://github.com/elastic/kibana/pull/264283","mergeCommit":{"message":"[workflows_management] Lazy-load Zod connector schemas to cut idle memory (#264283)","sha":"bf4e1b09650ac7d9b0f90ee073e84b17d0eb58df"}},"sourceBranch":"main","suggestedTargetBranches":["9.4"],"targetPullRequestStates":[{"branch":"9.4","label":"v9.4.0","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v9.5.0","branchLabelMappingKey":"^v9.5.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/264283","number":264283,"mergeCommit":{"message":"[workflows_management] Lazy-load Zod connector schemas to cut idle memory (#264283)","sha":"bf4e1b09650ac7d9b0f90ee073e84b17d0eb58df"}}]}] BACKPORT--> Co-authored-by: Tal <tal.borenstein@elastic.co>
Summary
Contributes to the 9.4.0 OOM effort for 1GB ECH/ECK deployments (parent epic: #264170, this sub-issue: #264175).
Heap snapshots traced ~16MB of retained Zod-schema heap to @kbn/workflows-management-plugin, originating from common/connector_action_schema.ts. At module-load time the plugin:
- ./stack_connectors_schema/* submodule.
- @kbn/connector-specs package.
- Maps and a staticConnectors array, each populated with fully-instantiated Zod schemas, whether or not any workflow code ever runs.

On a 1GB Kibana pod this contributed directly to idle-memory OOM kills after the Zod v4 upgrade.
What this PR does
staticConnectors into cached getter functions:
- getConnectorSpecsInputSchemas()
- getConnectorInputSchemas()
- getConnectorActionInputSchemas()
- getConnectorOutputSchemas()
- getConnectorActionOutputSchemas()
- getStaticConnectors()

Tests
Memory impact
Expected reduction on an idle Kibana where no workflow code path is exercised:
Kibana instances that actively use workflows pay the same ~16MB on first use — trade-off is acceptable and only hits once per process.
Risk / scope
Not in scope (follow-ups)
Validation
Closes #264175
For reviewers
Checklist
Made with Cursor