[performance] process-wide cache for advanced settings lookup #262618

Merged
stratoula merged 43 commits into elastic:main from drewdaemon:advanced-settings-per-setting-cache
Apr 22, 2026

Conversation

@drewdaemon (Contributor) commented Apr 10, 2026

Summary

This is a performance enhancement to the UI Settings service. It introduces a space-aware, process-wide cache layer for the settings saved object. It applies to `.get('setting-name')` lookups on all UI settings clients.

Justification

When a dashboard loads, search-related settings need to be consulted before each search request. That can mean 20+ saved object lookups (see #84923), each happening in a different request, so the [per-request cache](#84513) doesn't cover it.

Caching approach

`NamespacedCache` is the class that handles caching. It stores both resolved settings objects and in-flight requests to deduplicate read operations (`getUserDefined`).

The server-side cache is freshened in three scenarios:

  • A setting is changed (`setMany`)
  • A browser requests `index.html`
  • A settings lookup occurs after the TTL has expired

The browser UI settings client is guaranteed to get fresh settings on page loads (this preserves the existing UX for changing settings).

In a multi-node scenario, the node performing a setting change gets fast consistency. Other nodes get eventual consistency.
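
A minimal sketch of the mechanism, for illustration only: the names `NamespacedCache`, `entries`, `inflightReads`, `NamespacedCacheEntry`, and the TTL constant appear in the diff excerpts further down, but the `Fetcher` type, constructor shape, and method bodies here are assumptions, not the actual Kibana implementation.

```ts
type Fetcher<T> = (namespace: string) => Promise<T>;

interface NamespacedCacheEntry<T> {
  value: T;
  timer: NodeJS.Timeout;
}

export class NamespacedCache<T = unknown> {
  private readonly entries = new Map<string, NamespacedCacheEntry<T>>();
  private readonly inflightReads = new Map<string, Promise<T>>();

  constructor(
    private readonly fetch: Fetcher<T>, // e.g. a saved-object read per space
    private readonly ttlMs: number = 10_000
  ) {}

  async get(namespace: string): Promise<T> {
    // Fresh enough: TTL eviction hasn't fired yet.
    const cached = this.entries.get(namespace);
    if (cached) return cached.value;

    // Deduplicate concurrent reads: later callers await the in-flight fetch.
    const pending = this.inflightReads.get(namespace);
    if (pending) return pending;

    const read = this.fetch(namespace)
      .then((value) => {
        this.store(namespace, value);
        return value;
      })
      .finally(() => this.inflightReads.delete(namespace));
    this.inflightReads.set(namespace, read);
    return read;
  }

  // Called when a setting changes (setMany) or a browser requests index.html,
  // so the next read hits the saved object instead of the cache.
  invalidate(namespace: string): void {
    const entry = this.entries.get(namespace);
    if (entry) {
      clearTimeout(entry.timer);
      this.entries.delete(namespace);
    }
  }

  private store(namespace: string, value: T): void {
    this.invalidate(namespace);
    const timer = setTimeout(() => this.entries.delete(namespace), this.ttlMs);
    timer.unref(); // don't keep the Node.js process alive just for eviction
    this.entries.set(namespace, { value, timer });
  }
}
```

In this model, the node that performs `setMany` invalidates immediately (the fast consistency mentioned above); other nodes converge via the TTL or the next page load.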

Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

(Outdated comment thread on src/core/packages/ui-settings/server-internal/src/per_setting_cache.ts)
@drewdaemon drewdaemon changed the title from "[PoC] Advanced settings per setting cache" to "[PoC] per setting cache for advanced settings" Apr 10, 2026
@drewdaemon drewdaemon changed the title from "[PoC] per setting cache for advanced settings" to "[PoC] per-setting cache for advanced settings lookup" Apr 10, 2026
@Dosant Dosant self-requested a review April 12, 2026 08:19
(Outdated comment thread on src/core/packages/ui-settings/server-internal/src/namespaced_cache.ts)
@kibanamachine (Contributor)

Flaky Test Runner Stats

🟠 Some tests failed. - kibana-flaky-test-suite-runner#11609

[❌] src/platform/test/functional/apps/context/config.ts: 2/5 tests passed.

see run history

@drewdaemon drewdaemon force-pushed the advanced-settings-per-setting-cache branch from e2c4760 to 48b9319 on April 14, 2026 at 21:46
@kibanamachine (Contributor)

Flaky Test Runner Stats

🎉 All tests passed! - kibana-flaky-test-suite-runner#11610

[✅] src/platform/test/functional/apps/context/config.ts: 10/10 tests passed.

see run history

@elastic elastic deleted a comment from elasticmachine Apr 14, 2026
@drewdaemon drewdaemon requested a review from a team as a code owner April 16, 2026 19:52
@drewdaemon drewdaemon requested a review from Dosant April 17, 2026 02:16
@kertal (Member) left a comment

🏆 For this PR

Doing visual performance testing locally with default settings, there's not much difference between main and this PR; but when simulating a slower network connection between ES and Kibana, the win is clearly visible. To slow down the connection, I applied the following on my local system:

# Enable pf
sudo pfctl -E

# Configure the dummynet pipe with 100ms delay
sudo dnctl pipe 1 config delay 100ms

# Apply both inbound and outbound rules in one shot
sudo pfctl -f - <<'EOF'
dummynet out proto tcp from any to 127.0.0.1 port 9200 pipe 1
dummynet in proto tcp from any port 9200 to any pipe 1
EOF
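
(To undo the throttling afterward, something like `sudo dnctl -q flush` to clear the dummynet pipe and `sudo pfctl -f /etc/pf.conf` to restore the default ruleset should work on a stock macOS setup; the original comment didn't include cleanup steps.)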

Loading our e-commerce dashboard in this case looked like this:

[image: side-by-side dashboard load race, with cache (left) vs. without (right)]

Median improvement: 0.706s; average: 1.004s over 10 runs. Left side with cache, right side without.

More details

race_with-cache_vs_without-cache.html

return result;
}

private async computeUserProvided<T = unknown>(bypassCache = false): Promise<UserProvided<T>> {
Contributor

looks like `bypassCache` is not used?

{ handleWriteErrors }: { validateKeys?: boolean; handleWriteErrors?: boolean } = {}
) {
this.cache.del();
// check if a read is currently in progress, wait for it to complete before proceeding
Contributor

Seems like an outdated comment.


this.onWriteHook(changes);

// Register write operation as in-flight so concurrent reads will wait
Contributor

Seems like an outdated comment


timer: NodeJS.Timeout;
}

export const NAMESPACED_CACHE_TTL = 60_000;
Contributor

I think we discussed lowering this to reduce the eventual consistency time in a multi-node setup?

Contributor Author

Yes, totally spaced 👍

Contributor Author

What value do you think is reasonable given the cache-bust-on-page-load addition?

Contributor

5 seconds to be on the safe side? I think this is what we had in the per-request cache?

Contributor Author

I think 5 seconds made sense for a request cache because few requests take longer than 5 seconds.

On a busy cluster the cache will often be refreshed via page loads, so we're really talking about the absolute outer bound on eventual consistency. And from a dashboard perspective, users often change a filter, etc. at greater than 5-second intervals, so they'd frequently be waiting on a refresh if the TTL were that low.

That, together with the fact that changing these settings is an edge case, makes me lean more aggressive, e.g. 10-15 seconds.

But even with 5 seconds this is a significant improvement over the current situation. I defer to you, of course!

Contributor

60 is definitely very aggressive. I would be open to 10s though; I think it will work nicely. But @Dosant, if you prefer 5s that is OK by me. Let me know and I will make the change (Drew is out and he told me to take care of this PR).

Contributor

thank you both! let's do 10! 👍

Contributor

Done here 8697441

Contributor Author

(Agreed that 60 is too aggressive... I put it there before I considered multi-node scenarios)
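
For reference, the change in 8697441 is presumably just the constant (inferred from this thread, not checked against the actual diff):

```ts
export const NAMESPACED_CACHE_TTL = 10_000;
```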

@Dosant (Contributor) left a comment

LGTM! thanks for getting this over the line

export class NamespacedCache<T = unknown> {
private readonly entries = new Map<string, NamespacedCacheEntry<T>>();
private readonly inflightReads = new Map<string, Promise<T>>();
private readonly inflightWrites = new Map<string, Promise<void>>();
Contributor

I think `inflightWrites` is not used and can be cleaned up.

Contributor

You are absolutely right 3d7c96e
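
Presumably the cleanup leaves only the two maps that are actually used (a guess from this thread, not the real diff):

```ts
interface NamespacedCacheEntry<T> {
  value: T;
  timer: NodeJS.Timeout;
}

export class NamespacedCache<T = unknown> {
  private readonly entries = new Map<string, NamespacedCacheEntry<T>>();
  private readonly inflightReads = new Map<string, Promise<T>>();
  // inflightWrites (a Map<string, Promise<void>>) was unused and is dropped
}
```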


@elasticmachine (Contributor)

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #66 / EPM Endpoints installs and uninstalls all assets reinstalls all assets 0.2.0 should have created the correct saved object

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run `node scripts/build_api_docs --plugin [yourplugin] --stats comments` for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/core-ui-settings-server | 24 | 25 | +1 |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/core-ui-settings-server | 42 | 43 | +1 |


@walterra (Contributor) left a comment

Code review only for datavis codeownership, LGTM.

@stratoula stratoula merged commit 9a59de9 into elastic:main Apr 22, 2026
18 checks passed
mbondyra added a commit to mbondyra/kibana that referenced this pull request Apr 22, 2026
…sationChanges23

* commit '9a7b717c662d1c904052bc59f0e5a81daab87c7f': (145 commits)
  Upgrade EUI to v114.2.0 (elastic#264550)
  [Entity Analytics] Add missing OpenAPI descriptions and examples to p… (elastic#264778)
  [Entity Resolution] Clarify CSV upload result for already-linked entities (elastic#264689)
  [AI Infra] Fix failing GenAI Settings Scout tests (elastic#260496)
  [Agent Builder] [Bug Bash] OAuth connector settings mention fields that are not there (elastic#264756)
  [performance] process-wide cache for advanced settings lookup (elastic#262618)
  [CI] Update limits.yml for securitySolution (elastic#264946)
  [SLO] Fix APM embeddable ids (elastic#264750)
  [EDR Workflows] Unify artifacts empty state buttons (elastic#264389)
  [Alert Triage workflow] Adds security.buildAlertEntityGraph and security.renderAlertNarrative… (elastic#259159)
  [SigEvents] Add KI feature identification endpoints and refactor task to use shared service (elastic#263528)
  [Scout] Migrate Data Views API tests from FTR - Part5 (elastic#264088)
  [Cases] Apply shared extended_fields path util server side (elastic#264706)
  [Lens as code] Fix metric trendline (elastic#264777)
  [api-docs] 2026-04-22 Daily api_docs build (elastic#264882)
  [Scout] Update test config manifests (elastic#264575)
  [workflows_management] Lazy-load Zod connector schemas to cut idle memory (elastic#264283)
  [ES|QL] Fix ES|QL columns reset race during active fetch (elastic#263947)
  [Content List] Column layout props, sticky actions, and title click handlers (elastic#264203)
  [Lens as code] Validate `id` in route for new vis types (elastic#264480)
  ...
SoniaSanzV pushed a commit to SoniaSanzV/kibana that referenced this pull request Apr 27, 2026
awahab07 added a commit that referenced this pull request Apr 27, 2026

This PR:
- stabilizes the suite by reducing advanced-setting flips.
- groups disabled and enabled scenarios so the suite only transitions
into the enabled state once.
- waits for the enabled setting to propagate before running the
enabled-path assertions (following
[ref](#262618 (comment))).

Labels

• backport:skip (This PR does not require backporting)
• release_note:skip (Skip the PR/issue when compiling release notes)
• v9.5.0
