Refactor cache to support storing resources in memory#52210

Merged
rosstimothy merged 1 commit into master from tross/refactor_cache
Apr 10, 2025

Conversation

Contributor

@rosstimothy rosstimothy commented Feb 16, 2025

While the current cache storage is also in memory, it leverages the memory backend, which requires converting resources to and from JSON. Marshaling JSON is suboptimal and is often a source of CPU and memory consumption. The biggest problem with JSON marshaling, though, is that it requires calling CheckAndSetDefaults (CASD). When the validations enforced in CASD become stricter, caches can be left unable to become healthy when there are mixed versions within a cluster, since each version might have a slightly different view of what should be allowed.

To solve both problems, this changes the cache to store resources received from the upstream directly in memory without any conversion to JSON. There are two gotchas with this approach: caching is no longer "free", and cloning of resources must be done diligently to avoid races. Historically, the cache relied on the storage implementation in services/local to manage persisting resources in the cache's backend.Memory. That will no longer be the case, as the cache needs to support storage directly in memory. To reduce the burden this may pose on developers, the new collection machinery is simpler, and helpers will be added on top of sortcache.SortCache to make a second storage implementation trivial.
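
To illustrate the cloning gotcha, here is a minimal sketch of an in-memory store that copies resources on the way in and on the way out. The `user`, `memStore`, `put`, and `get` names are hypothetical stand-ins made up for this example; they are not the types or helpers introduced by this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// user is a hypothetical stand-in for a cached resource.
type user struct {
	Name  string
	Roles []string
}

// clone returns a deep copy so callers never share the Roles slice
// with the copy held by the store.
func (u *user) clone() *user {
	return &user{Name: u.Name, Roles: append([]string(nil), u.Roles...)}
}

// memStore is a hypothetical in-memory store keyed by resource name.
type memStore struct {
	mu    sync.RWMutex
	items map[string]*user
}

func newMemStore() *memStore {
	return &memStore{items: make(map[string]*user)}
}

// put stores a private copy, so later mutations by the caller
// cannot race with readers of the store.
func (s *memStore) put(u *user) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items[u.Name] = u.clone()
}

// get hands out a copy, so the caller may mutate the result freely.
func (s *memStore) get(name string) (*user, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	u, ok := s.items[name]
	if !ok {
		return nil, false
	}
	return u.clone(), true
}

func main() {
	s := newMemStore()
	s.put(&user{Name: "alice", Roles: []string{"access"}})

	got, _ := s.get("alice")
	got.Roles[0] = "mutated" // only the caller's copy changes

	again, _ := s.get("alice")
	fmt.Println(again.Roles[0]) // prints "access"
}
```

This is also where the "no longer free" cost shows up: every read and write pays for a copy, which is the price of not sharing mutable state between the cache and its callers.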

The changes here are twofold: mark the existing collections machinery as legacy, and introduce new machinery that operates entirely in memory. This includes an initial implementation of caching for static tokens, users, and cert authorities that leverages the new collection machinery. Additionally, all of the new resource-specific code was moved to lib/cache/static_tokens.go, lib/cache/users.go, and lib/cache/cert_authority.go. This follows the blueprint some of the newer cached resources use, making it easier to identify where code for a specific resource lives and reducing the size of the cache.go and collections.go files.

Once this lands, I plan to apply the same pattern to the other cached resources until all of the legacy collections are gone.

@rosstimothy rosstimothy added the no-changelog Indicates that a PR does not require a changelog entry label Feb 16, 2025
@rosstimothy rosstimothy force-pushed the tross/refactor_cache branch 4 times, most recently from 3be09f7 to 9baa177 Compare February 18, 2025 18:04
Comment thread lib/cache/store.go Outdated
Comment thread lib/cache/static_tokens.go Outdated
@rosstimothy rosstimothy force-pushed the tross/refactor_cache branch 2 times, most recently from ab060e3 to 63618f5 Compare February 20, 2025 22:00
Comment thread lib/cache/cert_authority.go Outdated
Comment thread lib/cache/cert_authority.go Outdated
Comment thread lib/cache/collection.go
Comment thread lib/cache/store.go
Comment on lines +96 to +60
    for idx, transform := range s.indexes {
        s.cache.Delete(idx, transform(t))
    }
Contributor

nit: SortCache deletes the item across all indexes if it is deleted on any index, so only one call to Delete is required (though maybe just leaving as-is is easier than deciding which one to use lol).
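
A toy sketch of the behavior described in this nit: a multi-index container in which deleting through any one index removes the item from every index. This is not the real SortCache implementation or API; `multiIndex`, `entry`, `token`, and the index names are assumptions made up for the illustration.

```go
package main

import "fmt"

// token is a hypothetical resource with two indexable attributes.
type token struct{ Name, Role string }

// entry remembers the key under which the value sits in every index.
type entry struct {
	value token
	keys  map[string]string // index name -> key
}

// multiIndex is a toy multi-index container, not the real SortCache.
type multiIndex struct {
	transforms map[string]func(token) string
	indexes    map[string]map[string]*entry
}

func newMultiIndex(transforms map[string]func(token) string) *multiIndex {
	m := &multiIndex{
		transforms: transforms,
		indexes:    make(map[string]map[string]*entry),
	}
	for name := range transforms {
		m.indexes[name] = make(map[string]*entry)
	}
	return m
}

// put inserts the value under every index, evicting anything it collides with.
func (m *multiIndex) put(t token) {
	e := &entry{value: t, keys: make(map[string]string)}
	for name, transform := range m.transforms {
		e.keys[name] = transform(t)
	}
	for name, key := range e.keys {
		m.delete(name, key)
	}
	for name, key := range e.keys {
		m.indexes[name][key] = e
	}
}

// delete looks the entry up through one index and removes it from all of them.
func (m *multiIndex) delete(index, key string) {
	e, ok := m.indexes[index][key]
	if !ok {
		return
	}
	for name, k := range e.keys {
		delete(m.indexes[name], k)
	}
}

// size reports how many entries a given index holds.
func (m *multiIndex) size(index string) int { return len(m.indexes[index]) }

func main() {
	m := newMultiIndex(map[string]func(token) string{
		"name": func(t token) string { return t.Name },
		"role": func(t token) string { return t.Role + "/" + t.Name },
	})
	m.put(token{Name: "t1", Role: "node"})

	m.delete("name", "t1") // a single call through one index
	fmt.Println(m.size("name"), m.size("role")) // prints "0 0"
}
```

With a container that behaves this way, looping over every index and deleting through each one is redundant but harmless, which matches the "leaving as-is" option above.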

Contributor

Does it make sense to have a primary index and secondary, non-unique indices? As it is we're basically requiring the secondary indices to include the "primary key" as a tail all the time anyway.

Contributor

I considered adding support for non-unique indices when I initially implemented the sortcache. I decided that it was simpler not to bother. It sounds good in theory, but pagination that works across restarts is actually tricky to do with non-unique indices. Appending the primary key makes pagination trivial and preserves the standard sort order within groups of colliding values, which is especially nice when dealing with low cardinality indices such as access request state. Seemed like non-unique indices would be a big headache for little payoff.
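
A rough sketch of why appending the primary key helps: the composite key is unique, sorts by the low-cardinality value first, and doubles as a stable pagination cursor. The `request` type, the `stateKey` and `page` helpers, and the `/` separator are assumptions for this example only (the separator is assumed never to appear in a state value); none of them come from sortcache.

```go
package main

import (
	"fmt"
	"sort"
)

// request is a hypothetical resource with a low-cardinality State field.
type request struct{ ID, State string }

// stateKey makes the non-unique State value unique by appending the primary key.
func stateKey(r request) string { return r.State + "/" + r.ID }

// page returns up to limit keys strictly greater than startAfter, plus the
// cursor to resume from. An empty cursor means there are no further pages.
func page(sortedKeys []string, startAfter string, limit int) (items []string, next string) {
	i := sort.Search(len(sortedKeys), func(i int) bool { return sortedKeys[i] > startAfter })
	end := i + limit
	if end > len(sortedKeys) {
		end = len(sortedKeys)
	}
	items = sortedKeys[i:end]
	if end < len(sortedKeys) && len(items) > 0 {
		next = items[len(items)-1]
	}
	return items, next
}

func main() {
	reqs := []request{
		{ID: "r1", State: "PENDING"},
		{ID: "r2", State: "APPROVED"},
		{ID: "r3", State: "PENDING"},
	}
	keys := make([]string, 0, len(reqs))
	for _, r := range reqs {
		keys = append(keys, stateKey(r))
	}
	sort.Strings(keys)

	first, cursor := page(keys, "", 2)
	fmt.Println(first, cursor) // prints "[APPROVED/r2 PENDING/r1] PENDING/r1"

	second, cursor := page(keys, cursor, 2)
	fmt.Println(second, cursor == "") // prints "[PENDING/r3] true"
}
```

Because the cursor is just the last key returned, it stays valid across restarts: resuming only requires finding the first key greater than the cursor, and items within a group of colliding states stay ordered by their primary key.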

Comment thread lib/cache/collection.go Outdated
Comment thread lib/cache/collection.go Outdated
Comment thread lib/cache/collection.go
@rosstimothy rosstimothy force-pushed the tross/refactor_cache branch 4 times, most recently from a82d3b3 to 8240ca5 Compare February 27, 2025 22:13
Contributor Author

These tests were moved unaltered from cache_test.go.

@rosstimothy rosstimothy marked this pull request as ready for review March 12, 2025 15:21
@github-actions github-actions Bot requested review from fspmarshall and tcsc March 12, 2025 15:21
@rosstimothy rosstimothy force-pushed the tross/refactor_cache branch from 1ba6511 to 927fb94 Compare March 22, 2025 01:00
@rosstimothy rosstimothy removed the request for review from tcsc March 24, 2025 15:49
@rosstimothy rosstimothy force-pushed the tross/refactor_cache branch from 236d981 to 21748c5 Compare March 26, 2025 23:51
Comment thread lib/cache/cache.go
Comment thread lib/cache/collection.go
Comment thread lib/cache/cert_authority.go Outdated
Comment thread lib/cache/store.go
Comment thread lib/cache/store.go Outdated
Comment thread lib/cache/store.go
Comment thread lib/cache/collection.go Outdated
Comment thread lib/cache/cache.go
Comment thread lib/cache/collection.go Outdated
Comment thread lib/services/local/users.go Outdated
Comment thread lib/services/local/users.go Outdated
Comment thread lib/cache/collection.go
@rosstimothy rosstimothy force-pushed the tross/refactor_cache branch from 21748c5 to aad2514 Compare March 28, 2025 20:14
rosstimothy added a commit that referenced this pull request Apr 28, 2025
Moves join tokens to the new cache collection scheme that was
introduced in #52210. No additional functionality changes have been
made here. This should be a purely mechanical translation to the
new internal caching machinery.
rosstimothy added a commit that referenced this pull request Apr 28, 2025
Moves auth preference, cluster name, session recording config,
audit config, and networking config to the new cache collection
scheme that was introduced in #52210. No additional functionality
changes have been made here. This should be a purely mechanical
translation to the new internal caching machinery.
rosstimothy added a commit that referenced this pull request Apr 28, 2025
Moves auto update config, version and agent rollout to the new cache
collection scheme that was introduced in #52210. No additional
functionality changes have been made here. This should be a purely
mechanical translation to the new internal caching machinery.
Labels

no-changelog Indicates that a PR does not require a changelog entry size/lg

3 participants