Improve role fetching and unmarshaling#42007

Merged
espadolini merged 2 commits into master from espadolini/improve-role-data-handling on May 24, 2024
Conversation

@espadolini
Contributor

@espadolini espadolini commented May 24, 2024

This PR changes the default page size for ListRoles from 1000 to 100 (the value used by the API client by default, and by extension by the cache initialization process). Roles can require a lot of effort during unmarshaling, since we validate expressions, which requires parsing them. After a cluster event such as an auth server restart, a lot of nodes will all connect at the same time and fetch roles 1000 at a time, which can spike the memory used by the auth server. The default iteration page size of 1000 was picked somewhat arbitrarily, and it works well for resources that are just data with very little effort required during unmarshaling; roles, however, can be pretty big and hang around for a while, so tuning the page size down helps mitigate these memory bursts.

In addition, this PR changes the version check for roles to use jsoniter.Get to read the one field we need, rather than running encoding/json's unmarshaler over a whole resource header. A simple benchmark that unmarshals a reasonably "average" role containing one where expression shows a 28% improvement in time spent in services.UnmarshalRoleV6, with a small reduction in allocations as well.
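The PR itself calls jsoniter.Get(data, "version") to pull out the single field lazily. As a standard-library illustration of the same idea (a sketch, not the PR's code), one can walk the JSON token stream and stop as soon as the top-level key is found, never materializing the rest of the header:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// topLevelString returns the string value of a top-level key,
// stopping as soon as it is found. This mimics what
// jsoniter.Get(data, "version") does for the version check,
// using only the standard library.
func topLevelString(data []byte, key string) (string, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	// Expect the opening '{' of the top-level object.
	if tok, err := dec.Token(); err != nil || tok != json.Delim('{') {
		return "", fmt.Errorf("not a JSON object")
	}
	for dec.More() {
		keyTok, err := dec.Token()
		if err != nil {
			return "", err
		}
		k, _ := keyTok.(string)
		if k == key {
			var v string
			if err := dec.Decode(&v); err != nil {
				return "", err
			}
			return v, nil
		}
		// Skip this key's value without building a Go value for it.
		var skip json.RawMessage
		if err := dec.Decode(&skip); err != nil {
			return "", err
		}
	}
	return "", fmt.Errorf("key %q not found", key)
}

func main() {
	raw := []byte(`{"kind":"role","version":"v6","metadata":{"name":"access"}}`)
	v, err := topLevelString(raw, "version")
	if err != nil {
		panic(err)
	}
	fmt.Println(v) // v6
}
```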

Benchmark result:
current
BenchmarkUnmarshalRole    210998             27651 ns/op           17377 B/op        331 allocs/op

new
BenchmarkUnmarshalRole    293476             19791 ns/op           16521 B/op        320 allocs/op
CPU profile flame graphs for a relative comparison (with FastUnmarshal and ValidateRole being the same in both): [two flame graph screenshots omitted]
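For reference, numbers of this shape come from Go's benchmark tooling. The sketch below uses testing.Benchmark so it runs as a plain program; the payload and unmarshalRole are made-up stand-ins for the real role data and services.UnmarshalRoleV6, just to show how ns/op, B/op, and allocs/op are produced.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// Hypothetical stand-in for a role payload; the PR benchmarks an
// "average" role with a where expression.
var raw = []byte(`{"kind":"role","version":"v6","metadata":{"name":"access"},"spec":{"allow":{"logins":["root"]}}}`)

// unmarshalRole is an illustrative stand-in for services.UnmarshalRoleV6.
func unmarshalRole(data []byte) (map[string]any, error) {
	var m map[string]any
	err := json.Unmarshal(data, &m)
	return m, err
}

func main() {
	// testing.Benchmark runs a benchmark function outside `go test`;
	// the PR's numbers come from `go test -bench` on the real code.
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			if _, err := unmarshalRole(raw); err != nil {
				b.Fatal(err)
			}
		}
	})
	// Prints iterations and ns/op, e.g. "  500000	      2400 ns/op".
	fmt.Println(res.String(), res.MemString())
}
```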

changelog: reduced memory and cpu usage after control plane restarts in clusters with a high number of roles

@public-teleport-github-review-bot public-teleport-github-review-bot Bot removed the request for review from ibeckermayer May 24, 2024 10:57
@espadolini espadolini requested a review from rosstimothy May 24, 2024 11:35
@espadolini
Contributor Author

espadolini commented May 24, 2024

For the record: the existing implementation sits at 27651 ns/op, and this one with jsoniter.Get at 19791 ns/op; simply replacing json.Unmarshal with utils.FastUnmarshal (i.e. using jsoniter for a full ResourceHeader) lands at 22823 ns/op. It's well worth applying the same approach to resources that are still using json.Unmarshal, but maybe not so much if they're already using utils.FastUnmarshal.

@public-teleport-github-review-bot public-teleport-github-review-bot Bot removed the request for review from rosstimothy May 24, 2024 13:36
@espadolini espadolini added this pull request to the merge queue May 24, 2024
Merged via the queue into master with commit bfa3a0e May 24, 2024
@espadolini espadolini deleted the espadolini/improve-role-data-handling branch May 24, 2024 17:35
@public-teleport-github-review-bot

@espadolini See the table below for backport results.

| Branch | Result |
| --- | --- |
| branch/v14 | Failed |
| branch/v15 | Failed |

espadolini added a commit that referenced this pull request May 27, 2024
* Reduce default ListRoles page size

* Avoid stdjson in UnmarshalRoleV6
espadolini added a commit that referenced this pull request May 27, 2024
* Reduce default ListRoles page size

* Avoid stdjson in UnmarshalRoleV6
github-merge-queue Bot pushed a commit that referenced this pull request May 28, 2024
* Reduce default ListRoles page size

* Avoid stdjson in UnmarshalRoleV6
github-merge-queue Bot pushed a commit that referenced this pull request May 28, 2024
* Reduce default ListRoles page size

* Avoid stdjson in UnmarshalRoleV6


4 participants