
[FIXED] Fix request/reply performance when using allow_responses perms #6064

Merged
derekcollison merged 3 commits into nats-io:main from jack7803m:replies-map-perf-fix on Dec 11, 2024

Conversation

@jack7803m
Contributor

Fixes performance issues noted in #6058. Attempts to prune reply map every replyPermLimit messages or if it has been more than replyPruneTime since the last prune.
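The throttling described above can be sketched roughly as follows. This is a minimal standalone model, not the server's code: the `client` type, the `expiry` and `prunes` fields, the `trackReply` and `simulate` helpers, and the constant values are all assumptions made for illustration. Only the names `replies`, `repliesSincePrune`, `lastReplyPrune`, `replyPermLimit`, `replyPruneTime` and `pruneReplyPerms` come from this PR.

```go
package main

import (
	"fmt"
	"time"
)

// Illustrative values only; the server's real settings may differ.
const (
	replyPermLimit = 4
	replyPruneTime = time.Second
)

// resp records when a reply subject was tracked and how often it was used.
type resp struct {
	t time.Time
	n int
}

// client is a hypothetical, stripped-down stand-in for the server's client type.
type client struct {
	replies           map[string]*resp
	repliesSincePrune int
	lastReplyPrune    time.Time
	expiry            time.Duration // assumed lifetime of a tracked reply subject
	prunes            int           // number of prune passes, kept for illustration
}

// trackReply records a reply subject and prunes only when more than
// replyPermLimit subjects were added since the last prune, or when more
// than replyPruneTime has passed since it.
func (c *client) trackReply(reply string, now time.Time) {
	c.replies[reply] = &resp{t: now}
	c.repliesSincePrune++
	if c.repliesSincePrune > replyPermLimit || now.Sub(c.lastReplyPrune) > replyPruneTime {
		c.pruneReplyPerms(now)
	}
}

// pruneReplyPerms walks the whole map, drops expired entries and resets the throttle.
func (c *client) pruneReplyPerms(now time.Time) {
	for k, r := range c.replies {
		if now.Sub(r.t) > c.expiry {
			delete(c.replies, k)
		}
	}
	c.repliesSincePrune = 0
	c.lastReplyPrune = now
	c.prunes++
}

// simulate tracks n reply subjects at the same instant and returns the client.
func simulate(n int) *client {
	start := time.Unix(0, 0)
	c := &client{replies: map[string]*resp{}, lastReplyPrune: start, expiry: time.Minute}
	for i := 0; i < n; i++ {
		c.trackReply(fmt.Sprintf("_INBOX.%d", i), start)
	}
	return c
}

func main() {
	c := simulate(10)
	// prints: prune passes: 2 entries: 10
	fmt.Println("prune passes:", c.prunes, "entries:", len(c.replies))
}
```

In this model, ten tracked replies cost two passes over the map. A check based only on the map being larger than `replyPermLimit` would have run a pass on each of the last six messages, since nothing had expired yet.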

Resolves #6058

Signed-off-by: Jack Morris <jack@jackmorris.me>

@jack7803m jack7803m requested a review from a team as a code owner October 31, 2024 18:21
@jack7803m
Contributor Author

Unsure how those failing tests could be affected by the minimal changes I made.

Still need to add some sort of solution to the infinite expiry, though I'm not sure exactly what direction to go with that (i.e. error or set to a default), so I'll leave that decision up to the maintainers.

client.replies[string(reply)] = &resp{time.Now(), 0}
if len(client.replies) > replyPermLimit {
client.repliesSincePrune++
if client.repliesSincePrune > replyPermLimit || time.Since(client.lastReplyPrune) > replyPruneTime {
Member


Do we have a sense of how much more memory this will hold onto under heavy load?

Contributor Author


The original issue was that it held onto too much memory and looped through all of it on every message.

I just added some debug statements and found that if the reply subject is already allowed (e.g. "pub": ">"), the reply counter never actually goes up, so that subject can never be pruned until it expires by time. I'm noting this because I'm going to look at a fix for that too, but I'm not sure whether it will have broader effects (hopefully not).
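One way to picture the behavior described above is the toy model below. It is a sketch under heavy assumptions, not the server's permission check: the `client` type, the `pubAllowAll` and `maxMsgs` fields, and both functions are hypothetical, and deleting the entry as soon as the counter reaches `maxMsgs` is a simplification chosen to keep the example short.

```go
package main

import "fmt"

// resp counts how many replies were sent on a tracked subject.
type resp struct{ n int }

// client is a hypothetical, simplified model of a connection's publish permissions.
type client struct {
	pubAllowAll bool             // stands in for a static permission such as "pub": ">"
	replies     map[string]*resp // dynamically tracked reply subjects
	maxMsgs     int              // assumed per-subject response limit
}

// pubAllowed returns early when the static permission matches, so the
// tracked reply entry is never consulted and its counter never increases.
func (c *client) pubAllowed(subject string) bool {
	if c.pubAllowAll {
		return true
	}
	r := c.replies[subject]
	if r == nil {
		return false
	}
	r.n++
	if r.n >= c.maxMsgs {
		delete(c.replies, subject) // used up, so it can be dropped right away
	}
	return true
}

// remainingAfterReply tracks one reply subject, publishes to it once,
// and reports how many tracked entries are left.
func remainingAfterReply(pubAllowAll bool) int {
	c := &client{
		pubAllowAll: pubAllowAll,
		replies:     map[string]*resp{"_INBOX.1": &resp{}},
		maxMsgs:     1,
	}
	c.pubAllowed("_INBOX.1")
	return len(c.replies)
}

func main() {
	// prints: 1 0
	fmt.Println(remainingAfterReply(true), remainingAfterReply(false))
}
```

With the broad static permission, the entry stays in the map after the reply is sent and is left for time-based expiry. Without it, the entry is removed once the reply goes out.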

Assuming subjects are pruned as they're replied to, in the worst case the replies map can grow by at most an extra replyPermLimit entries beyond what was possible before. For that to happen, the map would have to fill with subjects, attempt a prune, and then have them all expire by the next message. In that case, pruning on every message over replyPermLimit would prune immediately, whereas this solution holds onto that memory for the next replyPermLimit messages, making the map size replyPermLimit * 2.

Under normal heavy load it shouldn't make any significant difference, as the current behavior typically should only run the prune once every replyPermLimit messages anyway when it's configured properly.
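The worst case described above can be replayed with a small simulation. This is a sketch under stated assumptions: time is an integer tick, `limit` and `expiry` are made-up values standing in for replyPermLimit and the response expiry, the whole burst is assumed to fall within replyPruneTime (so the time-based trigger is left out), and entries only ever leave the map by expiring.

```go
package main

import "fmt"

// Illustrative assumptions, not server values.
const (
	limit  = 4  // stands in for replyPermLimit
	expiry = 10 // ticks after which a tracked entry is expired
)

// peakSize replays reply arrivals (one tick value per reply) and returns the
// largest map size observed. With throttled == false a prune runs whenever the
// map holds more than limit entries; with throttled == true a prune runs only
// after more than limit entries were added since the last prune.
func peakSize(arrivals []int, throttled bool) int {
	entries := map[int]int{} // entry id -> arrival tick
	sincePrune, peak := 0, 0
	for id, now := range arrivals {
		entries[id] = now
		sincePrune++
		if len(entries) > peak {
			peak = len(entries)
		}
		shouldPrune := len(entries) > limit
		if throttled {
			shouldPrune = sincePrune > limit
		}
		if shouldPrune {
			for k, t := range entries {
				if now-t > expiry {
					delete(entries, k)
				}
			}
			sincePrune = 0
		}
	}
	return peak
}

func main() {
	// Five replies arrive at tick 0 and trigger a prune that frees nothing.
	// Five more arrive at tick 20, after the first five have expired.
	arrivals := []int{0, 0, 0, 0, 0, 20, 20, 20, 20, 20}
	// prints: 6 10
	fmt.Println(peakSize(arrivals, false), peakSize(arrivals, true))
}
```

In this run the throttled policy peaks at 10 entries against 6 for pruning on every oversized insert, a difference of exactly `limit` entries. That matches the estimate of roughly twice replyPermLimit in the worst case.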

Member

@MauriceVanVeen MauriceVanVeen left a comment


LGTM

@jack7803m, the PR's title mentions it's WIP, but is this ready/good to merge?

@neilalexander, would you maybe also want to review, given your comment here? #6058 (comment).
Think this PR would at least relieve some pressure, by not calling client.pruneReplyPerms() every single time just because the map is large enough.

@jack7803m jack7803m changed the title from "[WIP][FIXED] Fix request/reply performance when using allow_responses perms" to "[FIXED] Fix request/reply performance when using allow_responses perms" on Dec 10, 2024
@jack7803m
Contributor Author

Forgot to change the title - this should be good to merge!

@derekcollison
Member

Let's have @neilalexander take a look as well real quick, but then we can get this merged.

@neilalexander neilalexander self-requested a review December 11, 2024 09:37
Member

@neilalexander neilalexander left a comment


I think we probably want to address the non-expiring replies long-term but for now I'm OK with this to improve performance, LGTM.

@derekcollison derekcollison merged commit 88ab06b into nats-io:main Dec 11, 2024
neilalexander added a commit that referenced this pull request Dec 13, 2024
Includes the following:

- #6226
- #6232
- #6235
- #6064
- #6244
- #6246
- #6247
- #6248
- #6250

Signed-off-by: Neil Twigg <neil@nats.io>

Development

Successfully merging this pull request may close these issues.

Severe request/reply performance hit when using allow_responses map [v2.10.20]

4 participants