Response Caching [ASP.NET Core 6 & 7]: cache populated for each concurrent connection rather than once #44696
Comments
Looks like a cache stampede, and casting my mind back to when response caching was made, this might be a known design trait, but it would be good to have others confirm. The new Output Cache system in .NET 7 should not suffer from this, as I believe it implements protection against this. @sebastienros
@DamianEdwards Thanks for pointing it out. With
Looking at your numbers, it seems that latency was higher for Output Cache, but so was throughput. Given everything is local (and without a profile) it's hard to draw any real conclusions from that. @sebastienros do you have any thoughts? Certainly, server profiles from both runs would show what's going on here.
@DamianEdwards The throughput figures with Output Cache are actually lower (1.64 < 2.13), which makes sense given the latency numbers. I'd be glad to provide additional, more reliable data points. The local environment certainly does not provide a strict baseline profile. What server profiles are you referring to?
Ah, sorry, I was misreading the results (Max vs. Avg). By "profile" I mean a performance profile using CPU sampling, via the VS Performance Profiler or
Thanks for contacting us. We're moving this issue to the
Sure thing, will do. For the time being, it appears there's a memory leak in Output Cache. I've just noticed the RSS of my dotnet process going through the roof under the same load and soak test as above, with RSS increasing stepwise roughly each time the cache should be evicted. It's almost a gigabyte now. But when I hit the response-cached endpoint in the same dotnet process, it first steps a bit higher but then plummets back to a more reasonable ~300 MB.
I can upload the project and the profile if it would help.
fwiw, IMemoryCache might be a bottleneck, judging from this comparison from a cache written by an Azure engineer.
@ben-manes the benchmarks actually show that on their machine it supports ~40M transactions per second. I think that shows it won't be the bottleneck.
@demming I confirm that Response Caching doesn't handle cache stampedes like Output Caching does, by design. This is one reason we introduced Output Caching, though in theory we could also implement it in Response Caching in a future release (maybe a contribution?). What version did you use for the benchmarks WRT Output Caching? I updated it to use binary serialization in RC2 in order to improve performance, which might help. However, with Response Caching there is no such serialization step (it always uses in-memory storage), which could explain the performance differences.
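For reference, and as an assumption on my part rather than anything posted in this thread, a minimal Output Caching setup on .NET 7 looks roughly like the following; the policy name, route, duration, and handler body are illustrative placeholders:

var builder = WebApplication.CreateBuilder(args);

// Register output caching with a named policy; name and duration are placeholders.
builder.Services.AddOutputCache(options =>
{
    options.AddPolicy("Website", policy => policy.Expire(TimeSpan.FromSeconds(30)));
});

var app = builder.Build();
app.UseOutputCache();

// Minimal endpoint standing in for the controller action discussed below.
app.MapGet("/website", (string address) => Results.Ok($"fetched {address}"))
   .CacheOutput("Website");

app.Run();

By default, Output Caching also locks per cache key, so concurrent misses for the same resource result in a single handler execution, which is the stampede protection discussed above.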
@demming I'm not seeing enough there to conclude there's an actual memory leak vs. expected allocations and respective GC under pressure. If you have an app and profile you can share, that'd be great.
@sebastienros, @DamianEdwards: I've been out of the loop on this particular issue; I appreciate your staying on track with it. Do you still need my input/assistance on it?
I'm going to cross-reference #53255 here, which is a proposed net9 extension to
I'm more enthusiastic about 1; I'll try a prototype to see how well it works. Update: not hard, but completely untested (and we'd probably want some kind of key isolation/disambiguation):
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
using Microsoft.AspNetCore.OutputCaching;
namespace Microsoft.Extensions.Caching.Distributed.Tests;
// context: experiment for https://github.com/dotnet/aspnetcore/issues/44696
internal class OutputCacheWrapper(ReadThroughCache underlying) : IOutputCacheStore
{
ValueTask IOutputCacheStore.EvictByTagAsync(string tag, CancellationToken cancellationToken)
=> underlying.RemoveTagAsync(tag, cancellationToken);
async ValueTask<byte[]?> IOutputCacheStore.GetAsync(string key, CancellationToken cancellationToken)
=> (await underlying.GetAsync<byte[]>(key, null, cancellationToken)).Value;
ValueTask IOutputCacheStore.SetAsync(string key, byte[]? value, string[]? tags, TimeSpan validFor, CancellationToken cancellationToken)
=> value is null
? underlying.RemoveKeyAsync(key, cancellationToken)
: underlying.SetAsync<byte[]>(key, value, new ReadThroughCacheEntryOptions(validFor), tags, cancellationToken);
}
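If the proposal in #53255 lands, wiring the sketch above into output caching would presumably be a one-liner; the following is hypothetical, since ReadThroughCache hasn't shipped and its own registration isn't shown:

// Hypothetical wiring: ReadThroughCache is only a proposal (#53255), so both
// its registration and this store override are illustrative, not a shipped API.
builder.Services.AddOutputCache();
builder.Services.AddSingleton<IOutputCacheStore, OutputCacheWrapper>();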
Is there an existing issue for this?
Describe the bug
Response caching goes bust when multiple concurrent inbound connections arrive.
It appears that cache population begins with the first concurrent inbound request hitting the endpoint and doesn't finish until all 100 of them have arrived. So each of them invokes the service, and GC can't cope with the load. The number 100 is chosen here almost arbitrarily. It can't serve C10k requests with response caching; it just crashes.
If, however, I run a preliminary request against the cached endpoint as described below in "Reproduction," then the cache gets populated, all 100 concurrent requests are served from the cache as expected, and only 1 service invocation occurs. If instead I let that cache expire, then 101 service invocations take place.
When properly served from cache and the runtime VM is warmed up, my MacBook Air 2020 gives 3-4 GB/s throughput on that ASP.NET Core endpoint with connection reuse according to bombardier, whereas my Akka HTTP implementation with an in-memory Caffeine cache gives 7-9 GB/s, all other things being equal, with up to tenfold smaller 99th-percentile latency. I'm not sure where to look for the root cause of this performance mismatch.
More importantly, the runtime throws a System.OutOfMemory exception when the Duration is not long enough. I suppose multiple cache populations and evictions take place at the same time, they overlap, and RSS goes beyond 1 GiB before shrinking, while still serving requests from the then-populated cache.
Expected Behavior
I expect it to populate the cache on the first incoming request, regardless of how many identical inbound requests arrive at the same time or prior to cache population. All but one request should be served from the cache. Currently they all trigger cache population. Perhaps one of them should simply be picked at random to populate the cache.
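Until Response Caching gains that behavior, a minimal application-level mitigation (my sketch, not code from this issue) is a single-flight guard in front of the expensive call, so that concurrent misses for the same key share one invocation:

using System.Collections.Concurrent;
using Microsoft.Extensions.Caching.Memory;

// Single-flight sketch: the first caller per key runs the factory; concurrent
// callers for the same key await the same task instead of invoking it again.
public sealed class SingleFlightCache
{
    private readonly IMemoryCache _cache;
    private readonly ConcurrentDictionary<string, Lazy<Task<string>>> _inFlight = new();

    public SingleFlightCache(IMemoryCache cache) => _cache = cache;

    public async Task<string> GetOrCreateAsync(string key, TimeSpan ttl, Func<Task<string>> factory)
    {
        if (_cache.TryGetValue(key, out string? cached) && cached is not null)
            return cached;

        var lazy = _inFlight.GetOrAdd(key, _ => new Lazy<Task<string>>(factory));
        try
        {
            var value = await lazy.Value;
            _cache.Set(key, value, ttl);
            return value;
        }
        finally
        {
            // Clear the in-flight entry so the next expiry triggers exactly one refresh.
            _inFlight.TryRemove(key, out _);
        }
    }
}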
Steps To Reproduce
I define a very simple controller with just the following endpoint, which calls a service. Nothing fancy.
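The reporter's snippet wasn't preserved in this copy of the issue; the following is only a plausible reconstruction, with the route, query key, and cache duration inferred from the repro URL below, and IWebsiteService a hypothetical name sketched in the next snippet:

using Microsoft.AspNetCore.Mvc;

// Illustrative reconstruction only. VaryByQueryKeys requires the Response Caching middleware.
[ApiController]
[Route("[controller]")]
public class WebsiteController : ControllerBase
{
    private readonly IWebsiteService _service;

    public WebsiteController(IWebsiteService service) => _service = service;

    [HttpGet]
    [ResponseCache(Duration = 30, VaryByQueryKeys = new[] { "address" })]
    public async Task<ContentResult> Get(string address)
        => Content(await _service.FetchAsync(address), "text/html");
}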
Here's the service that gets invoked
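Again only an illustrative reconstruction, since the real implementation isn't shown above; the Exceptions section below mentions Ganss.Xss.HtmlSanitizer inside a singleton service, so something along these lines:

using Ganss.Xss;

public interface IWebsiteService
{
    Task<string> FetchAsync(string address);
}

// Hypothetical singleton service: fetch a page and sanitize it. The download
// plus sanitization is the expensive work that runs once per cache miss.
public class WebsiteService : IWebsiteService
{
    private readonly HttpClient _http;
    private readonly HtmlSanitizer _sanitizer = new();

    public WebsiteService(HttpClient http) => _http = http;

    public async Task<string> FetchAsync(string address)
    {
        var html = await _http.GetStringAsync($"http://{address}/");
        return _sanitizer.Sanitize(html);
    }
}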
Now if I run a load test on this endpoint on Kestrel,
bombardier -c 100 -d 10s "http://localhost:5104/website?address=localhost:8080"
I observe exactly 100 service invocations (each of which is an expensive operation). To prevent this from happening, all I need to do is run a single preliminary request against this endpoint (e.g., something like the request shown below), which populates the cache, so that the same 100 concurrent connections immediately afterwards are served from the cache and no additional invocations take place.
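The original warm-up command isn't preserved in this copy; any single request to the same URL would do, for example (illustrative only):

curl "http://localhost:5104/website?address=localhost:8080"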
Exceptions (if any)
System.OutOfMemory from Ganss.Xss.HtmlSanitizer (inside a singleton service) due to multiple concurrent invocations.

.NET Version
6 and 7-rc2

Anything else?
ASP.NET Core 6.0 and 7.0-rc2