Reduce string allocations per request #441

cesarblum · 2015-12-02T04:20:38Z

There is a fixed set of methods and a fixed set of HTTP versions that we'll see in most requests. When we see those we can reuse the same string instead of allocating a new one. If not, we fall back to allocating a new string.

Since all methods we're checking for plus the HTTP versions fit in 8 bytes, I'm pre-computing longs containing those bytes to make the comparisons faster.
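
A minimal sketch of that idea, for illustration only. The class and helper names here are hypothetical, and only the XOR-and-shift comparison mirrors a line from this PR's diff. It assumes the first byte of the buffer ends up in the lowest byte of the long, which is what a little-endian 8-byte read would produce.

```csharp
using System;
using System.Text;

public static class KnownStringSketch
{
    public const string HttpGetMethod = "GET";

    // Pre-computed once: "GET " (method plus trailing space) packed into a long.
    private static readonly long GetLong = Pack("GET ");

    // Packs up to 8 ASCII characters into a long, first character in the lowest byte.
    public static long Pack(string ascii)
    {
        long value = 0;
        for (int i = 0; i < ascii.Length && i < 8; i++)
        {
            value |= (long)(ascii[i] & 0xFF) << (8 * i);
        }
        return value;
    }

    // Reads up to 8 bytes starting at offset, using the same byte layout as Pack.
    public static long ReadLong(byte[] buffer, int offset)
    {
        long value = 0;
        for (int i = 0; i < 8 && offset + i < buffer.Length; i++)
        {
            value |= (long)buffer[offset + i] << (8 * i);
        }
        return value;
    }

    public static string GetMethod(byte[] buffer, int offset)
    {
        long scanLong = ReadLong(buffer, offset);

        // "GET " occupies the low 4 bytes; shifting the XOR left by 32 bits
        // discards the other 4 bytes, so zero means those 4 bytes matched.
        if (((scanLong ^ GetLong) << 32) == 0)
        {
            return HttpGetMethod; // reuse the shared string, no allocation
        }

        // Unknown method: fall back to allocating a new string.
        int end = Array.IndexOf(buffer, (byte)' ', offset);
        if (end < 0) end = buffer.Length;
        return Encoding.ASCII.GetString(buffer, offset, end - offset);
    }
}
```

Calling `KnownStringSketch.GetMethod` on the bytes of a `GET` request line returns the shared constant, while an unrecognized method takes the allocating path.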

cc @halter73 @davidfowl @benaadams @DamianEdwards

benaadams · 2015-12-02T11:10:49Z

Related: #411, which removes all string allocations on repeated requests over a keep-alive connection; this PR, however, resolves common allocations across all requests.

halter73 · 2015-12-02T18:58:08Z

src/Microsoft.AspNet.Server.Kestrel/Infrastructure/MemoryPoolIterator2Extensions.cs

+
+            if (httpVersion != null)
+            {
+                for (int i = 0; i < 8; i++) scan.Take();


Can we avoid this by having PeekLong advance the iterator for us?

halter73 · 2015-12-02T19:07:19Z

@benaadams After discussing #411 with @davidfowl and @CesarBS we were thinking about closing it due to the relative complexity of managing a per-connection invalidating cache to reduce string allocations.

I didn't want to start that discussion before offering an alternative, and this is it. I prefer this approach because it has a better (or at least easier to analyze) worst-case performance, both memory- and CPU-wise. This might also have a better per-connection memory footprint, since we don't have to allocate a new cache each time.

@benaadams What do you think?

halter73 · 2015-12-02T19:08:13Z

src/Microsoft.AspNet.Server.Kestrel/Infrastructure/MemoryPoolIterator2Extensions.cs

+            {
+                httpMethod = HttpDeleteMethod;
+            }
+            else if (((scanLong ^ _httpGetMethodLong) << 32) == 0)


Super nitpicky, but we might as well check for GETs and POSTs first since I assume they are the most common methods.

benaadams · 2015-12-02T20:08:50Z

@halter73 👍 to this as it resolves expected shared strings in known locations; also the two changes aren't in conflict.

Want a discussion over on other issue? Or was this the discussion? 😉

halter73 · 2015-12-02T20:17:02Z

You are right that the two changes aren't mutually exclusive, but I'm still not sold on the StringPool yet. I was hoping this change would help convince you that the StringPool isn't necessary 😉

We can continue discussing that over on the other issue though.

cesarblum · 2015-12-03T20:17:23Z

I was able to optimize this even further. Given we have a set of 11 known strings that will never change, I went looking for a divisor that would yield a unique modulo value for each string's long representation. Turns out there is such a value (37). So now we have a perfect hash of the known strings and matching the input to those is just a matter of clearing uninteresting bits in the input and looking up the resulting value, then making sure the longs are actually the same.
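
A rough sketch of the lookup, for illustration only. The thread does not list the 11 strings or the exact byte packing, so both are assumptions here, as are all the names. For that reason the sketch searches for a collision-free divisor instead of hard-coding 37, and it expects the caller to have already cleared the uninteresting bits of the input.

```csharp
using System;
using System.Collections.Generic;

public static class PerfectHashSketch
{
    // Assumed set of known strings (the thread only gives the count, 11).
    private static readonly string[] Known =
    {
        "GET ", "POST ", "PUT ", "HEAD ", "DELETE ", "PATCH ",
        "TRACE ", "OPTIONS ", "CONNECT ", "HTTP/1.0", "HTTP/1.1"
    };

    public static readonly int Modulus;
    private static readonly ulong[] SlotKeys;
    private static readonly string[] SlotStrings;

    static PerfectHashSketch()
    {
        var keys = new ulong[Known.Length];
        for (int i = 0; i < Known.Length; i++) keys[i] = Pack(Known[i]);

        Modulus = FindModulus(keys);
        SlotKeys = new ulong[Modulus];
        SlotStrings = new string[Modulus];
        for (int i = 0; i < keys.Length; i++)
        {
            int slot = (int)(keys[i] % (ulong)Modulus);
            SlotKeys[slot] = keys[i];
            SlotStrings[slot] = Known[i].TrimEnd();
        }
    }

    // Packs up to 8 ASCII characters, first character in the lowest byte.
    public static ulong Pack(string ascii)
    {
        ulong value = 0;
        for (int i = 0; i < ascii.Length && i < 8; i++)
        {
            value |= (ulong)(ascii[i] & 0xFF) << (8 * i);
        }
        return value;
    }

    // Smallest divisor for which every key has a distinct remainder.
    public static int FindModulus(ulong[] keys)
    {
        for (int m = keys.Length; ; m++)
        {
            var seen = new HashSet<ulong>();
            bool unique = true;
            foreach (ulong key in keys)
            {
                if (!seen.Add(key % (ulong)m)) { unique = false; break; }
            }
            if (unique) return m;
        }
    }

    // Returns the shared string for a known input, or null if it is not known.
    public static string Lookup(ulong maskedInput)
    {
        int slot = (int)(maskedInput % (ulong)Modulus);
        // The remainder only picks a slot; the full comparison confirms the match.
        return SlotKeys[slot] == maskedInput ? SlotStrings[slot] : null;
    }
}
```

Searching for the divisor at startup also guards against the collision risk of adding a new known string later: the table is rebuilt with whatever divisor still separates all keys.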

cesarblum · 2015-12-03T20:18:31Z

src/Microsoft.AspNet.Server.Kestrel/Properties/launchSettings.json

@@ -0,0 +1,3 @@
+{

revert

benaadams · 2015-12-03T20:24:57Z

> So now we have a perfect hash of the known strings and matching the input to those is just a matter of ...

Now we're cooking 👍

pakrym · 2015-12-03T20:25:40Z

I think we need to make sure that nobody accidentally gets a hash collision when adding a new string to the known strings. Also, please add more comments to the initialization code, because without the context of this issue it looks spooky.

cesarblum · 2015-12-03T20:31:01Z

@pakrym Good point 👍 I'll add comments explaining what is going on.

cesarblum · 2015-12-03T23:37:07Z

This is really unfortunate, but the 30% figure from my initial tests was for very simple requests (no headers; I should've thought harder about that). With more realistic requests the reduction is a lot smaller (around 1%) 😞

I'm looking for more places where I can apply a similar optimization.

benaadams · 2015-12-05T11:02:40Z

LGTM

My only comment: GetKnownString could be split into two methods, GetHttpVersionString and GetMethodString, but that might be something for the future, when the hash breaks and there are more types.

halter73 · 2015-12-08T01:06:34Z

I'm waiting on verification that this will work on big-endian architectures before merging.

cesarblum · 2015-12-08T01:14:55Z

@halter73 It's going to be hard to check that. Raspberry Pi distros set the processor to little endian. Actually little endian seems to be the default on ARM environments. I'm not sure it's worth verifying this right now. I could enter a bug to track that and move on with this change.

cesarblum · 2015-12-08T01:20:19Z

I've emailed @stephentoub asking if they test Core on big endian. If they do I might try the same as them to test this.

benaadams · 2015-12-08T01:22:46Z

Would there be problems with the Frame header collection also?

cesarblum · 2015-12-08T01:23:46Z

@benaadams Can you elaborate? I don't see what you mean.

benaadams · 2015-12-08T01:26:17Z

@CesarBS like https://github.com/aspnet/KestrelHttpServer/blob/dev/src/Microsoft.AspNet.Server.Kestrel/Http/FrameHeaders.Generated.cs#L8127

cesarblum · 2015-12-08T01:29:27Z

@benaadams Oh, with regard to endianness, you mean? I hadn't looked at that code, but it's likely affected by it.

cesarblum · 2015-12-08T02:25:28Z

Ok, I did much better controlled tests and these are the results I got:

Before: [image not preserved]

After: [image not preserved]

Before: [image not preserved]

After: [image not preserved]

(Look at the allocation percentages in the last two)

That looks like some improvement to me 😀

I tested it with wrk, from a remote machine:

wrk -c 256 -t 32 -d 10 http://<local address>:5000

cesarblum · 2015-12-08T02:27:46Z

@stephentoub replied that they don't test Core on big endian. Again I'd say we can postpone verification on big endian.

halter73 · 2015-12-08T08:03:28Z

@CesarBS Aside from potentially addressing further feedback, do you think this PR is complete?

stephentoub · 2015-12-08T13:05:19Z

> @stephentoub replied that they don't test Core on big endian. Again I'd say we can postpone verification on big endian.

I'd suggest at least adding a Debug.Assert(BitConverter.IsLittleEndian) to the relevant code.

cesarblum · 2015-12-08T16:48:39Z

@halter73 Yes.

cesarblum · 2015-12-08T17:56:31Z

@stephentoub I'd rather not. This code likely does not work on big endian, but it doesn't break things. It'll just go through a less optimized path.

stephentoub · 2015-12-08T18:01:19Z

> I'd rather not. This code likely does not work on big endian, but it doesn't break things. It'll just go through a less optimized path.

I've not looked at the code in depth. Just looked like wrong values would be computed in big endian. If that's not true, then an assert isn't valuable. But if running on a big endian system would start resulting in erroneous results, an assert would help to point out the problem immediately, and there's little downside. Like I said, though, I've not looked much at the code, so you know better than I whether there's a problem.

cesarblum · 2015-12-08T18:03:31Z

@stephentoub On second thought, I noticed someone could craft some weird requests (not necessarily malicious) and this would not behave as expected. I'll add the assert.

cesarblum · 2015-12-08T19:28:16Z

Went with a slightly different approach. Instead of an assert I'm just checking BitConverter.IsLittleEndian and skipping the optimized path if it's false.
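
Roughly, the guard looks like the following sketch. It is illustrative only: the names are hypothetical, only one known version string is shown, and the real code reads from the memory pool iterator, not a byte array.

```csharp
using System;
using System.Text;

public static class EndianGuardSketch
{
    public const string Http11Version = "HTTP/1.1";

    // "HTTP/1.1" as a little-endian 8-byte read sees it: 'H' (0x48) in the lowest byte.
    private const long Http11Long = 0x312E312F50545448L;

    public static string GetHttpVersion(byte[] buffer, int offset)
    {
        // Optimized path: only taken on little-endian machines, where the
        // pre-computed constant matches the byte order of the 8-byte read.
        if (BitConverter.IsLittleEndian
            && buffer.Length - offset >= 8
            && BitConverter.ToInt64(buffer, offset) == Http11Long)
        {
            return Http11Version; // shared string, no allocation
        }

        // Slow path (big endian, short buffer, or unknown version): allocate.
        int count = Math.Min(8, buffer.Length - offset);
        return Encoding.ASCII.GetString(buffer, offset, count);
    }
}
```

On a big-endian machine every request would take the slow path, so results stay correct and only the allocation saving is lost.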

cesarblum · 2015-12-08T22:50:46Z

Squashed.

AspNetSmurfLab · 2015-12-14T20:35:27Z

Branchmarks:

dev

Running 15s test @ http://10.0.0.100:5001/plaintext
32 threads and 256 connections
Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.10ms   12.44ms 393.12ms   96.10%
    Req/Sec    24.62k     2.11k   67.62k    93.12%
11787922 requests in 15.10s, 1.45GB read
Socket errors: connect 0, read 0, write 177, timeout 0
Requests/sec: 780789.10
Transfer/sec:     98.29MB

This PR

Running 15s test @ http://10.0.0.100:5001/plaintext
32 threads and 256 connections
Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.94ms   12.55ms 211.73ms   94.72%
    Req/Sec    25.26k     1.87k   54.54k    87.10%
12106391 requests in 15.10s, 1.49GB read
Requests/sec: 801743.82
Transfer/sec:    100.93MB

@CesarBS ran those several times and the RPS was consistently around those marks for each branch.

benaadams · 2015-12-14T20:45:32Z

LGTM! 👍

cesarblum · 2015-12-16T17:04:59Z

Ping.

halter73 · 2015-12-16T18:40:21Z

cesarblum · 2015-12-16T18:40:43Z

Yay 😀

dnfclas added the cla-already-signed label Dec 2, 2015

halter73 reviewed Dec 2, 2015
benaadams mentioned this pull request Dec 2, 2015

Request header StringCache #411

Closed

cesarblum reviewed Dec 3, 2015

src/Microsoft.AspNet.Server.Kestrel/Properties/launchSettings.json

@@ -0,0 +1,3 @@
+{

revert

cesarblum changed the title ~~Reduce string allocations per request by 30%~~ Reduce string allocations per request Dec 3, 2015

cesarblum force-pushed the cesarbs/perf-alloc-optimizations branch from d509edf to bf927e5 Compare December 3, 2015 23:26

cesarblum force-pushed the cesarbs/perf-alloc-optimizations branch from cbc261c to 83f55c5 Compare December 8, 2015 01:59

cesarblum force-pushed the cesarbs/perf-alloc-optimizations branch from 83f55c5 to 56a5cd1 Compare December 8, 2015 19:27

cesarblum force-pushed the cesarbs/perf-alloc-optimizations branch from 56a5cd1 to 20e6862 Compare December 8, 2015 22:50

benaadams mentioned this pull request Dec 9, 2015

[Testing] Combo changes #458

Closed

cesarblum force-pushed the cesarbs/perf-alloc-optimizations branch from 20e6862 to 49439e8 Compare December 14, 2015 20:45

cesarblum closed this Dec 16, 2015

cesarblum force-pushed the cesarbs/perf-alloc-optimizations branch from 49439e8 to 349af50 Compare December 16, 2015 19:00

cesarblum deleted the cesarbs/perf-alloc-optimizations branch December 16, 2015 19:00
