http: O(1) custom header map#11546

Merged
mattklein123 merged 5 commits into master from custom_headers on Jun 17, 2020

@mattklein123
Member

Implement custom O(1) header registration for header maps.

  1. Remove virtual inheritance from header maps.
  2. Implement variable inline storage for concrete header maps.
  3. Various TODO/logic cleanups from previous header refactors.
  4. Demonstrate what custom header registration looks like for the
    CORS filter. More cleanup of this type can be done.
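
To make items 2 and 4 more concrete, here is a minimal sketch of how static O(1) header registration with runtime-sized inline storage could work. It is not the code from this PR: `InlineHeaderRegistry`, `SketchHeaderMap`, `registerHeader`, `slotCount`, `setInline`, `getInline` and `kOriginHandle` are all hypothetical names, and the sketch assumes every registration happens before the first map is constructed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical registry, for illustration only; this is not Envoy's API.
class InlineHeaderRegistry {
public:
  using Handle = std::size_t;

  // Returns the slot index for a header name, adding a slot on first registration.
  static Handle registerHeader(const std::string& lower_case_name) {
    std::vector<std::string>& names = mutableNames();
    for (Handle i = 0; i < names.size(); ++i) {
      if (names[i] == lower_case_name) {
        return i; // Registering the same name twice yields the same slot.
      }
    }
    names.push_back(lower_case_name);
    return names.size() - 1;
  }

  static std::size_t slotCount() { return mutableNames().size(); }

private:
  // A function-local static avoids static initialization order problems.
  static std::vector<std::string>& mutableNames() {
    static std::vector<std::string> names;
    return names;
  }
};

class SketchHeaderMap {
public:
  // The number of inline slots is only known at runtime, once registration is done.
  SketchHeaderMap() : slots_(InlineHeaderRegistry::slotCount()) {}

  void setInline(InlineHeaderRegistry::Handle handle, std::string value) {
    Slot& slot = slots_.at(handle);
    slot.present = true;
    slot.value = std::move(value);
  }

  // O(1): a bounds-checked array index, no string comparison. Returns nullptr if unset.
  const std::string* getInline(InlineHeaderRegistry::Handle handle) const {
    const Slot& slot = slots_.at(handle);
    return slot.present ? &slot.value : nullptr;
  }

private:
  struct Slot {
    bool present = false;
    std::string value;
  };
  std::vector<Slot> slots_;
};

// A filter (CORS in this PR) would register the headers it cares about once, up front.
static const InlineHeaderRegistry::Handle kOriginHandle =
    InlineHeaderRegistry::registerHeader("origin");
```

In this sketch a filter keeps the handle it got at registration time and passes it to `getInline`, so a lookup costs an index rather than a scan over all headers. A header that nobody registers costs no slot at all.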

Fixes #4815

Risk Level: High
Testing: TBD
Docs Changes: N/A
Release Notes: N/A

Signed-off-by: Matt Klein <mklein@lyft.com>
@mattklein123
Member Author

Here is a perf delta between the old header map and the new one on the benchmark:

ubuntu@ip-10-0-0-215:~$ Source/benchmark/tools/compare.py benchmarks ./old_header.json ./new_header.json 
Comparing ./old_header.json to ./new_header.json
Benchmark                                          Time             CPU      Time Old      Time New       CPU Old       CPU New
-------------------------------------------------------------------------------------------------------------------------------
HeaderMapImplCreate                             +0.4041         +0.4041            13            19            13            19
HeaderMapImplSetReference/0                     +0.0491         +0.0491            46            48            46            48
HeaderMapImplSetReference/1                     +0.0372         +0.0372            53            55            53            55
HeaderMapImplSetReference/10                    +0.0150         +0.0150           141           143           141           143
HeaderMapImplSetReference/50                    +0.0192         +0.0192           338           344           338           344
HeaderMapImplGet/0                              -0.0026         -0.0026            10            10            10            10
HeaderMapImplGet/1                              +0.0407         +0.0407            19            19            19            19
HeaderMapImplGet/10                             +0.0223         +0.0223           106           108           106           108
HeaderMapImplGet/50                             +0.0204         +0.0204           300           306           300           306
HeaderMapImplGetInline/0                        +1.4427         +1.4427             2             6             2             6
HeaderMapImplGetInline/1                        +1.4472         +1.4472             2             6             2             6
HeaderMapImplGetInline/10                       +1.4388         +1.4387             2             6             2             6
HeaderMapImplGetInline/50                       +1.4444         +1.4443             2             6             2             6
HeaderMapImplSetInlineMacro/0                   +0.2190         +0.2191             9            11             9            11
HeaderMapImplSetInlineMacro/1                   +0.2246         +0.2246             9            11             9            11
HeaderMapImplSetInlineMacro/10                  +0.1733         +0.1732            10            11            10            11
HeaderMapImplSetInlineMacro/50                  +0.2539         +0.2539             9            12             9            12
HeaderMapImplSetInlineInteger/0                 -0.0202         -0.0202            28            27            28            27
HeaderMapImplSetInlineInteger/1                 +0.0159         +0.0159            28            28            28            28
HeaderMapImplSetInlineInteger/10                -0.0111         -0.0111            28            27            28            27
HeaderMapImplSetInlineInteger/50                -0.0142         -0.0143            28            27            28            27
HeaderMapImplGetByteSize/0                      -0.1622         -0.1622             1             1             1             1
HeaderMapImplGetByteSize/1                      -0.1627         -0.1627             1             1             1             1
HeaderMapImplGetByteSize/10                     -0.1495         -0.1494             1             1             1             1
HeaderMapImplGetByteSize/50                     -0.1522         -0.1522             1             1             1             1
HeaderMapImplIterate/0                          +0.0611         +0.0611             2             2             2             2
HeaderMapImplIterate/1                          +0.2846         +0.2846             3             4             3             4
HeaderMapImplIterate/10                         +0.0197         +0.0197            14            15            14            15
HeaderMapImplIterate/50                         +0.0061         +0.0061            74            74            74            74
HeaderMapImplLookup/0                           +0.5095         +0.5095            13            20            13            20
HeaderMapImplLookup/1                           +0.5076         +0.5076            13            20            13            20
HeaderMapImplLookup/10                          +0.5062         +0.5062            13            20            13            20
HeaderMapImplLookup/50                          +0.5088         +0.5088            13            20            13            20
HeaderMapImplRemove/0                           +0.0540         +0.0540            53            56            53            56
HeaderMapImplRemove/1                           +0.0334         +0.0334            60            62            60            62
HeaderMapImplRemove/10                          +0.0245         +0.0244           148           151           148           151
HeaderMapImplRemove/50                          +0.0221         +0.0222           345           352           345           352
HeaderMapImplRemoveInline/0                     +0.2864         +0.2863            67            86            67            86
HeaderMapImplRemoveInline/1                     +0.2814         +0.2813            67            86            67            86
HeaderMapImplRemoveInline/10                    +0.3723         +0.3723            67            92            67            92
HeaderMapImplRemoveInline/50                    +0.3782         +0.3782            67            92            67            92
HeaderMapImplPopulate                           +0.2822         +0.2822           412           528           412           528

There is definitely a cost for the new implementation as it involves some more cycles of indirection to look up various pieces of data, and a few more virtual calls. I can likely run some of this through callgrind and optimize it some more. Overall though I think the cost is likely worth it given the cleanup and the opportunity to have more custom headers, and compile out custom headers that the user doesn't care about.

Note that there is further cleanup work I want to do. This includes removing more default headers and moving them into extensions, and making the route header add/remove functions look up O(1) headers ahead of time and use those, if they are defined, instead of the O(N) versions.
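
The route add/remove idea above could look roughly like the following sketch: resolve the configured header name to an O(1) slot once, when the config is loaded, and only fall back to the O(N) scan when the name was never registered. Everything here (`registeredNames`, `lookupHandle`, `SketchHeaders`, `HeaderRemover`, and the fixed list of registered names) is a hypothetical stand-in rather than Envoy code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; none of this is Envoy's API.
constexpr std::ptrdiff_t kNotRegistered = -1;

// A fixed list of "registered" O(1) header names, assumed for this sketch.
inline const std::vector<std::string>& registeredNames() {
  static const std::vector<std::string> names{"origin", "x-request-id"};
  return names;
}

// Returns the slot index for a registered name, or kNotRegistered.
inline std::ptrdiff_t lookupHandle(const std::string& name) {
  const std::vector<std::string>& names = registeredNames();
  const auto it = std::find(names.begin(), names.end(), name);
  return it == names.end() ? kNotRegistered : it - names.begin();
}

struct SketchHeaders {
  std::vector<std::string> inline_values = std::vector<std::string>(registeredNames().size());
  std::vector<bool> inline_present = std::vector<bool>(registeredNames().size(), false);
  std::vector<std::pair<std::string, std::string>> generic; // unregistered headers
};

// Built once when the route config is loaded, then applied to every request.
class HeaderRemover {
public:
  explicit HeaderRemover(std::string name)
      : name_(std::move(name)), handle_(lookupHandle(name_)) {}

  bool usesInlinePath() const { return handle_ != kNotRegistered; }

  void apply(SketchHeaders& headers) const {
    if (usesInlinePath()) {
      // O(1): the slot was resolved at config time.
      const std::size_t slot = static_cast<std::size_t>(handle_);
      headers.inline_present[slot] = false;
      headers.inline_values[slot].clear();
      return;
    }
    // O(N) fallback: scan the generic entries by name.
    headers.generic.erase(
        std::remove_if(headers.generic.begin(), headers.generic.end(),
                       [this](const std::pair<std::string, std::string>& entry) {
                         return entry.first == name_;
                       }),
        headers.generic.end());
  }

private:
  std::string name_;      // declared first so handle_ can be derived from it
  std::ptrdiff_t handle_;
};
```

The string comparison cost is paid once per configured header at load time, not once per request.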

I also want to add some printing of registered headers and sizes at server startup. I will do that on this PR before we finalize it. @jmarantz I wanted to get some initial feedback of this refreshed code though. I have fixed all the previous comments and the impl is even simpler now.

@jmarantz
Contributor

Thanks for fleshing this out!

Should we also try a pure flat_hash_map with int-ordered entries, just to see what perf looks like there, and compare in one table the performance of this vs. the lazy-map variants described in https://docs.google.com/document/d/1yxyRODqaLwMyzBGRkzF_6DJGCGfp4nfjzkUIhtQ8Tc0/edit and prototyped in #9261?

@mattklein123
Member Author

Should we also try a pure flat_hash_map with int-ordered entries, just to see what perf looks like there, and compare in one table the performance of this vs. the lazy-map variants described in https://docs.google.com/document/d/1yxyRODqaLwMyzBGRkzF_6DJGCGfp4nfjzkUIhtQ8Tc0/edit and prototyped in #9261?

The issue with ^ is that IMO to do it justice we need to flesh out the benchmarks we have quite a bit, to account for the different perf profiles that we expect to see with the different implementations (e.g. re-ordering on encoding, possibly maintaining a linked hash map, building the lookaside map, etc.). My preference would be to land this now: we understand the perf profile, it will improve perf in certain cases, and the perf degradation above over the existing impl is on the order of single digit nanoseconds per operation. After that I can start poking at some other stuff in my spare time. WDYT? If we feel that we want to compare this with the other potential implementations, that will take a lot more time.

@jmarantz
Contributor

I agree that having a comprehensive suite that evaluates the best strategy across realistic use-cases is out-of-scope.

I was just thinking about getting a matrix of results for the existing speed tests.

@htuch would be interested in your take here too.

@adisuissa
Contributor

Looks cleaner to me, and more readable, thanks!
The registration of custom headers seems straight-forward.
SGTM!

@htuch
Member

htuch commented Jun 11, 2020

Yeah, I think landing this is totally fine as an immediate step. I'd love to see the absl::flat_hash_map comparison at some point, because I think there is a lot of opportunity to simplify if we can find a way to make that work (and even if we don't, we would have convinced ourselves that the current level of complexity is necessary).

@mattklein123
Member Author

I'm willing to work on additional implementations and benchmarks after I land this, but if we agree that we can land this now that would avoid it going out of date as we do that. Since it sounds like we agree, I will finish the PR and get it into its final state, but PTAL now and lmk if you have any major comments.

@jmarantz left a comment
Contributor

I confess I haven't been through this in detail but I gave it a first pass. I don't think I'll be able to take another look today as my schedule is tight, but I'll try to pore over it this weekend.

Just some small notes so far.

Signed-off-by: Matt Klein <mklein@lyft.com>
Signed-off-by: Matt Klein <mklein@lyft.com>
@mattklein123
Member Author

@jmarantz this should be ready for final review.

yanavlasov previously approved these changes Jun 16, 2020

@adisuissa left a comment
Contributor

Seems good to me, just added a question.

jmarantz previously approved these changes Jun 16, 2020

@jmarantz left a comment
Contributor

Overall this looks great. Would you consider adding a header_map.md file to describe the architecture, data structure diagram, threading model, life-of-a-custom-header, etc?

A few minor perf/commenting nits.

@mattklein123
Member Author

Would you consider adding a header_map.md file to describe the architecture, data structure diagram, threading model, life-of-a-custom-header, etc?

Yes will do.

Signed-off-by: Matt Klein <mklein@lyft.com>
Signed-off-by: Matt Klein <mklein@lyft.com>
@mattklein123 mattklein123 dismissed stale reviews from jmarantz and yanavlasov via 17a0f39 June 17, 2020 16:35
@mattklein123
Member Author

@jmarantz updated PTAL and lmk if you want to see anything else in the markdown file.

  const auto http_response_status = Http::Utility::getResponseStatus(*headers);
  const auto grpc_status = Common::getGrpcStatus(*headers);
- callbacks_.onReceiveInitialMetadata(end_stream ? std::make_unique<Http::ResponseHeaderMapImpl>()
+ callbacks_.onReceiveInitialMetadata(end_stream ? Http::ResponseHeaderMapImpl::create()
Contributor

clang-tidy doesn't like the pre-existing std::move of headers below, followed on line 115 by a reference to the moved object. I'm guessing you'll want to fix this somehow; maybe by holding a ref in a temp?

Member Author

I added a TODO below. This is a pre-existing bug and I would rather not fix this in this change.
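
For readers unfamiliar with the clang-tidy warning discussed in this thread, here is a generic sketch of the use-after-move shape and two possible ways around it, including the "hold a ref in a temp" suggestion. The real call site is not shown in this conversation, so `Message`, `Sink`, `readThenMove` and `holdReference` are hypothetical, and the actual fix in Envoy may look different.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical types for illustration only.
struct Message {
  std::string status;
};

struct Sink {
  std::unique_ptr<Message> last;
  // Takes ownership of the message.
  void consume(std::unique_ptr<Message>&& message) { last = std::move(message); }
};

// The problematic shape, left as a comment because it is undefined behaviour:
//
//   sink.consume(std::move(msg));
//   return msg->status;  // msg is null after the move
//
// Option 1: read what is needed before ownership is transferred.
std::string readThenMove(Sink& sink, std::unique_ptr<Message> msg) {
  const std::string status = msg->status; // copy taken while msg is still valid
  sink.consume(std::move(msg));
  return status;
}

// Option 2: hold a reference to the pointee in a temporary. This is only safe
// while the new owner (here, sink) keeps the object alive.
std::string holdReference(Sink& sink, std::unique_ptr<Message> msg) {
  const Message& ref = *msg;
  sink.consume(std::move(msg));
  return ref.status;
}
```

Option 1 has no lifetime dependency on the callee. Option 2 avoids a copy but relies on the callee not destroying the object before the reference is read.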

@jmarantz left a comment
Contributor

looks great, thank you!

@mattklein123 mattklein123 merged commit 8262a0d into master Jun 17, 2020
@mattklein123 mattklein123 deleted the custom_headers branch June 17, 2020 23:36
florincoras added a commit to florincoras/envoy that referenced this pull request Jun 18, 2020
Test added by envoyproxy#11414 uses RequestHeaderMapImpl constructor made private
in envoyproxy#11546

Signed-off-by: Florin Coras <fcoras@cisco.com>
lizan pushed a commit that referenced this pull request Jun 18, 2020
Test added by #11414 uses RequestHeaderMapImpl constructor made private
in #11546

Signed-off-by: Florin Coras <fcoras@cisco.com>
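
The follow-up commits above exist because this PR made the header map constructors private, so callers go through a `create()` factory (as in the diff where `std::make_unique<Http::ResponseHeaderMapImpl>()` becomes `Http::ResponseHeaderMapImpl::create()`). A minimal sketch of that pattern, using a hypothetical `SketchResponseHeaders` class and not the real Envoy type:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class for illustration; not Envoy's ResponseHeaderMapImpl.
class SketchResponseHeaders {
public:
  // The only way to obtain an instance. std::make_unique cannot be used here
  // because it would need access to the private constructor.
  static std::unique_ptr<SketchResponseHeaders> create() {
    return std::unique_ptr<SketchResponseHeaders>(new SketchResponseHeaders());
  }

  void add(std::string name, std::string value) {
    entries_.emplace_back(std::move(name), std::move(value));
  }

  std::size_t size() const { return entries_.size(); }

private:
  // Private: code that previously constructed the object directly no longer compiles.
  SketchResponseHeaders() = default;

  std::vector<std::pair<std::string, std::string>> entries_;
};
```

Routing all construction through one function gives the class full control over how and where instances are allocated, at the cost of breaking any test or caller that constructed the object directly.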
yashwant121 pushed a commit to yashwant121/envoy that referenced this pull request Jun 24, 2020
Implement custom O(1) header registration for header maps.

- Remove virtual inheritance from header maps.
- Implement variable inline storage for concrete header maps.
- Various TODO/logic cleanups from previous header refactors.
- Demonstrate what custom header registration looks like for the
  CORS filter. More cleanup of this type can be done.

Signed-off-by: Matt Klein <mklein@lyft.com>
Signed-off-by: yashwant121 <yadavyashwant36@gmail.com>
yashwant121 pushed a commit to yashwant121/envoy that referenced this pull request Jun 24, 2020
Test added by envoyproxy#11414 uses RequestHeaderMapImpl constructor made private
in envoyproxy#11546

Signed-off-by: Florin Coras <fcoras@cisco.com>
Signed-off-by: yashwant121 <yadavyashwant36@gmail.com>
songhu pushed a commit to songhu/envoy that referenced this pull request Jun 25, 2020
Implement custom O(1) header registration for header maps.

- Remove virtual inheritance from header maps.
- Implement variable inline storage for concrete header maps.
- Various TODO/logic cleanups from previous header refactors.
- Demonstrate what custom header registration looks like for the
  CORS filter. More cleanup of this type can be done.

Signed-off-by: Matt Klein <mklein@lyft.com>
songhu pushed a commit to songhu/envoy that referenced this pull request Jun 25, 2020
Test added by envoyproxy#11414 uses RequestHeaderMapImpl constructor made private
in envoyproxy#11546

Signed-off-by: Florin Coras <fcoras@cisco.com>
yashwant121 pushed a commit to yashwant121/envoy that referenced this pull request Jul 24, 2020
Implement custom O(1) header registration for header maps.

- Remove virtual inheritance from header maps.
- Implement variable inline storage for concrete header maps.
- Various TODO/logic cleanups from previous header refactors.
- Demonstrate what custom header registration looks like for the
  CORS filter. More cleanup of this type can be done.

Signed-off-by: Matt Klein <mklein@lyft.com>
Signed-off-by: yashwant121 <yadavyashwant36@gmail.com>
yashwant121 pushed a commit to yashwant121/envoy that referenced this pull request Jul 24, 2020
Test added by envoyproxy#11414 uses RequestHeaderMapImpl constructor made private
in envoyproxy#11546

Signed-off-by: Florin Coras <fcoras@cisco.com>
Signed-off-by: yashwant121 <yadavyashwant36@gmail.com>
@wbpcode wbpcode mentioned this pull request Oct 30, 2020
1 task
Development

Successfully merging this pull request may close these issues.

header map: allow extensions to statically register O(1) headers

5 participants