common: support parsing cookies into a map#17811
common: support parsing cookies into a map#17811mattklein123 merged 11 commits intoenvoyproxy:mainfrom theshubhamp:theshubhamp/cookie-map-refactor
Conversation
Added a method to HTTP Utilities to parse out cookies into a map. Changed oauth2 extension to lookups from a cookie map to reduce redundant scanning of cookies. Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
|
Hi @theshubhamp, welcome and thank you for your contribution. We will try to review your Pull Request as quickly as possible. In the meantime, please take a look at the contribution guidelines if you have not done so already. |
| expires_ = Http::Utility::parseCookieValue(headers, "OauthExpires"); | ||
| token_ = Http::Utility::parseCookieValue(headers, "BearerToken"); | ||
| hmac_ = Http::Utility::parseCookieValue(headers, "OauthHMAC"); | ||
| const auto& cookies = Http::Utility::parseCookies(headers); |
There was a problem hiding this comment.
This implementation parses all cookies into a map even when we just need a subset.
Should parseCookies(...) be enhanced to accept a filter to filter down on keys at the time of parsing ?
There was a problem hiding this comment.
Not sure if possible performance gain can balance increased code complexity. I'd rather keep it as it is for now.
There was a problem hiding this comment.
Yeah that instantiates many small strings, I think this should pass a lambda like the following to forEachCookies:
if (k == "OauthExpires") {
expires_ = v;
}
...
There was a problem hiding this comment.
if (k == "OauthExpires") { expires_ = v; } ...
This may require consumers of forEachCookies to worry about duplicate cookie semantics discussed on this thread #17811 (comment)
There was a problem hiding this comment.
How about this variant of parseCookies that accepts a key filter lambda ? It can support both static checks like these ^ OR something more dynamic like checking set / map membership.
absl::flat_hash_map<std::string, std::string>
Utility::parseCookies(const RequestHeaderMap& headers, const std::function<bool(absl::string_view)>& key_filter) {
absl::flat_hash_map<std::string, std::string> cookies;
forEachCookie(headers, Http::Headers::get().Cookie,
[&cookies, &key_filter](absl::string_view k, absl::string_view v) -> bool {
if (key_filter(k)) {
cookies.emplace(k, v);
}
// continue iterating until all cookies are processed.
return true;
});
return cookies;
}
An overload can be added that that defaults key_filter to a lambda that always returns true to provide a parse all cookies fallback.
There was a problem hiding this comment.
Yes, the latter sounds good to me if we care about uniform handling of duplicate cookies. Checking for set membership may be even faster than the static string comparisons.
There was a problem hiding this comment.
Added. Its being used like this in the oauth2 filter now:
...
const auto& cookies = Http::Utility::parseCookies(headers, [](absl::string_view key) -> bool {
return key == "OauthExpires" || key == "BearerToken" || key == "OauthHMAC";
});
...
Kept static comparisons inside the predicate thinking that serial = comparisons should be (a little ?) faster than using a set which requires hash computation before every check (if I am not mistaken)
source/common/http/utility.cc
Outdated
| // If the key matches, parse the value from the rest of the cookie string. | ||
| if (k == key) { | ||
| void forEachCookie(const HeaderMap& headers, const LowerCaseString& cookie_header, | ||
| const std::function<bool (const absl::string_view&, const absl::string_view&)> cookie_consumer) { |
There was a problem hiding this comment.
absl::string_view should be passed by value as the spec suggests. No need to add const and &.
There was a problem hiding this comment.
Thanks for pointing this out! I have dropped const and & as recommended in the spec
source/common/http/utility.cc
Outdated
| return value; | ||
| } | ||
|
|
||
| std::map<std::string, std::string> Utility::parseCookies(const RequestHeaderMap& headers) { |
There was a problem hiding this comment.
Better use absl::flat_hash_map here.
There was a problem hiding this comment.
Switched to using absl::flat_hash_map
For my curiosity: Why should it be preferred over std::map ? I tried looking at the abseil docs and other sources but could not get a simple answer!
There was a problem hiding this comment.
Algorithmic complexity of inserts and lookups is logarithmic for std::map. For hashes it's basically O(1).
| expires_ = Http::Utility::parseCookieValue(headers, "OauthExpires"); | ||
| token_ = Http::Utility::parseCookieValue(headers, "BearerToken"); | ||
| hmac_ = Http::Utility::parseCookieValue(headers, "OauthHMAC"); | ||
| const auto& cookies = Http::Utility::parseCookies(headers); |
There was a problem hiding this comment.
Not sure if possible performance gain can balance increased code complexity. I'd rather keep it as it is for now.
| const auto expires_it = cookies.find("OauthExpires"); | ||
| expires_ = expires_it != cookies.end() ? expires_it->second : EMPTY_STRING; |
There was a problem hiding this comment.
Replace it with a function defined in the anonymous namespace and call the function also for token_ and hmac_.
There was a problem hiding this comment.
Done, added std::string findValue(...) to unpack the hash_map as we need it here.
Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
Following best practices mentioned on https://abseil.io/docs/cpp/guides/strings#string_view: string_view objects are very lightweight, so you should always pass them by value within your methods and functions; don’t pass a const absl::string_view &. (Passing absl::string_view instead of const absl::string_view & has the same algorithmic complexity, but because of register allocation and parameter passing rules, it is generally faster to pass by value in this case.) Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
source/common/http/utility.cc
Outdated
| return std::string{result}; | ||
| // Iterate over each cookie & return if its value is not empty. | ||
| forEachCookie(headers, cookie, [&key, &value](absl::string_view k, absl::string_view v) -> bool { | ||
| if (key == k && !v.empty()) { |
There was a problem hiding this comment.
I guess this new code implements different behavior and theoretically may break certain setups. Consider the case of
Cookie: a=; b=1; a=2
Cookie: a=3
The old parseCookie(headers, "a", "cookie") would return 3. The new code returns 2 if I'm not mistaken.
I think it makes sense to add a test for this case and make sure it passes for both versions.
There was a problem hiding this comment.
Ah yes, this new version is not equivalent to the old one. I'll try to make both the variants (single-cookie and map) compatible with the old behaviour.
There was a problem hiding this comment.
On a related note it appears that even the old behaviours haven't been consistent over time.
For example if we use this new test as a reference:
TEST(HttpUtility, TestParseCookieDuplicates) {
TestRequestHeaderMapImpl headers{
{"someheader", "10.0.0.1"},
{"cookie", "a=; b=1; a=2"},
{"cookie", "a=3; b=2"}};
EXPECT_EQ(Utility::parseCookieValue(headers, "a"), "3");
EXPECT_EQ(Utility::parseCookieValue(headers, "b"), "1");
}
It fails before #17560 landed (changed header iter. from reverse to natural order)
# git status
HEAD detached at b57827c20
# bazel test //test/common/http:utility_test
...
NFO: 12 processes: 1 internal, 11 darwin-sandbox.
INFO: Build completed, 1 test FAILED, 12 total actions
//test/common/http:utility_test FAILED in 2.5s
# less /private/var/tmp/_bazel_shubhamp/306aac4fa57fcfade8ca38e576399a40/execroot/envoy/bazel-out/darwin-fastbuild/testlogs/test/common/http/utility_test/test.log
...
[ RUN ] HttpUtility.TestParseCookieDuplicates
test/common/http/utility_test.cc:559: Failure
Expected equality of these values:
Utility::parseCookieValue(headers, "b")
Which is: "2"
"1"
Stack trace:
0x10b50f3aa: Envoy::Http::HttpUtility_TestParseCookieDuplicates_Test::TestBody()
0x1148c1fc4: testing::internal::HandleSehExceptionsInMethodIfSupported<>()
0x114897a2b: testing::internal::HandleExceptionsInMethodIfSupported<>()
0x114897963: testing::Test::Run()
0x114898977: testing::TestInfo::Run()
... Google Test internal frames ...
But passes on main:
# less less /private/var/tmp/_bazel_shubhamp/306aac4fa57fcfade8ca38e576399a40/execroot/envoy/bazel-out/darwin-fastbuild/testlogs/test/common/http/utility_test/test.log
...
[ RUN ] HttpUtility.TestParseCookieDuplicates
[ OK ] HttpUtility.TestParseCookieDuplicates (0 ms)
There was a problem hiding this comment.
Behaviour on main (and the older version before it got changed) is a little inconsistent because it:
- picks the first matching cookie from a single
cookieheader (i.e. resolvesa's value to an empty string if the header value looks like:"a=; b=1; a=2") - but exhaustively searches remaining cookie headers until a non empty value is found.
And this ^ makes parsing out duplicate cookies spilt between multiple cookie headers fun.
There was a problem hiding this comment.
Any suggestions on how to proceed here ?
- Keep the behaviour from
mainas-is - Pick the
first cookie valueORfirst non-empty cookie value - Pick the
last cookie valueORlast non-empty cookie value
I'm inclined towards approaches 2 or 3 because it keeps the parsing simple in the long run - but this can break some setups.
RFC 6265 - HTTP State Management Mechanism itself is pretty ambiguous about duplicate cookies and ordering:
Although cookies are serialized linearly in the Cookie header,
servers SHOULD NOT rely upon the serialization order. In particular,
if the Cookie header contains two cookies with the same name (e.g.,
that were set with different Path or Domain attributes), servers
SHOULD NOT rely upon the order in which these cookies appear in the
header.
There was a problem hiding this comment.
Hm... That's a good catch! If the spec explicitly states that the order is unimportant then just pick the first value irrespective of its emptiness for the sake of perf.
A second reviewer may give a second opinion on it.
There was a problem hiding this comment.
+1 to take the first value for perf.
There was a problem hiding this comment.
Done, switched to picking up the first occurrence of a cookie!
Added tests so that this behavioural quirk has coverage.
|
/wait |
source/common/http/utility.cc
Outdated
| return std::string{result}; | ||
| // Iterate over each cookie & return if its value is not empty. | ||
| forEachCookie(headers, cookie, [&key, &value](absl::string_view k, absl::string_view v) -> bool { | ||
| if (key == k && !v.empty()) { |
There was a problem hiding this comment.
+1 to take the first value for perf.
source/common/http/utility.cc
Outdated
| return v; | ||
|
|
||
| bool continue_iteration = cookie_consumer(k, v); | ||
| if (!continue_iteration) { |
There was a problem hiding this comment.
| if (!continue_iteration) { | |
| if (!cookie_consumer(k, v)) { |
then you don't need continue_iteration.
There was a problem hiding this comment.
Done, moved the invocation inside if condition itself
| expires_ = Http::Utility::parseCookieValue(headers, "OauthExpires"); | ||
| token_ = Http::Utility::parseCookieValue(headers, "BearerToken"); | ||
| hmac_ = Http::Utility::parseCookieValue(headers, "OauthHMAC"); | ||
| const auto& cookies = Http::Utility::parseCookies(headers); |
There was a problem hiding this comment.
Yeah that instantiates many small strings, I think this should pass a lambda like the following to forEachCookies:
if (k == "OauthExpires") {
expires_ = v;
}
...
Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
…in headers + Tests for coverage Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
…2 ext Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
Added a method to HTTP Utilities to parse out cookies into a map.
Changed oauth2 extension to lookups from a cookie map to reduce
redundant scanning of cookies.
Requested as part of PR raised for
Risk Level: Low
Testing: Existing tests pass & added a new test for coverage.
Docs Changes: N/A
Release Notes: N/A
Requested as part of the PR #17721
Related to #17424
Signed-off-by: Shubham Patil theshubhamp@gmail.com