stats: Guard regex lookups with substring searches where possible#2693
stats: Guard regex lookups with substring searches where possible#2693htuch merged 9 commits intoenvoyproxy:masterfrom
Conversation
Signed-off-by: Joshua Marantz <jmarantz@google.com>
… regex-specify-substr Signed-off-by: Joshua Marantz <jmarantz@google.com>
Followed instructions in https://stackoverflow.com/questions/1657017/how-to-squash-all-git-commits-into-one to try to resolve DCO issue due to envoyproxy@1fac332 as there are no comments in the PR yet I think this is OK. Signed-off-by: Joshua Marantz <jmarantz@google.com>
93a86b0 to
fd2b507
Compare
|
ready for initial pass, ptal. Maybe @mrice32 ? |
mrice32
left a comment
There was a problem hiding this comment.
Looks great! Thanks! Just a few small comments.
|
|
||
| // cluster.[<cluster_name>.]ssl.ciphers.(<cipher>) | ||
| addRegex(SSL_CIPHER_SUITE, "^cluster(?=\\.).*?\\.ssl\\.ciphers(\\.(.*?))$"); | ||
| addRegex(SSL_CIPHER_SUITE, "^cluster(?=\\.).*?\\.ssl\\.ciphers(\\.(.*?))$", "ssl.ciphers"); |
There was a problem hiding this comment.
Why not ".ssl.ciphers." like in the other examples?
There was a problem hiding this comment.
That one was an omission, because I had some trouble with ".http." below and I wound up backing off the inclusion of the "." in some places where it was OK.
However I managed to sort out my issues with ".http." by exploiting the fact that I can now have two separate regexes for the same tag.
|
|
||
| // cluster.[<route_target_cluster>.]grpc.(<grpc_service>.)* | ||
| addRegex(GRPC_BRIDGE_SERVICE, "^cluster(?=\\.).*?\\.grpc\\.((.*?)\\.)"); | ||
| addRegex(GRPC_BRIDGE_SERVICE, "^cluster(?=\\.).*?\\.grpc\\.((.*?)\\.)", "grpc"); |
There was a problem hiding this comment.
Same as above, why not ".grpc."?
source/common/stats/stats_impl.cc
Outdated
|
|
||
| PERF_OPERATION(perf); | ||
|
|
||
| if (!substr_.empty() && stat_name.find(substr_) == std::string::npos) { |
There was a problem hiding this comment.
Can you add a simple test case to isolate this new portion of the logic and ensure it works as expected. The code is already covered by the regex test cases, but adding a separate test will isolate the functionality from the tests that check the regexes, themselves. See tests in https://github.com/envoyproxy/envoy/blob/master/test/integration/stats_integration_test.cc and https://github.com/envoyproxy/envoy/blob/master/test/common/stats/stats_impl_test.cc#L88 for other tests that test specific parts of the extraction process.
There was a problem hiding this comment.
done. Broke out substrMismatch as a separate method to make it easier to test.
…ecise substring pruning. Signed-off-by: Joshua Marantz <jmarantz@google.com>
…gal & useful now. Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
jmarantz
left a comment
There was a problem hiding this comment.
By breaking out the HTTP_CONN_MANAGER tag into two regexes I got the init speed to 2.8 seconds.
There is more on the table here for further PRs.
|
|
||
| // cluster.[<cluster_name>.]ssl.ciphers.(<cipher>) | ||
| addRegex(SSL_CIPHER_SUITE, "^cluster(?=\\.).*?\\.ssl\\.ciphers(\\.(.*?))$"); | ||
| addRegex(SSL_CIPHER_SUITE, "^cluster(?=\\.).*?\\.ssl\\.ciphers(\\.(.*?))$", "ssl.ciphers"); |
There was a problem hiding this comment.
That one was an omission, because I had some trouble with ".http." below and I wound up backing off the inclusion of the "." in some places where it was OK.
However I managed to sort out my issues with ".http." by exploiting the fact that I can now have two separate regexes for the same tag.
|
|
||
| // cluster.[<route_target_cluster>.]grpc.(<grpc_service>.)* | ||
| addRegex(GRPC_BRIDGE_SERVICE, "^cluster(?=\\.).*?\\.grpc\\.((.*?)\\.)"); | ||
| addRegex(GRPC_BRIDGE_SERVICE, "^cluster(?=\\.).*?\\.grpc\\.((.*?)\\.)", "grpc"); |
source/common/stats/stats_impl.cc
Outdated
|
|
||
| PERF_OPERATION(perf); | ||
|
|
||
| if (!substr_.empty() && stat_name.find(substr_) == std::string::npos) { |
There was a problem hiding this comment.
done. Broke out substrMismatch as a separate method to make it easier to test.
…p' anchor to a different regex. Signed-off-by: Joshua Marantz <jmarantz@google.com>
| * @return TagExtractorPtr newly constructed TagExtractor. | ||
| */ | ||
| static TagExtractorPtr createTagExtractor(const std::string& name, const std::string& regex); | ||
| static TagExtractorPtr createTagExtractor(const std::string& name, const std::string& regex, |
There was a problem hiding this comment.
Please add docs here and elsewhere that explain the prefix and what it does.
There was a problem hiding this comment.
Done; thanks for the review, and good catch!
Signed-off-by: Joshua Marantz <jmarantz@google.com>
source/common/stats/stats_impl.h
Outdated
| /** | ||
| * @param stat_name The stat name | ||
| * @return bool indicates whether tag extraction should be skipped for this stat_name due | ||
| * to a subdstring mismatch. |
Removing Admin from release builds by default Risk Level: medium Testing: n/a Docs Changes: n/a Release Notes: inline Signed-off-by: Alyssa Wilk <alyssar@chromium.org> Signed-off-by: alyssawilk <alyssar@google.com> Co-authored-by: JP Simard <jp@jpsim.com> Signed-off-by: JP Simard <jp@jpsim.com>
Removing Admin from release builds by default Risk Level: medium Testing: n/a Docs Changes: n/a Release Notes: inline Signed-off-by: Alyssa Wilk <alyssar@chromium.org> Signed-off-by: alyssawilk <alyssar@google.com> Co-authored-by: JP Simard <jp@jpsim.com> Signed-off-by: JP Simard <jp@jpsim.com>
Annotate addRegex calls with expected substrings. It is significantly faster to search a candidate stat for these substrings prior to running regex analysis. This change speeds up 10k case from 8 seconds to 3.5 sec.
Before:
After:
Description:
Allows explicit specification of a required substring, which can be quickly scanned for in an input string before applying regexes.
Risk Level: Medium - mis-specifying these can lead to broken tag processing
Testing:
//test/...
Release Notes: N/A