Skip to content

dfp: adding timing information about DNS resolution#18934

Merged
alyssawilk merged 9 commits intoenvoyproxy:mainfrom
alyssawilk:dfp
Nov 17, 2021
Merged

dfp: adding timing information about DNS resolution#18934
alyssawilk merged 9 commits intoenvoyproxy:mainfrom
alyssawilk:dfp

Conversation

@alyssawilk
Copy link
Copy Markdown
Contributor

Risk Level: low
Testing: new unit, integration tests
Docs Changes:
Release Notes:
Part of envoyproxy/envoy-mobile#1520

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk
Copy link
Copy Markdown
Contributor Author

Hey @mattklein123 pinging you for high level API thoughts.
The linked envoy mobile bug includes some other timing characteristics I'm going to add in follow-up PRs, namely time for the usptream connection to connect (start/ end) and time for SSL handshake (start/end)
right now I'm using the filterstate as it seems the simplest way to be extensible but if you have improvements to suggest I'd definitely prefer them earlier in this PR chain rather than later :-)

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@mattklein123
Copy link
Copy Markdown
Member

Hey @mattklein123 pinging you for high level API thoughts.

At a high level I think using filter state is fine for this, but I have 2 questions/comments:

  1. We should think about how to document this. Our filter state documentation is a bit haphazard and I want to make sure that people know about this.
  2. Do we think there will be multiple related items that could be packaged together in a struct within a single key? I.e., instead of having a key per uint, etc. have a key per related data / struct that we could add on to? That might be a bit easier to understand.

@alyssawilk
Copy link
Copy Markdown
Contributor Author

Agree this should have docs, but I wanted to make sure we agreed on where the info was stored, then sort out a doc location :-)

I like the idea of having a general struct. What do we think of a hash map from timing string to ms_since_epoch or ms_since_start? We could even TODO moving the backing store of the existing stream into metrics over to that and maybe even remove the timings APIs? Then we can doc up a well known filter states doc parallel to well known dynamic metadata? I think if we went that route I would lean towards adding it as a first class member to stream into APIs, but I'm good with whatever.

@mattklein123
Copy link
Copy Markdown
Member

I like the idea of having a general struct. What do we think of a hash map from timing string to ms_since_epoch or ms_since_start? We could even TODO moving the backing store of the existing stream into metrics over to that and maybe even remove the timings APIs? Then we can doc up a well known filter states doc parallel to well known dynamic metadata? I think if we went that route I would lean towards adding it as a first class member to stream into APIs, but I'm good with whatever.

Sure this sounds fine to me. The only thing I'm not sure of is whether we should key by string vs. just have some type of timing object that avoids hash lookups?

@alyssawilk
Copy link
Copy Markdown
Contributor Author

if we want it to be extensible, what would you suggest other than hash map? Given we're updating once per request/response I'm not worried about the hash lookup cost.
we already have an UpstreamTimingInfo struct. I could instead move the downstream constructs to a DownstreamTimingInfo struct add extensible functions to both?

@mattklein123
Copy link
Copy Markdown
Member

we already have an UpstreamTimingInfo struct. I could instead move the downstream constructs to a DownstreamTimingInfo struct add extensible functions to both?

Ah OK got it. sure makes sense to me.

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@junr03
Copy link
Copy Markdown
Member

junr03 commented Nov 9, 2021

@alyssawilk do you want @mattklein123 to take a pass, or do you want me to now that you two discussed high level API concerns?

@alyssawilk
Copy link
Copy Markdown
Contributor Author

yeah if you'd be up for a pass that'd be great

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Copy link
Copy Markdown
Member

@junr03 junr03 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this!

}

DownstreamTiming& downstreamTiming() override {
if (!downstream_timing_.has_value()) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have downstream_timing_ as optional? Seems like we could just have the struct if we are constructing on absence?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could go either way here, but I don't think we set these for TCP, and TCP (so redis etc.) have stream info, so I was largely trying to spare them the memory footprint.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah gotcha!

}

Http::FilterHeadersStatus ProxyFilter::decodeHeaders(Http::RequestHeaderMap& headers, bool) {
latchTime(decoder_callbacks_, dnsStart());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about latching the time here vs. latching at other points in this function. For instance, only if the filter gets to loadDnsCacheEntry?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah it's the difference between when we start the lookup and when we start doing "DNS work"
I can go either way - moved.

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@mattklein123
Copy link
Copy Markdown
Member

Merge main and then I can take a look?

/wait

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
mattklein123
mattklein123 previously approved these changes Nov 16, 2021
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with small optional comments.


class DownstreamTiming {
public:
void setValue(absl::string_view key, MonotonicTime value) { timings_[key] = value; }
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we assert that we don't set a value twice? Not sure.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I didn't know if folks using this would want to do updates, so left without.

public:
ProxyFilter(const ProxyFilterConfigSharedPtr& config) : config_(config) {}

static const std::string& dnsStart() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think you can just do static constexpr absl::string_view = ... for these.

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk alyssawilk merged commit 41438c9 into envoyproxy:main Nov 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants