Skip to content

stats: add mechanism to create dynamic tokens without locks or sharing#9355

Merged
jmarantz merged 88 commits intoenvoyproxy:masterfrom
jmarantz:stats-dynamic-escape
Jan 21, 2020
Merged

stats: add mechanism to create dynamic tokens without locks or sharing#9355
jmarantz merged 88 commits intoenvoyproxy:masterfrom
jmarantz:stats-dynamic-escape

Conversation

@jmarantz
Copy link
Contributor

@jmarantz jmarantz commented Dec 15, 2019

Description:
Solves the issue of lock contention for stat-name creation for names not known at compile-time.

This could be broken up into as many 3 PRs, but I think it's reviewable as is. If we did break it up we could have:

  • one new PR to just add the StatNameDynamicStorage and StatNameDynamicPool abstractions
  • another PR to change all the uses of StatNameSet::getDynamic() to StatNameDynamicStorage
  • a final PR to remove the StatNameSet::getDynamic capabilities, as well as the recent_lookups_
    infrastructure from StatNameSet

In any case, a follow-up PR could remove the dynamics from ip_tagging_filter.cc which appears not to need them; I think all the possible tags are known at config-time. This is not included in this PR.

Risk Level: medium -- lots of bytes being messed with
Testing: //test/...
Docs Changes: n/a
Release Notes: n/a
Fixes: #7003

Signed-off-by: Joshua Marantz <jmarantz@google.com>
…, or on the length of a fake-symbol table stat-name.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…es or symbols. it's bytes.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…ass.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…odel.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
… into stats-fuzzer-not-too-long

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Copy link
Contributor Author

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for diving in. I know this is pretty hairy low-level stuff. Hopefully we've got the testing & fizzing covered.

I will follow up with more detail about the representation in stats.md.


class StatNameDeathTest : public StatNameTest {};

#if 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch; forgot I had deferred reviving this test. Done.


// When storing Symbol arrays, we disallow Symbol 0, which is the only Symbol
// that will decode into uint8_t array starting (and ending) with {0}. Thus we
// can use a leading 0 in in the first byte to indicate that what follows is
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


// payload_bytes is the total number of bytes needed to represent the characters
// in name, plus their encoded size, plus the literal indicator.
uint64_t payload_bytes = SymbolTableImpl::Encoding::totalSizeBytes(name.size()) + 1;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// So the layout is
// [ length-of-whole-StatName, LiteralStringIndicator, length-of-name, name ]
// So we need to figure out how many bytes we need to represent length-of-name and
// bytes
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

time_source_.monotonicTime() - start_decode_);

size_t group_index = DynamoStats::groupIndex(status);
Stats::StatNameDynamicPool dynamic(stats_->symbolTable());
Copy link
Contributor Author

@jmarantz jmarantz Jan 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the perf implication is in the noise when there is no contention. Previously this code would construct some strings and look them up in a map, re-using a StatName on a hit. Now we construct the same strings and create an encoding of them unconditionally, skipping the map lookup and mutex acquisition. I would guess that the successful map lookup, which has to touch all the bytes of the string during the hash and again in the compare, is roughly on par with the cost of doing the unconditional encoding.

In the new scenario we'll have to do a memory allocation even for a symbol we've seen before, but in most of the cases we'll have to do a StrCat first anyway.

If you compare this to a scenario where we don't use symbol tables at all, there's some overhead because we need to re-encode the string into a block of memory that's roughly equivalent in size. But we'll still benefit when doing join() because the non-dynamic fragments (e.g. scope) that are being joined with will likely be 4 bytes or less.

Probably I should try to borrow or steal some of the infrastructure in common/http/codes.cc though so we don't have to make dynamic entries for upstream_req_total_XX and upstream_rq_time_XX, though I don't know if the XX codes are constrained the same way http codes are.

I'll try to summarize this for stats.md.

Comment on lines +142 to +143
if (downstream_cluster_.empty()) {
downstream_cluster_storage_.reset();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed.

tag_data.emplace_back(ip_tag.ip_tag_name(), cidr_set);
}
trie_ = std::make_unique<Network::LcTrie::LcTrie<std::string>>(tag_data);
// TODO(jmarantz): save stat-names for each remote address as stat_name_set builtins.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure; I used 'address' here as trie is constructed from:

  std::vector<std::string> tags =
      config_->trie().getData(callbacks_->streamInfo().downstreamRemoteAddress());

but happy to use whatever terminology you think makes the most sense. Changed.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks generally LGTM but will wait for the final updates to statsd.md for further review. Thank you!

/wait

as SSL ciphers or Redis commands.

### Dynamic stat tokens
### Dynnamic stat tokens
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done; also added a diagram showing memory layouts of a few scenarios, and a table of all the classes related to symbol tables.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@mattklein123
Copy link
Member

@tonya11en this is the PR I mentioned to you. Can you help review the symbol table changes? Thank you.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

🔨 rebuilding ci/circleci: coverage (failed build)

🐱

Caused by: a #9355 (comment) was created by @jmarantz.

see: more, trace.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the added docs and diagram. It's super clear to me now. This LGTM so I'm fine with merging, but I would still like @tonya11en to post review when he has time. Thank you!

names.reserve(6); // 2 entries are added by chargeQueryStats().
names.push_back(mongo_stats_->collection_);
names.push_back(mongo_stats_->getDynamic(active_query->query_info_.collection()));
Stats::StatNameDynamicPool dynamic(mongo_stats_->symbolTable());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jplevyak PTAL at this change even if you won't be able to review the entire PR. Note taht StatNameSet::getDynamic() is being removed in this PR in preference to this new model of making StatNames from dynamically changing strings that inline all the bytes and do not share or take locks (inspired by our discussion last month).

I imagine some of these sorts of changes will need to happen in the wasm fork.

Copy link
Member

@tonya11en tonya11en left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually- editing that last comment. I still need to grok what's going on in the symbol table files.

I'm spread kind of thin this week, so please don't block on me. I'm still coming up to speed on stats code.

are scenarios where the names are newly discovered from data in requests. To
avoid taking locks in this case, tokens can be formed dynamically using
`StatNameDynamicStorage` or `StatNameDynamicPool`. In this case we lose
substring sharing but we avoid taking locks. Dynamically generaeted tokens can
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: spelling (generaeted)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks will do a quick follow-up for that so I can merge this in.

@jmarantz jmarantz merged commit 7c78372 into envoyproxy:master Jan 21, 2020
@jmarantz jmarantz deleted the stats-dynamic-escape branch January 21, 2020 13:35
@jmarantz jmarantz mentioned this pull request Jan 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

no stalebot Disables stalebot from closing an issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

stats: need a way to lookup stats from strings in hot-path without taking a lock.

4 participants