[admin] config_dump scaffolding and routes endpoint#2771
[admin] config_dump scaffolding and routes endpoint#2771htuch merged 18 commits intoenvoyproxy:masterfrom jsedgwick:structured_admin
Conversation
|
@junr03 can you take first pass? Thanks. |
mattklein123
left a comment
There was a problem hiding this comment.
Cool! Few random comments from a quick skim. I did not do a full review. Thanks!
include/envoy/router/rds.h
Outdated
There was a problem hiding this comment.
Let's just kill /routes as part of this change. I don't think we need to go through a deprecation cycle for it.
There was a problem hiding this comment.
OK. Wasn't intending for a deprecation cycle, rather immediate removal after this change. I do not know what the feeling is around here on PR size from a reviewer's; I was trying to keep it small-ish
The point was also not to delete the old impl before making sure reviewers don't object to the meat of the PR. But I can do so now, no problem.
There was a problem hiding this comment.
There was a problem hiding this comment.
s/= 0/PURE (same elsewhere)
There was a problem hiding this comment.
Sure. Curious about that one though, why invent our own keyword? It doesn't save any keystrokes. I understand that pure is an adjective folks might be familiar with, and they absolutely should have made the syntax be = pure;, but it is supposed to stand alongside = default; and = delete;. This way it's more easily confused with const-qualifiers and ref-qualifiers. Syntactically, they're not interchangeable.
source/common/router/rds_impl.cc
Outdated
There was a problem hiding this comment.
Someone stealing the "routes" key for the dump map. See my next comment below.
There was a problem hiding this comment.
In general for error handling philosophy I would take a look at: https://github.com/envoyproxy/envoy/blob/master/STYLE.md#error-handling
For this one, I would probably just make this a RELEASE_ASSERT for now to make sure it doesn't get registered over. This is an important part of server functionality so we should just fail hard if someone bungles it with a plugin. We can make the error handling nicer later if this becomes a real issue.
source/server/http/admin.cc
Outdated
There was a problem hiding this comment.
don't catch exceptions here. Just crash.
There was a problem hiding this comment.
By crash, do you mean RELEASE_ASSERTING or letting the exception escape? Presumably the latter will also crash us, just clarifying your request.
I think you mentioned the desire to be able to dynamically hook in plugins and so on, so I was coding defensively. Is the stance going to be that we won't try to recover from any exception thrown by someone's plugin? It seems that segfaults and such aside, we don't want the rest of envoy to go down when you curl an admin endpoint (i.e. at some unpredictable time after start) because of something outside of this repo's control. So swallowing/logging would be fine here from that point of view because for this code the exception is recoverable.
Just asking about the philosophy here, can definitely change this and just crash.
There was a problem hiding this comment.
Per above take a look at the philosophy document. Our general philosophy is fail hard when possible, make it nicer later if needed. In this case I would just let the exceptions propagate which will crash the server hard (for now).
source/server/http/admin.cc
Outdated
There was a problem hiding this comment.
See above, except even softer case since they're just failing to follow an API, not letting an exception escape. I am just picturing everything humming along fine and then someone is firefighting and consuming admin and crashes all their instances because it turns out their logging function didn't return the right thing.
There was a problem hiding this comment.
Yup that's a bug and should be tested and fixed. I would fail hard here for now. RELEASE_ASSERT if you like over normal ASSERT.
source/server/http/admin.h
Outdated
There was a problem hiding this comment.
This TODO isn't actionable. Either make it more detailed or delete.
There was a problem hiding this comment.
I'll do the refactor in this PR since it's actionable for me and most of this code is changing anyways.
There was a problem hiding this comment.
OK. Again curious about crashing philosophy. Up above, you want to crash, but here, in code we totally own, you don't? If this invariant is violated, ConfigTracker is broken and doesn't do its job.
There was a problem hiding this comment.
See philosophy. For normal programming errors that should never happen always use ASSERT.
There was a problem hiding this comment.
Still need to switch this to ASSERT
There was a problem hiding this comment.
nit: generally we do typedef over using
There was a problem hiding this comment.
Sure, will change.
As you are probably aware, they're not always exactly the same, and if this gets genericized I might go back to using for some stuff it interacts better with templates. IMO it's also nicer to unify these aliases with using aliases for namespace members but whatever.
source/common/router/rds_impl.cc
Outdated
There was a problem hiding this comment.
Sorry I don't follow why this is necessary. Please remove the word "jank" and make the TODO more clear on why the thing is being done and how it will be fixed in the future. I haven't looked through in detail but it seems odd to me that we need to do this lazy clearing at all.
(Better, just fix it now if a better ownership model would make the code cleaner. This is not a very difficult change).
There was a problem hiding this comment.
I will just change it now. Only had to leave it for the old "routes/" implementation which I'll remove in this PR per your request above.
|
Thanks for the comments Matt. Tomorrow, I will remove the current "/routes" implementation and do other refactors mentioned above. |
…nt (#532) See envoyproxy/envoy#2771 for context I will add documentation once the interfaces and placement of these things is settled. Current location (new admin package) was agreed upon as a good start with @htuch but i don't feel strongly about that or naming. Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
|
oh wtf |
|
Per new release note policy please add a release note with any applicable feature docs to https://github.com/envoyproxy/data-plane-api/blob/master/docs/root/intro/version_history.rst. Thank you! |
|
@jsedgwick LMK when this is ready to review again (unclear on current state). No rush. Thanks. |
|
Yup it's ready.
On Sat, Mar 17, 2018 at 2:59 PM Matt Klein ***@***.***> wrote:
@jsedgwick <https://github.com/jsedgwick> LMK when this is ready to
review again (unclear on current state). No rush. Thanks.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2771 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAc4Ih3wUaZKVVs7llHTTw-i_-EMmZm2ks5tfVz-gaJpZM4Sk2kM>
.
--
*James Sedgwick*
Software Engineer, Networking Infrastructure
973.738.0737 <+19737380737>
[image: Lyft] <http://www.lyft.com/>
|
|
@jsedgwick tests are failing and there are merge conflicts. Can you fix? |
|
@mattklein123 rebased and fixed builds, should be good to go |
mattklein123
left a comment
There was a problem hiding this comment.
Very nice. Some small comments otherwise looks great.
include/envoy/server/admin.h
Outdated
include/envoy/router/rds.h
Outdated
There was a problem hiding this comment.
I think you can return a const envoy::api::v2::RouteConfiguration& here. You can use default_instance() in places where you need a reference to a default configuration.
There was a problem hiding this comment.
const envoy::api::v2::RouteConfiguration&. Also can we add some API docs for this new function?
There was a problem hiding this comment.
Will add docs.
Re: pass by const& vs value, this isn't explicitly addressed in our/G's style guide? I'm sure you already know the reasons I like it by value but I wouldn't fight about it save perf-sensitive paths, which this obviously isn't.
There was a problem hiding this comment.
We should flesh it out. In general the existing code follows roughly the following guidelines.
"For small objects, pass by value. For large objects, prefer passing by const-reference. If it's clear that move semantics will yield better performance, pass by rvalue reference with an explicit && operator. The reason for the explicit rvalue reference indication is that the call site then must either pass a temporary or use std::move() which makes the intention more clear." (The latter part is already covered when discussing unique_ptr, etc.).
There was a problem hiding this comment.
I, for one, try to always pass by value when possible (and move() to take advantage) aside from primitives. This is reflexive enough for me that I want to get ahead of whether you/other maintainers care enough about this that I have to change that.
So, to formalize my take:
- For cheap-copy objects, e.g. primitives and some POD, pass by value
- If the function needs its own copy of the object for any reason, pass by value
- Otherwise, pass by const reference
The additional restrictions you've laid out above, while fine for existing code, don't really make sense to me going forward. They complicate things for the caller instead of giving the caller the option of whether they care about perf. Passing by const& precludes taking advantage of move semantics, and passing by && precludes anything but rvalues. Passing by value gives you both.
It's obviously possible to have the internet's millionth long discussion about this but I also don't see it as something where consistency is critical, like with many extant conventions. In any event let's take it off this PR if necessary.
There was a problem hiding this comment.
This is reflexive enough for me that I want to get ahead of whether you/other maintainers care enough about this that I have to change that.
Yes, we are care about consistency.
It's obviously possible to have the internet's millionth long discussion about this but I also don't see it as something where consistency is critical, like with many extant conventions. In any event let's take it off this PR if necessary.
The expectation is that in general consistency is followed unless there is a good reason not to, which in this case is a matter of opinion. Happy to discuss further offline but at this point I'm going to consider this thread closed and assume that you will be consistent in this area. Thanks!
There was a problem hiding this comment.
nit: /** (same elsewhere if applicable). We should probably just check this with the linter at some point.
There was a problem hiding this comment.
Yeah, oops. Don't we just use vanilla clang-format? Or would we have control over this somewhere.
There was a problem hiding this comment.
/ worth filing an issue? By my count there are roughly 15 occurrences in trunk atm
There was a problem hiding this comment.
Sure, it's not a big deal, just noticed it.
source/common/router/rds_impl.cc
Outdated
There was a problem hiding this comment.
Can you add a small comment here on why RELEASE_ASSERT (basically just hard checking that someone doesn't add a random "routes" entry)
source/common/router/rds_impl.cc
Outdated
There was a problem hiding this comment.
I'm still a little bit unclear on why the dance in here is required. Can you add some comments?
There was a problem hiding this comment.
The code is paranoid about static route objects going away (they are shared_ptrs that we don't own, claims of static permanence or not). So this cleans out expired weak_ptrs simultaneously with accumulating the shared_ptrs for the caller. I will add comments.
source/common/router/rds_impl.cc
Outdated
There was a problem hiding this comment.
Surprised it doesn't just cast correctly when you return config_dump? Oh well.
There was a problem hiding this comment.
Yeah this was a wtf moment for me, this is what broke the Mac build the first time around. I tried so hard to resist the temptation, but I just went down a deep rabbit hole on why it's not compiling and have had no luck pinning down an answer yet. It really seems like clang or libc++ is misbehaving though.
There was a problem hiding this comment.
Don't worry about it. Was just curious.
There was a problem hiding this comment.
Still need to switch this to ASSERT
There was a problem hiding this comment.
nit: period end of sentence. Same elsewhere.
htuch
left a comment
There was a problem hiding this comment.
overall, I like where this is heading, thanks for taking this on. I still would like to ultimately have the final dump be closer to the ADS delivered config (i.e. replayable by a management server), but this gets us most of the way there, we can iterate on the final dump format later once we have this merged.
There was a problem hiding this comment.
I would actually drop the above two points, they confused me more as a reader than I think they helped. I would replace it with just "This has the advantage of simplifying callback lifetime and ownership management."
There was a problem hiding this comment.
I think you can also drop the last sentence here, most Envoy developers will know this to be the case.
There was a problem hiding this comment.
Envoy style is just {} instead of default. I know there are valid arguments for default, but consistency is a powerful argument as well. If you feel strongly about doing default, maybe change it everywhere in a distinct PR.
There was a problem hiding this comment.
Nah, don't feel strongly, this isn't one of the rare cases where it makes a difference. It should be in STYLE.md though, I'll make a note to add it.
There was a problem hiding this comment.
Nit: prefer for consistency to stick to Envoy typedef std:::unique_ptr<EntryOwner> EntryOwnerPtr. Consistency is a strong argument, if we want to change how this is done in Envoy, best to do it everywhere in a separate PR.
There was a problem hiding this comment.
as with default above, sure & I will add a discouragement of using in STYLE.md
source/common/router/rds_impl.cc
Outdated
There was a problem hiding this comment.
Do we just keep accumulating here if we don't do a GC sweep in getStaticRouteConfigProviders() ?
There was a problem hiding this comment.
Yup. I didn't stress too much because A. these are static and there should be a fixed number and B. I was planning on separately refactoring so the weak_ptr backflips are unnecessary
There was a problem hiding this comment.
Yeah agreed. Thanks for the comments in the other function, much more clear what is going on here. This seems fine to me given the use case.
source/common/router/rds_impl.cc
Outdated
There was a problem hiding this comment.
This should probably be MergeFrom.
There was a problem hiding this comment.
Can you clarify? I can't tell how to use MergeFrom here from the PB docs.
There was a problem hiding this comment.
Something like static_configs->Add()->MergeFrom(provider->configAsProto()) I think is how it is typically done IIRC.
source/common/router/rds_impl.cc
Outdated
There was a problem hiding this comment.
This should probably be MergeFrom.
htuch
left a comment
There was a problem hiding this comment.
LGTM modulo comments and build fixes.
source/common/router/rds_impl.cc
Outdated
There was a problem hiding this comment.
I think this falls into the ASSERT category, it's an internal programming invariant rather than a statement about the runtime system. RELEASE_ASSERT is largely used for unrecoverable environmental issues such as OOM or some unexpected syscall behavior that should "never happen".
There was a problem hiding this comment.
@htuch FWIW I recommended using RELEASE_ASSERT here because it's theoretically possible for a plugin to register a config tracker entry with the name "routes" which would be pretty confusing. This would at least crash in all builds. I don't feel very strongly either way.
There was a problem hiding this comment.
Fair enough, I would buy the argument that a plugin is an external environment factor that we can't control for as an internal invariant.
source/server/http/admin.cc
Outdated
There was a problem hiding this comment.
Sorry I didn't notice this till just now, but per the docs this should only respond to a POST. Can we fix that? Since we are probably going to have more POST methods in the future (cc @jmarantz) it might be worth it to build this into the registration system somehow, but up to you.
There was a problem hiding this comment.
+1 on making mutations respond only to POSTs. But is this method a mutation? It doesn't look like it, based on that it's a const method and it looks like it's not mutating config_tracker_. Which docs are you talking about? I couldn't find a reference to POST in the linked data-frame API PR.
Note in admin.h:
struct UrlHandler {
const std::string prefix_;
const std::string help_text_;
const HandlerCb handler_;
const bool removable_;
const bool mutates_server_state_;
}
This last 'false' in the initializer for /config_dump should not mutate server state, and therefore should be fine as a GET. That bool can be used to check for POST. Currently it is used to determine whether to render the HTML table entry as a simple a-tag link or as a button with a form that POSTs.
There was a problem hiding this comment.
Oops sorry I was crossing streams with #2837. This comment applies there. :)
source/server/http/admin.cc
Outdated
There was a problem hiding this comment.
+1 on making mutations respond only to POSTs. But is this method a mutation? It doesn't look like it, based on that it's a const method and it looks like it's not mutating config_tracker_. Which docs are you talking about? I couldn't find a reference to POST in the linked data-frame API PR.
Note in admin.h:
struct UrlHandler {
const std::string prefix_;
const std::string help_text_;
const HandlerCb handler_;
const bool removable_;
const bool mutates_server_state_;
}
This last 'false' in the initializer for /config_dump should not mutate server state, and therefore should be fine as a GET. That bool can be used to check for POST. Currently it is used to determine whether to render the HTML table entry as a simple a-tag link or as a button with a form that POSTs.
source/server/http/admin.cc
Outdated
There was a problem hiding this comment.
nit combine these two lines:
ProtobufTypes::MessagePtr message = key_callback_pair.second();
|
PTAL |
|
@jsedgwick build is still broken. |
|
@mattklein123 ah, variable that is only referenced in an ASSERT, didn't built Should be ok now. PTAL |
|
BTW I think our ASSERT should be something like |
|
@jsedgwick yeah, that ASSERT implementation would be a nice improvement, maybe either file a PR or issue for it? |
There was a problem hiding this comment.
Why is this only a weak_ptr? The map itself doesn't need to be weak does it? Style guide prefers simple ownership semantics where possible https://google.github.io/styleguide/cppguide.html#Ownership_and_Smart_Pointers.
Here's my strawman suggestion. Can we just assume ConfigTracker is going to outlive all EntryOwners? It should be effectively a singleton across the server, I think.
There was a problem hiding this comment.
Your second question answers your first - I wanted either to be able to outlive the other but for the map belong to ConfigTracker.
You're totally right with your strawman but as I intimated a few times in this PR I was planning on reusing ConfigTracker in situations where we couldn't guarantee it will effectively be a singleton.
Side note: +1 on simple ownership semantics, but that's exactly why I like to hide more complicated ownership situations behind simpler interfaces when they come up, as in this PR.
There was a problem hiding this comment.
In general my preference is to keep it simple unless the more complicated code is actually needed right now, but in the interest of moving this along I don't feel that strongly about it. Will defer to @htuch on it.
Either way, this PR needs a master merge now.
There was a problem hiding this comment.
@jsedgwick Ack. Looks like https://github.com/envoyproxy/envoy/pull/2771/files#diff-a112a0982616fac9024fc1c8e42f0661R175 is one such place, but the map is strongly owned there. If that's the only one, I'd say just change to shared_ptr for now and leave a TODO, that way we can see how it plays out.
|
@htuch I'll file a PR for |
|
This is an implementation detail. It's not an exposed API. Let's merge.
On Sun, Mar 25, 2018 at 12:21 PM Matt Klein ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In source/server/http/config_tracker_impl.h
<#2771 (comment)>:
> +namespace Server {
+
+/**
+ * Implementation of ConfigTracker.
+ */
+class ConfigTrackerImpl : public ConfigTracker {
+public:
+ EntryOwnerPtr add(const std::string& key, Cb cb) override;
+ const CbsMap& getCallbacksMap() const override;
+
+private:
+ std::shared_ptr<CbsMap> map_{std::make_shared<CbsMap>()};
+
+ class EntryOwnerImpl : public ConfigTracker::EntryOwner {
+ public:
+ EntryOwnerImpl(std::weak_ptr<CbsMap> map_weak, std::string key);
In general my preference is to keep it simple unless the more complicated
code is actually needed right now, but in the interest of moving this along
I don't feel that strongly about it. Will defer to @htuch
<https://github.com/htuch> on it.
Either way, this PR needs a master merge now.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2771 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAc4Iv3Km86rGT7KTZxZneM6zvmd_gT0ks5th8P-gaJpZM4Sk2kM>
.
--
*James Sedgwick*
Software Engineer, Networking Infrastructure
973.738.0737 <+19737380737>
[image: Lyft] <http://www.lyft.com/>
|
|
@htuch changed, PTAL |
|
@jsedgwick you need to merge master. Thanks! |
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
|
rebased on master |
…nt (#532) See envoyproxy/envoy#2771 for context I will add documentation once the interfaces and placement of these things is settled. Current location (new admin package) was agreed upon as a good start with @htuch but i don't feel strongly about that or naming. Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Description:
This is a step towards dumping configs in proto form. There is a new
config_dumpendpoint, which spits out all registered configs. It's pretty-printed JSON for now; obviously we can play with format/serialization.Anyone can add a config via the ConfigTracker object that hangs off Admin. ConfigTracker makes the whole thing RAII-y, so that the list of config providers is maintained opaquely. This means the end user doesn't have to worry about ownership, capturing
this, etc. Contrast with current unenforced pair of addHandler/removeHandler. This sort of managed weak ownership can be useful all over envoy; just in this diff I left a comment where something like ConfigTracker would delete a bunch of fragile code and confusion. I will genericize it in a future diff though, so let's focus on the functionality here and try to punt on extended discussion of this component and its possibilities.To demonstrate all the structured admin/config dumping scaffolding, I implemented it all for routes. See the data-plane-api PR (envoyproxy/data-plane-api#532) for context. But you can see how other config_dump handlers and, hopefully, all admin endpoints can be turned into structured, arbitrarily consumable proto via this setup.
Risk Level: Medium. New feature on admin subsystem, intended to be minimally intrusive into rest of server
Testing:
New + existing unit. Tested locally with
sudo ./bazel-bin/source/exe/envoy-static -c examples/front-proxy/front-envoy.yamlOutput looks like:
https://gist.github.com/jsedgwick/3a46408a9728ccef9a63d32b4c463c2b
Docs Changes:
I put
TODO doxmeeverywhere I plan to add docs once the code/API is agreed upon. I won't merge this PR until docs are in.Release Notes:
TODO pending consensus on this PR
Issues:
Makes progress on #2421, #2172
API Changes:]
envoyproxy/data-plane-api#532
Deprecated:
Current routes endpoint should be considered marked for deletion, but I'll do that in a subsequent PR. There should be some nice cleanup in addition to code deletion with this change.
Signed-off-by: James Sedgwick jsedgwick@lyft.com