Skip to content

Support for external authorization tcp filter.#2416

Merged
htuch merged 26 commits intoenvoyproxy:masterfrom
colabsaumoh:ext-auth-review-tcp
Feb 20, 2018
Merged

Support for external authorization tcp filter.#2416
htuch merged 26 commits intoenvoyproxy:masterfrom
colabsaumoh:ext-auth-review-tcp

Conversation

@saumoh
Copy link
Copy Markdown
Contributor

@saumoh saumoh commented Jan 19, 2018

Title
Authorization TCP and HTTP filters using an external gRPC service.
Patch 2 of 3: TCP filter.

Description
As per the feedback on #2359 I am breaking it up into three PR's.

The patch in this PR adds support for the TCP filter.
This PR implements the discussing we had in #2291.

Testing
I have tested this on my local setup with an external gRPC authorization server. I have also added UT's for the TCP filter.

Risk
Low: Because only the users of this filter will be impacted. It should not impact general stability of Envoy it self.

Caveats

Signed-off-by: Saurabh Mohan saurabh+github@tigera.io

Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Saurabh Mohan added 17 commits January 23, 2018 14:44
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
@htuch
Copy link
Copy Markdown
Member

htuch commented Feb 7, 2018

@junr03 could you take a first pass on this?

@saumoh
Copy link
Copy Markdown
Contributor Author

saumoh commented Feb 9, 2018

@junr03, @htuch Not in any particular rush from my end but just want to make sure it is in your radar. :-)
Thanks.

@junr03
Copy link
Copy Markdown
Member

junr03 commented Feb 9, 2018

@saumoh yes still on my radar. Sorry for the delay. I have been sick this week, so I had to prioritize my screen time. I will take a first pass this afternoon.

@saumoh
Copy link
Copy Markdown
Contributor Author

saumoh commented Feb 9, 2018

@junr03 hope you feel better soon...this can wait till then!

Copy link
Copy Markdown
Member

@junr03 junr03 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for splitting this work into 3 easily digestible steps. Some comments to get started.

/**
* Config registration for the external authorization filter. @see NamedNetworkFilterConfigFactory.
*/
class ExtAuthzConfigFactory : public NamedNetworkFilterConfigFactory {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@envoyproxy/senior-maintainers I noticed we are inconsistent about the naming of these filter config classes, i.e some have "filter" in them some dont.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine, but if we want to have some standard to make things consistent we can put it in STYLE.md.

FactoryContext& context) {

ASSERT(!proto_config.stat_prefix().empty());
ASSERT(proto_config.grpc_service().has_envoy_grpc());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would update the documentation with an attention blob that says that the filter only works with envoy grpc https://github.com/envoyproxy/data-plane-api/blob/master/envoy/config/filter/network/ext_authz/v2/ext_authz.proto#L18

bool failOpen() const { return failure_mode_allow_; }

private:
static InstanceStats generateStats(const std::string& name, Stats::Scope& scope);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: newline


Network::FilterStatus Instance::onData(Buffer::Instance&) {
if (status_ == Status::NotStarted) {
// If the ssl handshake was not done and data is the next event!
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please expand this comment. If I understand correctly, what you are trying to express is that there are two ways to initiate the external authz check, either when the filter gets an onEvent callback with a ConnectionEvent::Connected after a succesful ssl handshake or after an onData callback if the connection is not over ssl. I would make this more clear in this comment

// request already completed.
if (event == Network::ConnectionEvent::RemoteClose ||
event == Network::ConnectionEvent::LocalClose) {
if (status_ == Status::Calling) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move the comment above here

Invoke([&](RequestCallbacks& callbacks) -> void { request_callbacks_ = &callbacks; })));

EXPECT_EQ(Network::FilterStatus::Continue, filter_->onNewConnection());
// Called by SSL when the handshake is done.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

EXPECT_CALL(*client_, cancel());
filter_callbacks_.connection_.raiseEvent(Network::ConnectionEvent::RemoteClose);

EXPECT_EQ(1U, stats_store_.counter("ext_authz.name.total").value());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would set expectations on the other stats to be 0

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added those.

namespace ExtAuthz {
namespace TcpFilter {

class ExtAuthzFilterTest : public testing::Test {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment for all the tests: can you set expectations on the active gauge

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The destructor checks for the gauge. But if that isn't read-able then i can add a check for the gauge.

"grpc_service": {
"envoy_grpc": { "cluster_name": "ext_authz_server" }
},
"failure_mode_allow": true,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no test with default (false)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this. Added a test.

EXPECT_EQ(0U, stats_store_.counter("ext_authz.name.cx_closed").value());
}

TEST_F(ExtAuthzFilterTest, Error) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a very descriptive test name. And I am actually having a tough time groking what it is that you want to test here. Do you mind clarifying

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also like the idea of adding a leading comment line for tests that aren't obvious/trivial. E.g.:

// Validate error handling when foo does bar with baz.
TEST_F(ExtAuthzFilterTest, Error) {

Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
@saumoh
Copy link
Copy Markdown
Contributor Author

saumoh commented Feb 12, 2018

@junr03 Thanks for the detailed review comments. I think I have addressed all of them.
I created the PR in data-plane-api also: envoyproxy/data-plane-api#491

htuch pushed a commit to envoyproxy/data-plane-api that referenced this pull request Feb 13, 2018
Fix up the comments for failure_mode_allow that was incorrectly worded and caught during review envoyproxy/envoy#2416

Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, a few comments.

namespace TcpFilter {

InstanceStats Config::generateStats(const std::string& name, Stats::Scope& scope) {
std::string final_prefix = fmt::format("ext_authz.{}.", name);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: const.


void Instance::callCheck() {
envoy::service::auth::v2::CheckRequest request;
CreateCheckRequest::createTcpCheck(filter_callbacks_, request);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have picked up on this at the time, but can you rename from CreateCheckRequest to CheckRequestUtils or something? Package and class name should be nouns, not verbs.

config_->stats().active_.inc();
config_->stats().total_.inc();

calling_check_ = true;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you integrate this state variable with status_ somehow? Having multiple state variables makes it harder to reason about.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@htuch thanks for the review. I've addressed other changes. For this one I really found it more difficult to combine calling_check_ with status_.
calling_check_ is specifically to check for on stack callback; i.e when onComplete is invoked in the same call stack. When I tried to combine it into status_ it ended up with status_ CallingOnStack, CallingOffStack; which made it more confusing to me.

To me these two states feel disjoint. Any suggestions on how to address it?

config_->stats().total_.inc();

calling_check_ = true;
client_->check(*this, request, Tracing::NullSpan::instance());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might want to retain ownership of request until it has been sent, the gRPC client doesn't make a copy and only guarantees the send happens after you receive a response.

Saurabh Mohan added 2 commits February 14, 2018 16:39
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing last round of feedback, a few more comments/questions.

} else {
// SSL connection is post TCP newConnection. Therefore the ext_authz check in onEvent.
// if the ssl handshake was successful then it will invoke the
// Network::ConnectionEvent::Connected.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we assert or if condition on this event type?

// Network::ConnectionEvent::Connected.
// Thus invoke the authorization service check if not done yet.
if (status_ == Status::NotStarted) {
callCheck();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we treat SSL and non-SSL differently? Why not, in both cases, wait until we get the first onData? Then we can pause the pipe and perform auth check. Seems like that would have simpler flow, but I may well be missing something here, so interested to hear your thoughts.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a very good point/observation. Deferring to onData() would definitely simplify the call logic.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, I didnt think about that

if (status_ == Status::NotStarted) {
// If there was no ssl handshake there will be no event Network::ConnectionEvent::Connected
// and onData() will be the first event into the filter.
// Therefore, if the authorization service hasn not been invoked then we must do so now.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/hasn/hasn't/


void Instance::onComplete(CheckStatus status) {
status_ = Status::Complete;
filter_callbacks_->connection().readDisable(false);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be deferred until we know we are going to be successful and want to continue reading?

};
}

NetworkFilterFactoryCb ExtAuthzConfigFactory::createFilterFactory(const Json::Object& json_config,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only need this for v1 config; are we supporting v1 ext_authz?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are doing only v2. This helps in specifying the configuration as v2 json and still use it without rigging up a xDS server.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure this is strictly needed for v2 JSON, this should only apply to v1 JSON placed in deprecated_v1. Does the test fail if you don't have this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

afaik this function is the entry point when I give envoy executable a configuration file envoy.conf and I can then specify the filter's configuration like:

      "filters": [
        {
         "type": "read",
         "name": "ext_authz",
         "config": { 
                     "stat_prefix": "authz",
                     "grpc_service": {
                        "envoy_grpc": { "cluster_name": "authz_server" }
                      },
                     "failure_mode_allow": true
                   }
        },

And that way I can run an end-2-end system.
i'll remove it since it seems to be not what Envoy natively expects to support.

} else {
// We can get completion inline, so only call continue if that isn't happening.
if (!calling_check_) {
filter_callbacks_->continueReading();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if readDisable(false) above is sufficient, and you don't need to do this; it's docs say it will restart the read pipeline, and redispatch pending data. Does this work for this filter? That could eliminate the need to maintain calling_check_.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving readDisable(false) here -which is what u were asking about at the top of this function-makes sense.
From what I understood-which may be totally wrong-is that readDisable(false) was to get the socket buffer pipeline moving and the filter_callbacks_->continueReading() is to get the filter chain iteration moving forward. The filter iteration may have been stopped iff the callCheck()and the onComplete() callback happened asynchronously.
But if they both are invoked in the same call stack then the filter chain was never stopped, hence it doesn't need to be kick started again by invoking filter_callbacks_->continueReading().

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that filter_callback->connection().readDisable(false|true) will no longer be necessary since now we will invoke the external authz service from onData() which clearly returns either a StopIteration or Continue and NOT from onEvent().

Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last couple of things and we can ship. This is looking great, thanks for your patience through the review process.

};
}

NetworkFilterFactoryCb ExtAuthzConfigFactory::createFilterFactory(const Json::Object& json_config,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure this is strictly needed for v2 JSON, this should only apply to v1 JSON placed in deprecated_v1. Does the test fail if you don't have this?

ASSERT(!proto_config.stat_prefix().empty());
ASSERT(!proto_config.grpc_service().envoy_grpc().cluster_name().empty());

ExtAuthz::TcpFilter::ConfigSharedPtr ext_authz_config(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my bad...thanks for catching that. Let me address this.

@saumoh
Copy link
Copy Markdown
Contributor Author

saumoh commented Feb 16, 2018

@htuch Quick clarification please. The NamedNetworkFilterFactory expects an implementation for createFilterFactory.
Should v2 only implementations throw an exception in this case?

Saurabh Mohan added 3 commits February 16, 2018 16:09
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
@saumoh
Copy link
Copy Markdown
Contributor Author

saumoh commented Feb 17, 2018

@htuch Thanks for all your feedback and patience in working through these PR's.
I've added the missing test cases. Based off your recommendation for what to do for v2 only createFilterFactory(). (throw exception/or something else). I'll modify the code and the corresponding test.

@htuch
Copy link
Copy Markdown
Member

htuch commented Feb 18, 2018

Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, appreciate all the changes, this is rad.

)EOF";

envoy::config::filter::network::ext_authz::v2::ExtAuthz proto_config{};
MessageUtil::loadFromJson(json, proto_config);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, we now generally prefer YAML examples and using MessageUtil::loadFromYaml, but we can fix this later or leave it as is, I'd like to unblock your other PR review.

@htuch htuch merged commit bae55f9 into envoyproxy:master Feb 20, 2018
@htuch
Copy link
Copy Markdown
Member

htuch commented Feb 20, 2018

@saumoh I'm seeing build failures on master following the merge here, e.g. https://circleci.com/gh/envoyproxy/envoy/27658?utm_campaign=build-failed&utm_content=failing-command&utm_medium=email#tests/containers/0/actions/103

Can you take a look? Thanks. CC @danielhochman who is on-call for maintainer rotation this week.

htuch added a commit to htuch/envoy that referenced this pull request Feb 20, 2018
Root cause: PR was not up-to-date when merging.

Signed-off-by: Harvey Tuch <htuch@google.com>
htuch added a commit that referenced this pull request Feb 20, 2018
Root cause: PR was not up-to-date when merging.

Signed-off-by: Harvey Tuch <htuch@google.com>
@saumoh
Copy link
Copy Markdown
Contributor Author

saumoh commented Feb 20, 2018

@htuch thanks for addressing the post merge issue. I just got in and was starting to look!

@htuch
Copy link
Copy Markdown
Member

htuch commented Feb 20, 2018

@saumoh NP. A good habit when working on PRs with rounds of reviews is to fetch/merge upstream each time you do a push. That way, the chance of this happening is reduced (but not eliminated). Try it out in your next PR review :)

@saumoh saumoh deleted the ext-auth-review-tcp branch May 4, 2018 17:28
jpsim pushed a commit that referenced this pull request Nov 28, 2022
Description: Integrate existing upstream Envoy support for proxies (#21942) with Android APIs that are responsible for getting the information about the configured proxy settings from user's device. There are a few things that we will need to address before we fully productionize this - they are listed in envoyproxy/envoy-mobile#2533.
Risk Level: Medium, it's supposed to be a no-op if not enabled explicitly.
Testing: Manual and integration
Docs Changes: Done
Release Notes: Done

Signed-off-by: Rafal Augustyniak <raugustyniak@lyft.com>
Signed-off-by: JP Simard <jp@jpsim.com>
jpsim pushed a commit that referenced this pull request Nov 29, 2022
Description: Integrate existing upstream Envoy support for proxies (#21942) with Android APIs that are responsible for getting the information about the configured proxy settings from user's device. There are a few things that we will need to address before we fully productionize this - they are listed in envoyproxy/envoy-mobile#2533.
Risk Level: Medium, it's supposed to be a no-op if not enabled explicitly.
Testing: Manual and integration
Docs Changes: Done
Release Notes: Done

Signed-off-by: Rafal Augustyniak <raugustyniak@lyft.com>
Signed-off-by: JP Simard <jp@jpsim.com>
Elite1015 pushed a commit to Elite1015/data-plane-api that referenced this pull request Feb 23, 2025
Fix up the comments for failure_mode_allow that was incorrectly worded and caught during review envoyproxy/envoy#2416

Signed-off-by: Saurabh Mohan <saurabh+github@tigera.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants