Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions source/extensions/filters/http/common/aws/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,7 @@ envoy_cc_library(
name = "utility_lib",
srcs = ["utility.cc"],
hdrs = ["utility.h"],
external_deps = ["curl"],
deps = [
"//source/common/common:utility_lib",
"//source/common/http:headers_lib",
Expand Down
51 changes: 51 additions & 0 deletions source/extensions/filters/http/common/aws/utility.cc
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
#include "common/common/utility.h"

#include "absl/strings/str_join.h"
#include "curl/curl.h"

namespace Envoy {
namespace Extensions {
Expand Down Expand Up @@ -87,6 +88,56 @@ Utility::joinCanonicalHeaderNames(const std::map<std::string, std::string>& cano
});
}

static size_t curlCallback(char* ptr, size_t, size_t nmemb, void* data) {
auto buf = static_cast<std::string*>(data);
buf->append(ptr, nmemb);
return nmemb;
}

absl::optional<std::string> Utility::metadataFetcher(const std::string& host,
const std::string& path,
const std::string& auth_token) {
static const size_t MAX_RETRIES = 4;
static const std::chrono::milliseconds RETRY_DELAY{1000};
static const std::chrono::seconds TIMEOUT{5};

CURL* const curl = curl_easy_init();
if (!curl) {
return absl::nullopt;
};

const std::string url = fmt::format("http://{}/{}", host, path);
curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
curl_easy_setopt(curl, CURLOPT_TIMEOUT, TIMEOUT.count());
curl_easy_setopt(curl, CURLOPT_FAILONERROR, 1L);

std::string buffer;
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &buffer);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, curlCallback);

struct curl_slist* headers = nullptr;
if (!auth_token.empty()) {
const std::string auth = fmt::format("Authorization: {}", auth_token);
headers = curl_slist_append(headers, auth.c_str());
curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
}

for (size_t retry = 0; retry < MAX_RETRIES; retry++) {
const CURLcode res = curl_easy_perform(curl);
if (res == CURLE_OK) {
break;
}
ENVOY_LOG_MISC(debug, "Could not fetch AWS metadata: {}", curl_easy_strerror(res));
buffer.clear();
std::this_thread::sleep_for(RETRY_DELAY);
}

curl_easy_cleanup(curl);
curl_slist_free_all(headers);

return buffer.empty() ? absl::nullopt : absl::optional<std::string>(buffer);
}

} // namespace Aws
} // namespace Common
} // namespace HttpFilters
Expand Down
18 changes: 17 additions & 1 deletion source/extensions/filters/http/common/aws/utility.h
Original file line number Diff line number Diff line change
Expand Up @@ -39,10 +39,26 @@ class Utility {
*/
static std::string
joinCanonicalHeaderNames(const std::map<std::string, std::string>& canonical_headers);

/**
* Fetch AWS instance or task metadata.
*
* @param host host or ip address of the metadata endpoint.
* @param path path of the metadata document.
* @auth_token authentication token to pass in the request, empty string indicates no auth.
* @return Metadata document or nullopt in case if unable to fetch it.
*
* @note In case of an error, function will log ENVOY_LOG_MISC(debug) message.
*
* @note This is not main loop safe method as it is blocking. It is intended to be used from the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use curl here at all and have it be blocking? Why can't you use the built in Envoy HTTP async client facilities?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We initially implemented this using the Envoy Http client in #5546. Although I do not know what was the original motivation, there was a suggestion on that PR to use libcurl.

@htuch and @lavignes can comment more on this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Don't have much more to add. When I originally implemented this, I ran into the issue that I ultimately needed to block signing the xDS request on the http request to get credentials. Note that the blocking is done on the gRPC client thread in my implementation. Not the main thread. It was after asking about methods for making blocking http requests it was suggested I make the blocking http client in #5546. But I don't recall in what issue it was suggested.

Then later in #5546 just using curl was proposed since it was being merged in as a dependency anyway. I felt more comfortable doing that since I wasn't very confident in my http client implementation. It was non-trivial for me to implement and test especially since I was very new to the codebase.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK that's fine. Ultimately I don't think curl should be used here for a variety of reasons (blocking, more third party code to deal with, not the normal Envoy logs/metrics/traces, etc.) but we can merge and iterate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I pointed out that libcurl was now available, but I can't remember if this was something we decided as the preferred approach. The main concern would be that we don't block on a worker thread either in the Envoy-based HTTP client or libcurl.

* gRPC auth plugins that are able to schedule blocking plugins on a different thread.
*/
static absl::optional<std::string>
metadataFetcher(const std::string& host, const std::string& path, const std::string& auth_token);
};

} // namespace Aws
} // namespace Common
} // namespace HttpFilters
} // namespace Extensions
} // namespace Envoy
} // namespace Envoy
20 changes: 20 additions & 0 deletions test/integration/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -795,3 +795,23 @@ envoy_cc_test(
"//test/test_common:utility_lib",
],
)

envoy_cc_test(
name = "aws_metadata_fetcher_integration_test",
srcs = [
"aws_metadata_fetcher_integration_test.cc",
],
tags = ["exclusive"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this test tagged as exclusive? This acts as a serialization step and forces the total test run to increase by > 30s, as it can't be parallelized.

Copy link
Contributor

@lavignes lavignes Jul 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@htuch I'm not sure why this is set this way but @ivitjuk is no longer working on this so I'm taking this back over from here.

I think this is safe to remove. I can open a PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integration test is creating an instance of the Envoy which is acting as a metadata endpoint. If we remove exclusive, wouldn't we risk port collision with other integration tests that create listeners?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless you are binding to fixed ports (which I didn't spot, we typically don't do this for the reason you state), this is safe to not be exclusive. The tests tend to bind to port zero, which allows safe concurrent execution.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @lavignes can you make a PR to remove exclusive?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lavignes thanks, appreciated!

deps = [
":integration_lib",
"//source/common/common:fmt_lib",
"//source/extensions/filters/http/common/aws:utility_lib",
"//source/extensions/filters/http/fault:config",
"//source/extensions/filters/http/fault:fault_filter_lib",
"//source/extensions/filters/http/router:config",
"//source/extensions/filters/network/echo:config",
"//source/extensions/filters/network/http_connection_manager:config",
"//test/server:utility_lib",
"//test/test_common:utility_lib",
],
)
152 changes: 152 additions & 0 deletions test/integration/aws_metadata_fetcher_integration_test.cc
Original file line number Diff line number Diff line change
@@ -0,0 +1,152 @@
#include "common/common/fmt.h"

#include "extensions/filters/http/common/aws/utility.h"

#include "test/integration/integration.h"
#include "test/integration/utility.h"
#include "test/server/utility.h"
#include "test/test_common/utility.h"

namespace Envoy {

using Envoy::Extensions::HttpFilters::Common::Aws::Utility;

class AwsMetadataIntegrationTestBase : public ::testing::Test, public BaseIntegrationTest {
public:
AwsMetadataIntegrationTestBase(int status_code, int delay_s)
: BaseIntegrationTest(Network::Address::IpVersion::v4, renderConfig(status_code, delay_s)) {}

static std::string renderConfig(int status_code, int delay_s) {
return fmt::format(ConfigHelper::BASE_CONFIG + R"EOF(
filter_chains:
filters:
name: envoy.http_connection_manager
config:
stat_prefix: metadata_test
http_filters:
- name: envoy.fault
config:
delay:
fixed_delay:
seconds: {}
nanos: {}
percentage:
numerator: 100
denominator: HUNDRED
- name: envoy.router
codec_type: HTTP1
route_config:
virtual_hosts:
name: metadata_endpoint
routes:
- name: auth_route
direct_response:
status: {}
body:
inline_string: METADATA_VALUE_WITH_AUTH
match:
prefix: "/"
headers:
- name: Authorization
exact_match: AUTH_TOKEN
- name: no_auth_route
direct_response:
status: {}
body:
inline_string: METADATA_VALUE
match:
prefix: "/"
domains: "*"
name: route_config_0
)EOF",
delay_s, delay_s > 0 ? 0 : 1000, status_code, status_code);
}

void SetUp() override { BaseIntegrationTest::initialize(); }

void TearDown() override {
test_server_.reset();
fake_upstreams_.clear();
}
};

class AwsMetadataIntegrationTestSuccess : public AwsMetadataIntegrationTestBase {
public:
AwsMetadataIntegrationTestSuccess() : AwsMetadataIntegrationTestBase(200, 0) {}
};

TEST_F(AwsMetadataIntegrationTestSuccess, Success) {
const auto endpoint = fmt::format("{}:{}", Network::Test::getLoopbackAddressUrlString(version_),
lookupPort("listener_0"));
const auto response = Utility::metadataFetcher(endpoint, "", "");

ASSERT_TRUE(response.has_value());
EXPECT_EQ("METADATA_VALUE", *response);

ASSERT_NE(nullptr, test_server_->counter("http.metadata_test.downstream_rq_completed"));
EXPECT_EQ(1, test_server_->counter("http.metadata_test.downstream_rq_completed")->value());
}

TEST_F(AwsMetadataIntegrationTestSuccess, AuthToken) {
const auto endpoint = fmt::format("{}:{}", Network::Test::getLoopbackAddressUrlString(version_),
lookupPort("listener_0"));
const auto response = Utility::metadataFetcher(endpoint, "", "AUTH_TOKEN");

ASSERT_TRUE(response.has_value());
EXPECT_EQ("METADATA_VALUE_WITH_AUTH", *response);

ASSERT_NE(nullptr, test_server_->counter("http.metadata_test.downstream_rq_completed"));
EXPECT_EQ(1, test_server_->counter("http.metadata_test.downstream_rq_completed")->value());
}

class AwsMetadataIntegrationTestFailure : public AwsMetadataIntegrationTestBase {
public:
AwsMetadataIntegrationTestFailure() : AwsMetadataIntegrationTestBase(503, 0) {}
};

TEST_F(AwsMetadataIntegrationTestFailure, Failure) {
const auto endpoint = fmt::format("{}:{}", Network::Test::getLoopbackAddressUrlString(version_),
lookupPort("listener_0"));

const auto start_time = timeSystem().monotonicTime();
const auto response = Utility::metadataFetcher(endpoint, "", "");
const auto end_time = timeSystem().monotonicTime();

EXPECT_FALSE(response.has_value());

// Verify correct number of retries
ASSERT_NE(nullptr, test_server_->counter("http.metadata_test.downstream_rq_completed"));
EXPECT_EQ(4, test_server_->counter("http.metadata_test.downstream_rq_completed")->value());

// Verify correct sleep time between retries: 4 * 1000 = 4000
EXPECT_LE(4000,
std::chrono::duration_cast<std::chrono::milliseconds>(end_time - start_time).count());
}

class AwsMetadataIntegrationTestTimeout : public AwsMetadataIntegrationTestBase {
public:
AwsMetadataIntegrationTestTimeout() : AwsMetadataIntegrationTestBase(200, 10) {}
};

TEST_F(AwsMetadataIntegrationTestTimeout, Timeout) {
const auto endpoint = fmt::format("{}:{}", Network::Test::getLoopbackAddressUrlString(version_),
lookupPort("listener_0"));

const auto start_time = timeSystem().monotonicTime();
const auto response = Utility::metadataFetcher(endpoint, "", "");
const auto end_time = timeSystem().monotonicTime();

EXPECT_FALSE(response.has_value());

// We do now check http.metadata_test.downstream_rq_completed value here because it's
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like downstream_rq_completed is incremented in Mac even in the case connection is terminated on the Curl side due to the timeout. This doesn't happen on Linux.

Unfortunately I don't have access to a Mac machine to be able to debug this in detail.

@dio any suggestions how to proceed with this? Do we have a process when developers do not have access to a specific platform?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. Unfortunately, we only have this CI. I'll try to take a look at this on my local. However, seems like the test is successful now?

Copy link
Member Author

@ivitjuk ivitjuk Jun 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because I removed the following check that was succeeding on Linux but failing on Mac:

ASSERT_NE(nullptr, test_server_->counter("http.metadata_test.downstream_rq_completed")); EXPECT_EQ(0, test_server_->counter("http.metadata_test.downstream_rq_completed")->value());

For, to me unknown, reason downstream_rq_completed was incremented on Mac for each request Curl aborted due to the timeout. On Linux this counter was not incremented.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's have a comment regarding this for now.

// behavior is different between Linux and Mac when Curl disconnects on timeout. On Mac it is
// incremented, while on Linux it is not.

// Verify correct sleep time between retries: 4 * 5000 = 20000
EXPECT_LE(20000,
std::chrono::duration_cast<std::chrono::milliseconds>(end_time - start_time).count());
EXPECT_GT(40000,
std::chrono::duration_cast<std::chrono::milliseconds>(end_time - start_time).count());
}

} // namespace Envoy
1 change: 1 addition & 0 deletions tools/check_format.py
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@
# definitions for real-world time, the construction of them in main(), and perf annotation.
# For now it includes the validation server but that really should be injected too.
REAL_TIME_WHITELIST = ("./source/common/common/utility.h",
"./source/extensions/filters/http/common/aws/utility.cc",
"./source/common/event/real_time_system.cc",
"./source/common/event/real_time_system.h", "./source/exe/main_common.cc",
"./source/exe/main_common.h", "./source/server/config_validation/server.cc",
Expand Down