Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion source/common/network/dns_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -173,7 +173,8 @@ void DnsResolverImpl::PendingResolution::onAresGetAddrInfoCallback(int status, i
// We can't add a main thread assertion here because both this code is reused by dns filter
// and executed in both main thread and worker thread. Maybe split the code for filter and
// main thread.
TRY_NEEDS_AUDIT { callback_(resolution_status, std::move(address_list)); }
TRY_ASSERT_MAIN_THREAD { callback_(resolution_status, std::move(address_list)); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having 2 copies of the code that handles DNS resolutions just because of this try/catch issue doesn't seem good.

The use of dispatcher_ post to rethrow the exception in a different thread seems even stranger. Who would catch the posted exception?

Can we require that the callback passed in to dns_impl never throw?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 I think it's a non-starter to copy this code. Either we need to share all of it, or we should just change the semantics of exception handling within the callback per @antoniovicente (though this may be difficult).

rethrow the exception in a different thread seems even stranger.

It's posting to the same thread. It's an artifact of not throwing through the C code. I agree, it's not good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is only a draft, I am not satisfied with the code duplication myself. l agree it is better to handle it without copying the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know which callback throws?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why we post to the same thread is because of #4307, c_ares as a c library has exception unsafe code, so exception throwing need to be executed in another call stack in the same thread.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The callbeck can be in logical_dns_cluster or strict_dns_cluster
Network::Utility::getAddressWithPort may throw. I haven't found any other exception throwing code yet.

END_TRY
catch (const EnvoyException& e) {
ENVOY_LOG(critical, "EnvoyException in c-ares callback: {}", e.what());
dispatcher_.post([s = std::string(e.what())] { throw EnvoyException(s); });
Expand Down
17 changes: 17 additions & 0 deletions source/extensions/filters/udp/dns_filter/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ envoy_cc_library(
],
external_deps = ["ares"],
deps = [
":dns_lib",
"//include/envoy/buffer:buffer_interface",
"//include/envoy/event:dispatcher_interface",
"//include/envoy/network:address_interface",
Expand Down Expand Up @@ -62,3 +63,19 @@ envoy_cc_extension(
"@envoy_api//envoy/extensions/filters/udp/dns_filter/v3alpha:pkg_cc_proto",
],
)

envoy_cc_library(
name = "dns_lib",
srcs = ["dns_impl.cc"],
hdrs = ["dns_impl.h"],
external_deps = ["ares"],
deps = [
"//include/envoy/event:dispatcher_interface",
"//include/envoy/event:file_event_interface",
"//include/envoy/network:dns_interface",
"//source/common/common:assert_lib",
"//source/common/common:linked_object",
"//source/common/network:address_lib",
"//source/common/network:utility_lib",
],
)
4 changes: 4 additions & 0 deletions source/extensions/filters/udp/dns_filter/dns_filter.h
Original file line number Diff line number Diff line change
Expand Up @@ -145,6 +145,10 @@ class DnsFilter : public Network::UdpListenerReadFilter, Logger::Loggable<Logger
*/
bool isKnownDomain(const absl::string_view domain_name);

void setResolver(const Network::DnsResolverSharedPtr& resolver) {
resolver_->setResolver(resolver);
}

private:
/**
* Prepare the response buffer and send it to the client
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
#include "envoy/event/dispatcher.h"
#include "envoy/network/dns.h"

#include "extensions/filters/udp/dns_filter/dns_impl.h"
#include "extensions/filters/udp/dns_filter/dns_parser.h"

namespace Envoy {
Expand All @@ -22,7 +23,8 @@ class DnsFilterResolver : Logger::Loggable<Logger::Id::filter> {
std::chrono::milliseconds timeout, Event::Dispatcher& dispatcher,
uint64_t max_pending_lookups)
: timeout_(timeout), dispatcher_(dispatcher),
resolver_(dispatcher.createDnsResolver(resolvers, false /* use_tcp_for_dns_lookups */)),
resolver_(DnsResolverImpl::createDnsResolver(dispatcher, resolvers,
false /* use_tcp_for_dns_lookups */)),
callback_(callback), max_pending_lookups_(max_pending_lookups) {}
/**
* @brief entry point to resolve the name in a DnsQueryRecord
Expand All @@ -34,6 +36,7 @@ class DnsFilterResolver : Logger::Loggable<Logger::Id::filter> {
* @param domain_query the query record object containing the name for which we are resolving
*/
void resolveExternalQuery(DnsQueryContextPtr context, const DnsQueryRecord* domain_query);
void setResolver(const Network::DnsResolverSharedPtr& resolver) { resolver_ = resolver; }

private:
struct LookupContext {
Expand Down Expand Up @@ -63,7 +66,7 @@ class DnsFilterResolver : Logger::Loggable<Logger::Id::filter> {

std::chrono::milliseconds timeout_;
Event::Dispatcher& dispatcher_;
const Network::DnsResolverSharedPtr resolver_;
Network::DnsResolverSharedPtr resolver_;
DnsFilterResolverCallback& callback_;
absl::flat_hash_map<const DnsQueryRecord*, LookupContext> lookups_;
uint64_t max_pending_lookups_;
Expand Down
311 changes: 311 additions & 0 deletions source/extensions/filters/udp/dns_filter/dns_impl.cc
Original file line number Diff line number Diff line change
@@ -0,0 +1,311 @@
#include "extensions/filters/udp/dns_filter/dns_impl.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is largely the same as implementation in source/common/network/dns_impl.cc (looks like the main differences are an addition of DnsResolverImpl::createDnsResolver() and removal of a main thread assert in onAresGetAddrInfoCallback). Couldn't you most of the implementation in a base class and use subclasses in common/network package and here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that makes sense, onAresGetAddrInfoCallback is not a virtual method to override though. Let me see if I can change it to use error propagation without exception.


#include <chrono>
#include <cstdint>
#include <list>
#include <memory>
#include <string>

#include "envoy/common/platform.h"

#include "common/common/assert.h"
#include "common/common/fmt.h"
#include "common/common/thread.h"
#include "common/network/address_impl.h"
#include "common/network/utility.h"

#include "absl/strings/str_join.h"
#include "ares.h"

namespace Envoy {
namespace Extensions {
namespace UdpFilters {
namespace DnsFilter {

DnsResolverImpl::DnsResolverImpl(
Event::Dispatcher& dispatcher,
const std::vector<Network::Address::InstanceConstSharedPtr>& resolvers,
const bool use_tcp_for_dns_lookups)
: dispatcher_(dispatcher),
timer_(dispatcher.createTimer([this] { onEventCallback(ARES_SOCKET_BAD, 0); })),
use_tcp_for_dns_lookups_(use_tcp_for_dns_lookups),
resolvers_csv_(maybeBuildResolversCsv(resolvers)) {
AresOptions options = defaultAresOptions();
initializeChannel(&options.options_, options.optmask_);
}

DnsResolverImpl::~DnsResolverImpl() {
timer_->disableTimer();
ares_destroy(channel_);
}

absl::optional<std::string> DnsResolverImpl::maybeBuildResolversCsv(
const std::vector<Network::Address::InstanceConstSharedPtr>& resolvers) {
if (resolvers.empty()) {
return absl::nullopt;
}

std::vector<std::string> resolver_addrs;
resolver_addrs.reserve(resolvers.size());
for (const auto& resolver : resolvers) {
// This should be an IP address (i.e. not a pipe).
if (resolver->ip() == nullptr) {
throw EnvoyException(
fmt::format("DNS resolver '{}' is not an IP address", resolver->asString()));
}
// Note that the ip()->port() may be zero if the port is not fully specified by the
// Address::Instance.
// resolver->asString() is avoided as that format may be modified by custom
// Address::Instance implementations in ways that make the <port> not a simple
// integer. See https://github.com/envoyproxy/envoy/pull/3366.
resolver_addrs.push_back(fmt::format(resolver->ip()->ipv6() ? "[{}]:{}" : "{}:{}",
resolver->ip()->addressAsString(),
resolver->ip()->port()));
}
return {absl::StrJoin(resolver_addrs, ",")};
}

DnsResolverImpl::AresOptions DnsResolverImpl::defaultAresOptions() {
AresOptions options{};

if (use_tcp_for_dns_lookups_) {
options.optmask_ |= ARES_OPT_FLAGS;
options.options_.flags |= ARES_FLAG_USEVC;
}

return options;
}

void DnsResolverImpl::initializeChannel(ares_options* options, int optmask) {
dirty_channel_ = false;

options->sock_state_cb = [](void* arg, os_fd_t fd, int read, int write) {
static_cast<DnsResolverImpl*>(arg)->onAresSocketStateChange(fd, read, write);
};
options->sock_state_cb_data = this;
ares_init_options(&channel_, options, optmask | ARES_OPT_SOCK_STATE_CB);

// Ensure that the channel points to custom resolvers, if they exist.
if (resolvers_csv_.has_value()) {
int result = ares_set_servers_ports_csv(channel_, resolvers_csv_->c_str());
RELEASE_ASSERT(result == ARES_SUCCESS, "");
}
}

void DnsResolverImpl::PendingResolution::onAresGetAddrInfoCallback(int status, int timeouts,
ares_addrinfo* addrinfo) {
// We receive ARES_EDESTRUCTION when destructing with pending queries.
if (status == ARES_EDESTRUCTION) {
ASSERT(owned_);
// This destruction might have been triggered by a peer PendingResolution that received a
// ARES_ECONNREFUSED. If the PendingResolution has not been cancelled that means that the
// callback_ target _should_ still be around. In that case, raise the callback_ so the target
// can be done with this query and initiate a new one.
if (!cancelled_) {
callback_(ResolutionStatus::Failure, {});
}
delete this;
return;
}
if (!fallback_if_failed_) {
completed_ = true;

// If c-ares returns ARES_ECONNREFUSED and there is no fallback we assume that the channel_ is
// broken. Mark the channel dirty so that it is destroyed and reinitialized on a subsequent call
// to DnsResolver::resolve(). The optimal solution would be for c-ares to reinitialize the
// channel, and not have Envoy track side effects.
// context: https://github.com/envoyproxy/envoy/issues/4543 and
// https://github.com/c-ares/c-ares/issues/301.
//
// The channel cannot be destroyed and reinitialized here because that leads to a c-ares
// segfault.
if (status == ARES_ECONNREFUSED) {
parent_.dirty_channel_ = true;
}
}

std::list<Network::DnsResponse> address_list;
ResolutionStatus resolution_status;
if (status == ARES_SUCCESS) {
resolution_status = ResolutionStatus::Success;
if (addrinfo != nullptr && addrinfo->nodes != nullptr) {
if (addrinfo->nodes->ai_family == AF_INET) {
for (const ares_addrinfo_node* ai = addrinfo->nodes; ai != nullptr; ai = ai->ai_next) {
sockaddr_in address;
memset(&address, 0, sizeof(address));
address.sin_family = AF_INET;
address.sin_port = 0;
address.sin_addr = reinterpret_cast<sockaddr_in*>(ai->ai_addr)->sin_addr;

address_list.emplace_back(
Network::DnsResponse(std::make_shared<const Network::Address::Ipv4Instance>(&address),
std::chrono::seconds(ai->ai_ttl)));
}
} else if (addrinfo->nodes->ai_family == AF_INET6) {
for (const ares_addrinfo_node* ai = addrinfo->nodes; ai != nullptr; ai = ai->ai_next) {
sockaddr_in6 address;
memset(&address, 0, sizeof(address));
address.sin6_family = AF_INET6;
address.sin6_port = 0;
address.sin6_addr = reinterpret_cast<sockaddr_in6*>(ai->ai_addr)->sin6_addr;
address_list.emplace_back(
Network::DnsResponse(std::make_shared<const Network::Address::Ipv6Instance>(address),
std::chrono::seconds(ai->ai_ttl)));
}
}
}

if (!address_list.empty()) {
completed_ = true;
}

ASSERT(addrinfo != nullptr);
ares_freeaddrinfo(addrinfo);
} else {
resolution_status = ResolutionStatus::Failure;
}

if (timeouts > 0) {
ENVOY_LOG(debug, "DNS request timed out {} times", timeouts);
}

if (completed_) {
if (!cancelled_) {
// TODO(chaoqin-li1123): remove this exception catching by refactoring.
// We can't add a main thread assertion here because both this code is reused by dns filter
// and executed in both main thread and worker thread. Maybe split the code for filter and
// main thread.
try {
callback_(resolution_status, std::move(address_list));
} catch (const EnvoyException& e) {
ENVOY_LOG(critical, "EnvoyException in c-ares callback: {}", e.what());
dispatcher_.post([s = std::string(e.what())] { throw EnvoyException(s); });
} catch (const std::exception& e) {
ENVOY_LOG(critical, "std::exception in c-ares callback: {}", e.what());
dispatcher_.post([s = std::string(e.what())] { throw EnvoyException(s); });
} catch (...) {
ENVOY_LOG(critical, "Unknown exception in c-ares callback");
dispatcher_.post([] { throw EnvoyException("unknown"); });
}
}
if (owned_) {
delete this;
return;
}
}

if (!completed_ && fallback_if_failed_) {
fallback_if_failed_ = false;
getAddrInfo(AF_INET);
// Note: Nothing can follow this call to getAddrInfo due to deletion of this
// object upon synchronous resolution.
return;
}
}

void DnsResolverImpl::updateAresTimer() {
// Update the timeout for events.
timeval timeout;
timeval* timeout_result = ares_timeout(channel_, nullptr, &timeout);
if (timeout_result != nullptr) {
const auto ms =
std::chrono::milliseconds(timeout_result->tv_sec * 1000 + timeout_result->tv_usec / 1000);
ENVOY_LOG(trace, "Setting DNS resolution timer for {} milliseconds", ms.count());
timer_->enableTimer(ms);
} else {
timer_->disableTimer();
}
}

void DnsResolverImpl::onEventCallback(os_fd_t fd, uint32_t events) {
const ares_socket_t read_fd = events & Event::FileReadyType::Read ? fd : ARES_SOCKET_BAD;
const ares_socket_t write_fd = events & Event::FileReadyType::Write ? fd : ARES_SOCKET_BAD;
ares_process_fd(channel_, read_fd, write_fd);
updateAresTimer();
}

void DnsResolverImpl::onAresSocketStateChange(os_fd_t fd, int read, int write) {
updateAresTimer();
auto it = events_.find(fd);
// Stop tracking events for fd if no more state change events.
if (read == 0 && write == 0) {
if (it != events_.end()) {
events_.erase(it);
}
return;
}

// If we weren't tracking the fd before, create a new FileEvent.
if (it == events_.end()) {
events_[fd] = dispatcher_.createFileEvent(
fd, [this, fd](uint32_t events) { onEventCallback(fd, events); },
Event::FileTriggerType::Level, Event::FileReadyType::Read | Event::FileReadyType::Write);
}
events_[fd]->setEnabled((read ? Event::FileReadyType::Read : 0) |
(write ? Event::FileReadyType::Write : 0));
}

Network::ActiveDnsQuery* DnsResolverImpl::resolve(const std::string& dns_name,
Network::DnsLookupFamily dns_lookup_family,
ResolveCb callback) {
// TODO(hennna): Add DNS caching which will allow testing the edge case of a
// failed initial call to getAddrInfo followed by a synchronous IPv4
// resolution.

// @see DnsResolverImpl::PendingResolution::onAresGetAddrInfoCallback for why this is done.
if (dirty_channel_) {
ares_destroy(channel_);
AresOptions options = defaultAresOptions();
initializeChannel(&options.options_, options.optmask_);
}

std::unique_ptr<PendingResolution> pending_resolution(
new PendingResolution(*this, callback, dispatcher_, channel_, dns_name));
if (dns_lookup_family == Network::DnsLookupFamily::Auto) {
pending_resolution->fallback_if_failed_ = true;
}

if (dns_lookup_family == Network::DnsLookupFamily::V4Only) {
pending_resolution->getAddrInfo(AF_INET);
} else {
pending_resolution->getAddrInfo(AF_INET6);
}

if (pending_resolution->completed_) {
// Resolution does not need asynchronous behavior or network events. For
// example, localhost lookup.
return nullptr;
} else {
// Enable timer to wake us up if the request times out.
updateAresTimer();

// The PendingResolution will self-delete when the request completes (including if cancelled or
// if ~DnsResolverImpl() happens via ares_destroy() and subsequent handling of ARES_EDESTRUCTION
// in DnsResolverImpl::PendingResolution::onAresGetAddrInfoCallback()).
pending_resolution->owned_ = true;
return pending_resolution.release();
}
}

void DnsResolverImpl::PendingResolution::getAddrInfo(int family) {
struct ares_addrinfo_hints hints = {};
hints.ai_family = family;

/**
* ARES_AI_NOSORT result addresses will not be sorted and no connections to resolved addresses
* will be attempted
*/
hints.ai_flags = ARES_AI_NOSORT;

ares_getaddrinfo(
channel_, dns_name_.c_str(), /* service */ nullptr, &hints,
[](void* arg, int status, int timeouts, ares_addrinfo* addrinfo) {
static_cast<PendingResolution*>(arg)->onAresGetAddrInfoCallback(status, timeouts, addrinfo);
},
this);
}

} // namespace DnsFilter
} // namespace UdpFilters
} // namespace Extensions
} // namespace Envoy
Loading