Skip to content

Dns resolve fuzz bug#17107

Merged
ggreenway merged 7 commits intoenvoyproxy:mainfrom
chaoqin-li1123:dns_resolve_fuzz_bug
Aug 17, 2021
Merged

Dns resolve fuzz bug#17107
ggreenway merged 7 commits intoenvoyproxy:mainfrom
chaoqin-li1123:dns_resolve_fuzz_bug

Conversation

@chaoqin-li1123
Copy link
Contributor

@chaoqin-li1123 chaoqin-li1123 commented Jun 23, 2021

Commit Message: The DNS resolver takes reference to a local variable as its member dns_resolver_options_, and may access invalid data during a DNS refresh, fix by copy the parameter in ctor of DnsResolverImpl.(https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=35272)
Additional Description:NA
Risk Level:low
Testing:add fuzz corpus
Docs Changes:NA
Release Notes:NA
Platform Specific Features:NA
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Deprecated:]
[Optional API Considerations:]

chaoqin-li1123 added 4 commits June 23, 2021 03:54
Signed-off-by: chaoqin-li1123 <chaoqinli@google.com>
Signed-off-by: chaoqin-li1123 <chaoqinli@google.com>
Signed-off-by: chaoqin-li1123 <chaoqinli@google.com>
Signed-off-by: chaoqin-li1123 <chaoqinli@google.com>
@asraa
Copy link
Contributor

asraa commented Jun 23, 2021

Could you please add the link to the bug in Fixes description?

Copy link
Contributor

@asraa asraa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an easy way to test the refresh?

@chaoqin-li1123
Copy link
Contributor Author

I don't know how to reproduce dns refresh in an easier way, we do have a few related fuzz bugs(35136, 34907) caused by this issue tho.

@lizan lizan added the waiting label Jun 30, 2021
@chaoqin-li1123
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #17107 (comment) was created by @chaoqin-li1123.

see: more, trace.

@github-actions
Copy link

github-actions bot commented Aug 8, 2021

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions bot added the stale stalebot believes this issue/PR has not been touched recently label Aug 8, 2021
@chaoqin-li1123
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #17107 (comment) was created by @chaoqin-li1123.

see: more, trace.

@github-actions github-actions bot removed the stale stalebot believes this issue/PR has not been touched recently label Aug 11, 2021
asraa
asraa previously approved these changes Aug 11, 2021
Copy link
Contributor

@asraa asraa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ggreenway can you please take a final pass? (or if you have any ideas for testing/or if this makes sense to you)

Copy link
Member

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that an integration test with a dns cluster, pointed at a non-existent tcp resolver, will reproduce the issue in an ASAN build. It looks like currently the options are usually on the stack of the caller. They are used the first time in DnsResolverImpl::DnsResolverImpl while the stack-based config is still in scope, but then they can be used again if (status == ARES_ECONNREFUSED). Can you try to add an integration test that fails without your fix?

/wait

@ggreenway ggreenway self-assigned this Aug 11, 2021
Signed-off-by: chaoqin-li1123 <chaoqinli@google.com>
@chaoqin-li1123
Copy link
Contributor Author

I think that an integration test with a dns cluster, pointed at a non-existent tcp resolver, will reproduce the issue in an ASAN build. It looks like currently the options are usually on the stack of the caller. They are used the first time in DnsResolverImpl::DnsResolverImpl while the stack-based config is still in scope, but then they can be used again if (status == ARES_ECONNREFUSED). Can you try to add an integration test that fails without your fix?

/wait

we can make the dns_resolver_options in DnsImplTest a variable on stack to create a regression asan test.

Copy link
Member

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/wait

Network::Test::getCanonicalLoopbackAddress(GetParam()));
listener_ = dispatcher_->createListener(socket_, *server_, true);
updateDnsResolverOptions();
envoy::config::core::v3::DnsResolverOptions dns_resolver_options = dns_resolver_options_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment here explaining what we're testing by putting this on the stack?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense.

Signed-off-by: chaoqin-li1123 <chaoqinli@google.com>
Copy link
Member

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@ggreenway ggreenway merged commit a7de9b8 into envoyproxy:main Aug 17, 2021
yanjunxiang-google added a commit to yanjunxiang-google/envoy that referenced this pull request Aug 17, 2021
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
mpuncel added a commit to mpuncel/envoy that referenced this pull request Aug 19, 2021
* main:
  Fix for fuzz tests failing due to invalid corpus paths (envoyproxy#17767)
  kafka: fix integration test (envoyproxy#17764)
  Fix typo in cluster.proto (envoyproxy#17755)
  cluster manager: add drainConnections() API (envoyproxy#17747)
  kafka broker filter: move to contrib (envoyproxy#17750)
  quiche: switch external dependency to github (envoyproxy#17732)
  quiche: implement stream idle timeout in codec (envoyproxy#17674)
  Update c-ares to 1.17.2 (envoyproxy#17704)
  Fix dns resolve fuzz bug (envoyproxy#17107)
  Remove members that shadow members of the base class (envoyproxy#17713)
  thrift proxy: missing parts from the previous PR (envoyproxy#17668)
  thrift-proxy: cleanup ConnectionManager::ActiveRpc (envoyproxy#17734)
  listener: extra warning for deprecated use_proxy_proto field (envoyproxy#17736)
  kafka: add support for metadata request in mesh-filter (envoyproxy#17597)
  upstream: add all host map to priority set for fast host searching (envoyproxy#17290)
  Remove the support for `hidden_envoy_deprecated_per_filter_config` (envoyproxy#17725)
  tls: SDS support for private key providers (envoyproxy#16737)
  bazel: update rules_foreign_cc (envoyproxy#17445)

Signed-off-by: Michael Puncel <mpuncel@squareup.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants