Skip to content

connection: don't pass read data to filters when in delayed close.#6852

Merged
htuch merged 6 commits intoenvoyproxy:masterfrom
htuch:hcm-fix-delayed-close
May 14, 2019
Merged

connection: don't pass read data to filters when in delayed close.#6852
htuch merged 6 commits intoenvoyproxy:masterfrom
htuch:hcm-fix-delayed-close

Conversation

@htuch
Copy link
Member

@htuch htuch commented May 7, 2019

Fuzzing revealed that HCM is not happy when it think it has performed a
close(), e.g. due to CodecProtocolException, and new data arrives in
onData(). It basically tries to dispatch that to the codec, which might
be in reset state. This situation can arise when the connection (at the
network level) is in delayed close and it continues to dispatch to HCM.

The fix is to not forward data in ConnectionImpl::onRead(). This seems
the safest place to short circuit during delayed close, since L4 filters
(including HCM) should be able to rely on the fact that they don't
receive any data after a close().

Fixes oss-fuzz issue
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14516.

Risk level: Medium (one line change, sensitive area of code base)
Testing: Unit tests added.

Signed-off-by: Harvey Tuch htuch@google.com

Fuzzing revealed that HCM is not happy when it think it has performed a
close(), e.g. due to CodecProtocolException, and new data arrives in
onData(). It basically tries to dispatch that to the codec, which might
be in reset state. This situation can arise when the connection (at the
network level) is in delayed close and it continues to dispatch to HCM.

The fix is to not forward data in ConnectionImpl::onRead(). This seems
the safest place to short circuit during delayed close, since L4 filters
(including HCM) should be able to rely on the fact that they don't
receive any data after a close().

Fixes oss-fuzz issue
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14516.

Risk level: Medium (one line change, sensitive area of code base)
Testing: TBD

Signed-off-by: Harvey Tuch <htuch@google.com>
@htuch
Copy link
Member Author

htuch commented May 7, 2019

@AndresGuedez @alyssawilk @mattklein123 LMK if you think this is the right approach and I'll add some tests beyond the corpus entry. There's actually a bunch of bad stuff that happens in HCM if you pump in data after it thinks it has closed; I think other L4 filters could also suffer from this.

Copy link
Contributor

@AndresGuedez AndresGuedez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this. I suggested an alternate approach to your fix which avoids the extra conditional.

Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Harvey Tuch <htuch@google.com>
alyssawilk
alyssawilk previously approved these changes May 8, 2019
Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though optional (me being me) I'd love an addition to unit tests as well as the fuzz test :-)

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely looks like the right approach.

/wait

Signed-off-by: Harvey Tuch <htuch@google.com>
@htuch htuch force-pushed the hcm-fix-delayed-close branch from bdcc185 to 06d8363 Compare May 10, 2019 23:23
mattklein123
mattklein123 previously approved these changes May 13, 2019
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks like a solid fix. Would definitely like @AndresGuedez and @alyssawilk to take a look.

Copy link
Contributor

@AndresGuedez AndresGuedez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the extra test coverage!

A few general comments regarding the tests:

  1. I think it would be helpful to EXPECT_CALL(.., closeSocket(...)) since the socket's state explains why transport_socket_->doRead() does or does not trigger in the tests.
  2. transport_socket_->canFlushClose() is returning false which means that some corner cases regarding write buffers not being depleted after writes are not being exercised. Consider adding a test to cover that edge case.

alyssawilk
alyssawilk previously approved these changes May 13, 2019
Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending Andres' comments being addressed

@htuch htuch dismissed stale reviews from alyssawilk and mattklein123 via b849e01 May 13, 2019 20:52
Copy link
Contributor

@AndresGuedez AndresGuedez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@htuch htuch merged commit 1d104e5 into envoyproxy:master May 14, 2019
@htuch htuch deleted the hcm-fix-delayed-close branch May 14, 2019 01:02
duderino pushed a commit to istio/envoy that referenced this pull request Aug 13, 2019
…nvoyproxy#6852)

Fuzzing revealed that HCM is not happy when it think it has performed a
close(), e.g. due to CodecProtocolException, and new data arrives in
onData(). It basically tries to dispatch that to the codec, which might
be in reset state. This situation can arise when the connection (at the
network level) is in delayed close and it continues to dispatch to HCM.

The fix is to not forward data in ConnectionImpl::onRead(). This seems
the safest place to short circuit during delayed close, since L4 filters
(including HCM) should be able to rely on the fact that they don't
receive any data after a close().

Fixes oss-fuzz issue
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14516.

Risk level: Medium (one line change, sensitive area of code base)
Testing: Unit tests added.

Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants