Skip to content

network: reintroduce #4382 (delayed conn close) with segv fix#4587

Merged
mattklein123 merged 8 commits intoenvoyproxy:masterfrom
AndresGuedez:delayed-close-timeout-segv
Oct 4, 2018
Merged

network: reintroduce #4382 (delayed conn close) with segv fix#4587
mattklein123 merged 8 commits intoenvoyproxy:masterfrom
AndresGuedez:delayed-close-timeout-segv

Conversation

@AndresGuedez
Copy link
Contributor

Reintroduce PR #4382 (aa9478f) with a bug fix to prevent segfaults during connection tear down race conditions.

#4382 was reverted in PR #4581 (9d32e5c) due to detection of the crashing bug.

Risk Level: Medium (reintroduces medium risk, previously reverted PR)
Testing: Added unit tests to reproduce (previously) crashing states
Docs Changes: N/A
Release Notes: Added

Fixes #4583

Re-enable the changes reverted in 9d32e5c, which were originally
merged as part of envoyproxy#4382.

Signed-off-by: Andres Guedez <aguedez@google.com>
Fixes a segfault introduced in envoyproxy#4382 due to a connection tear down race condition when the delayed
close timer triggers after connection state has been reset via closeSocket().

Signed-off-by: Andres Guedez <aguedez@google.com>
@mattklein123 mattklein123 self-assigned this Oct 2, 2018
@AndresGuedez
Copy link
Contributor Author

asan/tsan failing due to an issue with one of the tests I added; I'm working on a fix.

Signed-off-by: Andres Guedez <aguedez@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing the bug!

// It's ok to disable even if the timer has already fired.
delayed_close_timer_->disableTimer();
}
disableDelayedCloseTimer();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the timer ever exist here? I don't think so? Can this just be an ASSERT that the timer is nullptr? If so we can remove the private function I think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timer disablement was here as a defensive measure (like the close() call after it is). However, now that closeSocket() is disabling the timer, I agree this is no longer needed and have added a check to the ASSERT().

}

// Test that tearing down the connection will disable the delayed close timer.
TEST_P(ConnectionImplTest, DelayedCloseTimeoutDisableOnSocketClose) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a big pain, but is it possible to switch all these new tests over to using InSequence? For tests like this IMO it makes the tests much stronger.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed and done.

Signed-off-by: Andres Guedez <aguedez@google.com>
Signed-off-by: Andres Guedez <aguedez@google.com>
@mattklein123
Copy link
Member

@AndresGuedez sorry looks like you need another master merge.

…eout-segv

Signed-off-by: Andres Guedez <aguedez@google.com>
Signed-off-by: Andres Guedez <aguedez@google.com>
@AndresGuedez
Copy link
Contributor Author

@AndresGuedez sorry looks like you need another master merge.

Done. Thanks!

Signed-off-by: Andres Guedez <aguedez@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@mattklein123 mattklein123 merged commit b16f529 into envoyproxy:master Oct 4, 2018
aa-stripe pushed a commit to aa-stripe/envoy that referenced this pull request Oct 11, 2018
…ix (envoyproxy#4587)

Re-enable the changes reverted in 9d32e5c, which were originally merged as part of envoyproxy#4382.

Signed-off-by: Andres Guedez <aguedez@google.com>
Signed-off-by: Aaltan Ahmad <aa@stripe.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Investigate crash introduced in #4382

2 participants