Skip to content

listener: clean up accept filter before creating connection#8922

Merged
mattklein123 merged 5 commits intoenvoyproxy:masterfrom
lambdai:ADfix
Nov 8, 2019
Merged

listener: clean up accept filter before creating connection#8922
mattklein123 merged 5 commits intoenvoyproxy:masterfrom
lambdai:ADfix

Conversation

@lambdai
Copy link
Contributor

@lambdai lambdai commented Nov 7, 2019

Signed-off-by: Yuchen Dai silentdai@gmail.com

Description:
When listener continue to accept connection on listener filter timeout, the OS socket fd could be owned by two separate FileEventImpl.
The destructing FileEventImpl cannot recover the intention of the alive FileEventImpl.

The observation is that EdgeTrigger flag is cleared and the fd will be LevelTriggered and the EPOLLOUT will be activated in each epoll_wait
immediately after the invoke.

Envoy worker thread owning that connecting will be consuming 1 cpu core in the repeated epoll_wait.

This PR is maintaining the immutability that at most one FileEventImpl could owns the fd so the assigned event always match the expectation
of FileEventImpl.

Before this PR

 # registered by tls_inspector
1573028637.322859 epoll_ctl(14, EPOLL_CTL_ADD, 26, {EPOLLIN|EPOLLRDHUP|EPOLLET, {u32=26, u64=26}}) = 0 
# registered by HCM
1573028637.419820 epoll_ctl(14, EPOLL_CTL_MOD, 26, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=26, u64=26}}) = 0         
# tls_inspector clean up, EPOLLET is cleared somehow
1573028637.420488 epoll_ctl(14, EPOLL_CTL_MOD, 26, {EPOLLIN|EPOLLOUT, {u32=26, u64=26}}) = 0                                                   

#repeated immediate return of epoll_wait, no write since the http write buffer is empty at the beginning
1573028637.420537 epoll_wait(14, [{EPOLLOUT, {u32=26, u64=26}}], 32, 24) = 1

With this PR

# registered by tls_inspector
1573030976.461595 epoll_ctl(14, EPOLL_CTL_ADD, 26, {EPOLLIN|EPOLLRDHUP|EPOLLET, {u32=26, u64=26}}) = 0 
# tls_inspector clean up  
1573030976.562546 epoll_ctl(14, EPOLL_CTL_DEL, 26, 0x7f1d914cd0a0) = 0                                                                                          
# registered by HCM, EPOLLET show up and leave there forever
1573030976.562986 epoll_ctl(14, EPOLL_CTL_ADD, 26, {EPOLLIN|EPOLLOUT|EPOLLET, {u32=26, u64=26}}) = 0                       

The proof that EPOLLIN|EPOLLOUT w/o EPOLLET is registered due to ~FileEventImpl
Below fd 21 op 3 event_uaddr 5 is recognized as epoll_ctl( ..., EPOLL_MOD, 21, { EPOLLIN|EPOLLOUT, {...}}) where EPOLLET (1 << 31) is missing.
fd 21 is the English spelling of 26 since I capture the logs in different run.

fd 21 op 3 event_uaddr 5 0x7f443e11d24a : epoll_ctl 0xa/0x30 [/lib/x86_64-linux-gnu/libc-2.27.so]
 0x562790a619b2 : epoll_apply_one_change 0x172/0x570 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x562790a61356 : epoll_nochangelist_del 0xa6/0xb0 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x562790a53b1c : evmap_io_del_ 0x34c/0x3e0 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x562790a48d15 : event_del_nolock_ 0x2d5/0x490 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x562790a4e60c : event_del_ 0xac/0x100 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x562790a4cc2a : event_del 0x1a/0x20 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x56279038c145 : Envoy::Event::ImplBase::~ImplBase() 0x15/0x30 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x562790366047 : Envoy::Event::FileEventImpl::~FileEventImpl() 0x47/0x60 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x56279036607c : Envoy::Event::FileEventImpl::~FileEventImpl() 0x1c/0x30 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x56278f7b8a5f : std::default_delete<Envoy::Event::FileEvent>::operator()(Envoy::Event::FileEvent*) const 0x2f/0x40 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x56278f7b83a3 : std::unique_ptr<Envoy::Event::FileEvent, std::default_delete<Envoy::Event::FileEvent> >::~unique_ptr() 0x53/0x80 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x56278f7d0187 : Envoy::Extensions::ListenerFilters::TlsInspector::Filter::~Filter() 0x47/0x70 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x56278f7d01cc : Envoy::Extensions::ListenerFilters::TlsInspector::Filter::~Filter() 0x1c/0x30 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x56278f7b23bf : std::default_delete<Envoy::Network::ListenerFilter>::operator()(Envoy::Network::ListenerFilter*) const 0x2f/0x40 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x56278f7b1cd3 : std::unique_ptr<Envoy::Network::ListenerFilter, std::default_delete<Envoy::Network::ListenerFilter> >::~unique_ptr() 0x53/0x80 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x562790350879 : void __gnu_cxx::new_allocator<std::_List_node<std::unique_ptr<Envoy::Network::ListenerFilter, std::default_delete<Envoy::Network::ListenerFilter> > > >::destroy<std::unique_ptr<Envoy::Network::ListenerFilter, std::default_delete<Envoy::Network::ListenerFilter> > >(std::unique_ptr<Envoy::Network::ListenerFilter, std::default_delete<Envoy::Network::ListenerFilter> >*) 0x19/0x20 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x5627903507d0 : void std::allocator_traits<std::allocator<std::_List_node<std::unique_ptr<Envoy::Network::ListenerFilter, std::default_delete<Envoy::Network::ListenerFilter> > > > >::destroy<std::unique_ptr<Envoy::Network::ListenerFilter, std::default_delete<Envoy::Network::ListenerFilter> > >(std::allocator<std::_List_node<std::unique_ptr<Envoy::Network::ListenerFilter, std::default_delete<Envoy::Network::ListenerFilter> > > >&, std::unique_ptr<Envoy::Network::ListenerFilter, std::default_delete<Envoy::Network::ListenerFilter> >*) 0x20/0x30 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x56279035072e : std::__cxx11::_List_base<std::unique_ptr<Envoy::Network::ListenerFilter, std::default_delete<Envoy::Network::ListenerFilter> >, std::allocator<std::unique_ptr<Envoy::Network::ListenerFilter, std::default_delete<Envoy::Network::ListenerFilter> > > >::_M_clear() 0x6e/0xa0 [/home/lambdai/workspace/tls/envoymaster1106.dbg]
 0x5627903505cf : std::__cxx11::list<std::unique_ptr<Envoy::Network::ListenerFilter, std::default_delete<Envoy::Network::ListenerFilter> >, std::allocator<std::unique_ptr<Envoy::Network::ListenerFilter, std::default_delete<Envoy::Network::ListenerFilter> > > >::clear() 0x1f/0x40 [/home/lambdai/workspace/tls/envoymaster1106.dbg]

Risk Level:
Testing:
Docs Changes:
Release Notes:
Fix istio/istio#18229

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for digging into this. I realize the urgency around this so a TODO/issue is fine for follow up, but optimally we would have some integration test that covers this, or at minimum some unit test. Can you try to add something to this PR or if not TODO that also? Thanks.

/wait

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
MockListenerFilter::MockListenerFilter() = default;
MockListenerFilter::~MockListenerFilter() = default;
MockListenerFilter::MockListenerFilter() {
// ON_CALL(*this, die_()).WillByDefault(testing::Invoke([](){}));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really know why ON_CALL + NiceMock reports uninterested call.
I add explicit EXPECT_CALL in each usage. That's a tech debt.
I will investigate later

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't work in the destructor of a mock. We do this all over the place. Please delete this comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sad...During the constructor or destructor of MockFoo, the mock object is not nice or strict

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with small nits, thanks.

/wait

MockListenerFilter::MockListenerFilter() = default;
MockListenerFilter::~MockListenerFilter() = default;
MockListenerFilter::MockListenerFilter() {
// ON_CALL(*this, die_()).WillByDefault(testing::Invoke([](){}));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't work in the destructor of a mock. We do this all over the place. Please delete this comment.

MockListenerFilter();
~MockListenerFilter() override;

MOCK_METHOD0(die_, void());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/die_/destroy which is what we typically name this method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating...
Looks like destroy_ w/ underscore is preferred since it is added in MockClass

@lambdai
Copy link
Contributor Author

lambdai commented Nov 7, 2019

Thanks! Running test locally. Update in 5 minutes

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks

@mattklein123 mattklein123 merged commit c8de199 into envoyproxy:master Nov 8, 2019
lizan pushed a commit that referenced this pull request Nov 9, 2019
Signed-off-by: Yuchen Dai <silentdai@gmail.com>

Signed-off-by: Lizan Zhou <lizan@tetrate.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Istio Sidecar consuming high CPU Istio 1.30-1.3.3

4 participants