Fix iOS termination crash in ProvisionalDispatcher#2056
Conversation
When iOS is terminating a process, it may release some static memory, which could cause the `event_dispatcher_` variable to point to invalid memory, leading to a crash. A crash that occurs when an app is already in the process of being terminated isn't too bad, considering it's already shutting down, but this matters to some extent, considering that crash reporters will report this as an app crash, and the OS may choose to penalize your app by not proactively suggesting it. This should fix a crash with a backtrace such as this one: ``` EXC_BAD_ACCESS: Attempted to dereference garbage pointer 0x8000000000000010. 0 Envoy::Event::ProvisionalDispatcher::post(std::__1::function<void ()>) 1 Envoy::EngineHandle::runOnEngineDispatcher(long, std::__1::function<void (Envoy::Engine&)>) 2 start_stream 3 -[EnvoyHTTPStreamImpl initWithHandle:callbacks:explicitFlowControl:] 4 -[EnvoyEngineImpl startStreamWithCallbacks:explicitFlowControl:] 5 StreamPrototype.start(queue:) ``` Signed-off-by: JP Simard <jp@jpsim.com>
|
Caveat: I haven't been able to reproduce this crash, nor validate that this would actually fix it. I'm just making an educated guess based on the backtrace at the time of the crash. |
ProvisionalDispatcher
| if (drained_) { | ||
| event_dispatcher_->post(callback); | ||
| return ENVOY_SUCCESS; | ||
| if (event_dispatcher_) { |
There was a problem hiding this comment.
If this is for shutdown, when does this get set to null? I wouldn't expect the destructor to null this out
If this is post-destruction I think ideally you want to avoid calling post at all since we're in undefined behavior territory
There was a problem hiding this comment.
ok that makes sense, but how could I detect that we're post-destruction at the call site here?
There was a problem hiding this comment.
The C++ way is to never have handles to anything that is deallocated, since there's really no way to inspect something post-destruction to see if destruction has happened (this is UB if done naively). What I would probably want to understand is how we can lock the shared ptr but end up with a handle that is not guaranteed to be valid for the duration of this shared ptr. If we were calling terminateEngine I can see how this happens (we call terminate even though there might be external references to the engine), but I think in the cases we're seeing the issue we're not explicitly terminating? So not sure how this plays out in that case
There was a problem hiding this comment.
ok I'll think about this some more and re-open a new PR if I come up with a good way to handle this.
When iOS is terminating a process, it may release some static memory, which could cause the
event_dispatcher_variable to point to invalid memory, leading to a crash.A crash that occurs when an app is already in the process of being terminated isn't too bad, considering it's already shutting down, but this matters to some extent, considering that crash reporters will report this as an app crash, and the OS may choose to penalize your app by not proactively suggesting it.
This should fix a crash with a backtrace such as this one:
Signed-off-by: JP Simard jp@jpsim.com