event: update fd registration mask even if it hasn't changed.#16389
wrowe merged 7 commits into envoyproxy:main
Updates to the fd mask can result in new events when operating in EDGE trigger mode. Doing this update unconditionally is especially important when a synthetic event was scheduled and setEnabled ends up clearing it before the call to updateEvents: skipping the update prevents the generation of a new real event to replace the lost synthetic event. Without this change, calling close(Flush) on a socket that is readDisabled and has a pending synthetic write event can result in the write event never being delivered, so the final flush fails due to a timeout.
Signed-off-by: Antonio Vicente <avd@google.com>
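For readers unfamiliar with the edge-trigger behavior the description relies on, here is a minimal sketch of it. It is an illustration under assumptions, not Envoy code: Envoy registers fds through libevent, while this sketch talks to Linux epoll directly, and `EdgeDemoResult` and `runEdgeTriggerDemo` are hypothetical names. It registers an always-writable socket with `EPOLLOUT | EPOLLET`, polls twice, then re-registers the fd with the same mask and polls again.

```cpp
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>

// Number of ready events reported at three points in time.
struct EdgeDemoResult {
  int first_wait;       // right after the initial EPOLL_CTL_ADD
  int second_wait;      // no state change and no re-registration
  int after_reregister; // after EPOLL_CTL_MOD with the same mask
};

// Hypothetical helper (Linux only): shows that re-registering an fd in edge
// trigger mode makes the kernel re-evaluate readiness and report it again.
EdgeDemoResult runEdgeTriggerDemo() {
  int fds[2];
  int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
  assert(rc == 0);
  int epfd = epoll_create1(0);
  assert(epfd >= 0);

  // A freshly created socket has free send buffer space, so it is writable.
  epoll_event ev{};
  ev.events = EPOLLOUT | EPOLLET;
  ev.data.fd = fds[0];
  rc = epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev);
  assert(rc == 0);

  epoll_event out{};
  EdgeDemoResult result{};
  // Timeout 0 means "poll and return immediately".
  result.first_wait = epoll_wait(epfd, &out, 1, 0);
  // The edge was consumed above and nothing changed on the socket.
  result.second_wait = epoll_wait(epfd, &out, 1, 0);

  // Re-register with an unchanged mask.
  rc = epoll_ctl(epfd, EPOLL_CTL_MOD, fds[0], &ev);
  assert(rc == 0);
  result.after_reregister = epoll_wait(epfd, &out, 1, 0);

  (void)rc;
  close(epfd);
  close(fds[0]);
  close(fds[1]);
  return result;
}
```

On a Linux host, the first poll is expected to report one event, the second none, and the poll after `EPOLL_CTL_MOD` one again, even though the mask did not change. That extra event is what is lost when an "unchanged mask" update is skipped.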
/assign-from @envoyproxy/first-pass-reviewers
@envoyproxy/first-pass-reviewers assignee is @dio
I have this PR on my radar, but I haven't had time to understand the edge case exactly, which is why I haven't commented on it yet.
davinci26 left a comment
Thanks for fixing a Windows issue!
Windows works fine AFAIK; it's Linux that has the issues. Thanks for the review!
/assign-from @envoyproxy/senior-maintainers
@envoyproxy/senior-maintainers assignee is @alyssawilk
alyssawilk left a comment
Thanks for the thorough explanation (and sorry for review delay - second shot hit me hard)
LGTM modulo some explanatory comments :-)
    if (trigger_ == FileTriggerType::EmulatedEdge) {
      auto new_event_mask = enabled_events_ & ~event;
      updateEvents(new_event_mask);
      if (new_event_mask != enabled_events_) {
comment here and below why this case doesn't need the update?
Writing the comment for updateEvents made me rethink what we should do here. I reverted these changes and instead added a trigger_mode_ check to updateEvents so the update is skipped in modes where it is truly a no-op.
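A rough sketch of what such a check could look like, for readers following along. This is an assumption-laden illustration, not the merged code: `SketchFileEvent` and `registrationUpdates` are hypothetical, the member is called `trigger_` as in the quoted diff, and a counter stands in for the real re-registration with the event loop.

```cpp
#include <cassert>
#include <cstdint>

enum class FileTriggerType { Level, Edge, EmulatedEdge };

// Hypothetical stand-in for a file event; not Envoy's FileEventImpl.
class SketchFileEvent {
public:
  SketchFileEvent(FileTriggerType trigger, uint32_t events)
      : trigger_(trigger), enabled_events_(events) {}

  void updateEvents(uint32_t events) {
    // With Level and EmulatedEdge triggers, re-registering an unchanged mask
    // has no observable effect, so it can be skipped. With Edge triggers the
    // re-registration forces readiness to be recomputed and may produce a new
    // event, so it must always happen.
    if (events == enabled_events_ && trigger_ != FileTriggerType::Edge) {
      return;
    }
    enabled_events_ = events;
    ++registration_updates_; // stands in for the real fd re-registration
  }

  int registrationUpdates() const { return registration_updates_; }

private:
  const FileTriggerType trigger_;
  uint32_t enabled_events_;
  int registration_updates_{0};
};
```

In this sketch an Edge-triggered event re-registers on every call, while Level and EmulatedEdge events only re-register when the mask actually changes.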
    @@ -85,9 +85,6 @@ void FileEventImpl::assignEvents(uint32_t events, event_base* base) {

    void FileEventImpl::updateEvents(uint32_t events) {
      ASSERT(dispatcher_.isThreadSafe());
I'd like either a comment here on why this is important, or a comment in the test calling out that it's regression testing [info from PR description] just to ensure no clever person decides to improve perf by undoing this PR :-P
Comment added. Thanks for asking for further info in the code, there's a lot of subtle behavior that can be accidentally missed.
Signed-off-by: Antonio Vicente <avd@google.com>
I know the feeling. The second shot is rough but so worth it.
Coverage flaked. I'll kick off another run, but feel free to merge once CI is happy.
The coverage failure is real; I'll fix it by adding a test case for Level trigger events.
/wait
Signed-off-by: Antonio Vicente <avd@google.com>
//test/integration:integration_test is a known flake in CI. I'm investigating; don't let that stop you if the rest of CI passes.
The issues I see right now are a bazel RPC failure while building ASAN and a TSAN failure related to the issue I'm trying to address in #16590.
/retest
Retrying Azure Pipelines:
…roxy#16389)
* event: update fd registration mask even if it hasn't changed. Updates to the fd mask can result in new events when operating in EDGE trigger mode. Doing this update unconditionally is especially important when a synthetic event was scheduled and setEnabled ends up clearing it before the call to updateEvents: skipping the update prevents the generation of a new real event to replace the lost synthetic event. Without this change, calling close(Flush) on a socket that is readDisabled and has a pending synthetic write event can result in the write event never being delivered, so the final flush fails due to a timeout.
* Remove call to dispatcher exit that is not needed
Signed-off-by: Antonio Vicente <avd@google.com>
Commit Message:
event: update fd registration mask even if it hasn't changed.
Updates to the fd mask can result in new events when operating in EDGE trigger mode.
Doing this update unconditionally is especially important when a synthetic event was scheduled
and setEnabled ends up clearing it before the call to updateEvents. Skipping the update prevents the
generation of a new real event to replace the lost synthetic event.
Without this change, calling close(Flush) on a socket that is readDisabled and has a pending synthetic write event
can result in the write event never being delivered, so the final flush fails due to a timeout.
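The failure sequence in the commit message can be walked through with a toy model. Everything here is hypothetical and heavily simplified: `ToyEdgeKernel`, `ToyFileEvent`, and `eventsAfterMaskUpdate` are illustrative names, the "kernel" only reports readiness when the fd is re-registered, and the close(Flush) path is assumed to end in a setEnabled call whose mask equals the current one. It is not Envoy's implementation.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kWrite = 0x2;

// Toy edge-triggered "kernel": the socket stays writable the whole time, so
// no new edge occurs on its own; readiness is only reported on re-registration.
struct ToyEdgeKernel {
  uint32_t ready = kWrite;
  uint32_t queued = 0; // real events waiting to be delivered
  void reregister(uint32_t mask) { queued |= (ready & mask); }
};

class ToyFileEvent {
public:
  // Assumes the initial registration's edge was already consumed.
  ToyFileEvent(ToyEdgeKernel& kernel, uint32_t events, bool skip_unchanged)
      : kernel_(kernel), enabled_events_(events), skip_unchanged_(skip_unchanged) {}

  // Schedule a synthetic event.
  void activate(uint32_t events) { synthetic_ |= events; }

  void setEnabled(uint32_t events) {
    synthetic_ = 0; // pending synthetic events are cleared on mask updates
    updateEvents(events);
  }

  // Events that would still reach the callback.
  uint32_t deliverable() const { return synthetic_ | kernel_.queued; }

private:
  void updateEvents(uint32_t events) {
    if (skip_unchanged_ && events == enabled_events_) {
      return; // the optimization that loses the event in edge mode
    }
    enabled_events_ = events;
    kernel_.reregister(events);
  }

  ToyEdgeKernel& kernel_;
  uint32_t enabled_events_;
  const bool skip_unchanged_;
  uint32_t synthetic_{0};
};

// Read disabled, write enabled, pending synthetic write, then a mask update
// that does not change the mask.
uint32_t eventsAfterMaskUpdate(bool skip_unchanged) {
  ToyEdgeKernel kernel;
  ToyFileEvent file_event(kernel, kWrite, skip_unchanged);
  file_event.activate(kWrite);
  file_event.setEnabled(kWrite);
  return file_event.deliverable();
}
```

In this model, `eventsAfterMaskUpdate(true)` leaves nothing to deliver, which corresponds to the flush timing out, while `eventsAfterMaskUpdate(false)` yields a write event because the unconditional re-registration replaces the cleared synthetic one.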
Additional Description:
The issue was introduced by this optimization request from me, which works for EmulatedEdge but not for Edge-triggered sockets: https://github.com/envoyproxy/envoy/pull/13787/files#r520315847
I don't know if there are current situations where this bug would trigger in the proxy. I ran into this while trying to change the order of operations when generating HTTP/1.0 responses framed by connection close; before my prototype changes there was a call to readDisable(false) before Connection::close(Flush), which is why the existing tests passed.
Risk Level: low
Testing: unit
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a