listener: make reuse port the default #17259

Merged
mattklein123 merged 32 commits into main from reuse_port_enabled
Jul 20, 2021
Conversation

@mattklein123
Member

  1. Deprecate existing reuse_port field
  2. Add new enable_reuse_port field which uses a WKT
  3. Make the new default hot restart aware so the default is
    not changed during hot restart.
  4. Allow the default to be reverted using the
    "envoy.reloadable_features.listener_reuse_port_default_enabled"
    feature flag.
  5. Change listener init so that almost all error handling occurs on
    the main thread. This a) vastly simplifies error handling and
    b) makes it so that we pre-create all sockets on the main thread
    and can use them all during hot restart.
  6. Change hot restart to pass reuse port sockets by socket/worker
    index. This works around a race condition in which a draining
    listener has a new connection on its accept queue, but it's
    never accepted by the old process worker. It will be dropped.
    By passing all sockets (even reuse port sockets) we make sure
    the accept queue is fully processed.

Fixes #15794

Risk Level: High, scary stuff involving hot restart and listener init
Testing: New and existing tests. It was very hard to get the tests to pass which gives me more confidence.
Docs Changes: N/A
Release Notes: Added
Platform Specific Features: N/A
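The per-worker listening setup described in the steps above rests on the kernel's SO_REUSEPORT semantics: each worker binds its own socket to the same address, and the kernel load-balances incoming connections across the sockets' accept queues. A minimal standalone Python sketch of that mechanic (illustrative only; Envoy's implementation is C++ and these names are not Envoy's):

```python
import socket

def make_worker_listener(port: int, backlog: int = 128) -> socket.socket:
    # Each worker opens its own socket bound to the same address. With
    # SO_REUSEPORT set before bind(), the kernel accepts all of them and
    # spreads incoming connections across their accept queues.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", port))
    s.listen(backlog)
    return s

# "Worker 0" picks an ephemeral port; "worker 1" binds the same port,
# which would fail with EADDRINUSE if SO_REUSEPORT were not set on both.
worker0 = make_worker_listener(0)
chosen_port = worker0.getsockname()[1]
worker1 = make_worker_listener(chosen_port)
```

Pre-creating all of these sockets on the main thread (step 5) is what makes it possible to hand every one of them to the new process during hot restart (step 6).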

Signed-off-by: Matt Klein <mklein@lyft.com>
@repokitteh-read-only

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/envoy/.
envoyproxy/api-shepherds assignee is @adisuissa
CC @envoyproxy/api-watchers: FYI only for changes made to api/envoy/.

🐱

Caused by: #17259 was opened by mattklein123.


@mattklein123 mattklein123 assigned lambdai and ggreenway and unassigned adisuissa Jul 7, 2021
@mattklein123
Member Author

@ggreenway @lambdai This is a scary change, but ultimately I think it cleans up a lot of stuff. I could potentially split the listener init changes out from the reuse_port part of the change, but to be honest, it took me so long to get the tests passing that I would prefer not to, as I would have to redo listener_manager_impl_test, which would take a long time. Let's see what we think during review.

@mattklein123
Member Author

cc @danzh2010 for QUIC/UDP changes.

Member

@ggreenway ggreenway left a comment


Out of time for today; I'll continue tomorrow, but posting this here for now. (Note to self: got through listener_manager_impl.cc)

This is a huge improvement! I really like moving all this error handling to the main thread.

I'm curious how this will affect QUIC during hot-restart. Hot-restart doesn't work with QUIC right now AFAIK, but I'm curious if this will make it better, worse, or just different. I think in the old behavior, the child would create all new listeners, and that would probably result in the BPF routing all packets to the parent listeners (because in the kernel, they were created earlier, and we mod by the number of envoy workers). If we duplicate the FDs and send them to the child, probably half the packets will go the parent, and half to the child, but not based on the connection/tuple, just randomly. Not sure if this matters or not.

Have you tested hot-restart between the previous code and this? I think it'll work, but a manual test of that would be good.
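The "mod by the number of envoy workers" routing described above can be sketched in a few lines. This is an illustrative stand-in, not Envoy's actual BPF program: the kernel hashes the packet's connection tuple and selects a socket from the SO_REUSEPORT group by modulo, so a given tuple consistently lands on the same worker within one process generation.

```python
# Hypothetical sketch of SO_REUSEPORT group selection: hash the 4-tuple,
# then pick a socket index by modulo over the group size.
def pick_worker(src_ip: str, src_port: int,
                dst_ip: str, dst_port: int,
                num_sockets: int) -> int:
    return hash((src_ip, src_port, dst_ip, dst_port)) % num_sockets

# Same tuple -> same worker index, for the life of this process.
idx = pick_worker("10.0.0.1", 12345, "10.0.0.2", 443, 4)
```

The hot-restart concern follows directly: once sockets from two processes sit in the same group, the modulo is over the combined set, so packets split between parent and child rather than by connection tuple alone.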

/wait

@mattklein123
Member Author

Bringing some offline conversations to the forefront: it appears that on OSX, SO_REUSEPORT does not do what it does on Linux. It is a "last socket wins" option, such that the last socket to bind gets all the connections. This means that for Envoy's purposes, enable_reuse_port does not work on OSX and should probably be blocked. The issue with blocking it is that the tests (especially listener_manager_impl_test) will get really messy. I will see how messy it is to just block this on OSX and change the default there to false.
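A platform gate of the kind discussed here could be sketched as follows. This is a hypothetical guard, not Envoy's code; the function name and logic are illustrative:

```python
import platform
import socket

def reuse_port_usable() -> bool:
    # On Linux, SO_REUSEPORT load-balances accepted connections across
    # all sockets in the group. On macOS the last socket to bind
    # receives all connections ("last socket wins"), so the option is
    # not useful for spreading load across workers there.
    return platform.system() == "Linux" and hasattr(socket, "SO_REUSEPORT")
```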

Contributor

@danzh2010 danzh2010 left a comment


Sorry for the late review. This is a huge PR and a big behavior change. Would it be worth documenting how the listen socket is created, and sometimes duplicated and closed? Also, applying socket options and calling listen() before starting workers is a great idea!

@mattklein123
Member Author

Would it be worth documenting how the listen socket is created, and sometimes duplicated and closed?

Sure, I can do that. Do you have a location in mind, or should I write a markdown file in https://github.com/envoyproxy/envoy/tree/main/source/docs?

@rgs1
Member

rgs1 commented Jul 15, 2021

This should be ready for final review. @rgs1 is doing some smoke testing which should include hot restart from old to new.

I tried the following hot restarts on our latest fleet:

old -> new
new -> new

Nothing obvious breaks 🚀.

@danzh2010
Contributor

Would it be worth documenting how the listen socket is created, and sometimes duplicated and closed?

Sure, I can do that. Do you have a location in mind, or should I write a markdown file in https://github.com/envoyproxy/envoy/tree/main/source/docs?

That sounds like a reasonable location.

@mattklein123
Member Author

@danzh2010 updated and I also added the MD file. LMK what you think.

rgs1
rgs1 previously approved these changes Jul 16, 2021
Member

@rgs1 rgs1 left a comment


[ not reviewing the code although I did gloss over it to make sure switching concurrency level over hot restarts wouldn't be a problem – it isn't ]

We've been running this in prod and have done lots of hot restarts, all good thus far.

Member

@ggreenway ggreenway left a comment


This is great! Just a few doc/comment issues. I think the code is good to go.

Might be slightly clearer to have "reuse_port" instead of "reuse port" in comments/docs, but up to you if you want to make that change.

Are there any docs about what happens if concurrency is changed during hot-restart? My guess is that if concurrency is increased, everything works fine (more queues are created, all previously existing queues are reused). If concurrency is reduced, connections may be dropped.

@mattklein123
Member Author

@ggreenway updated, thanks for the reviews.

Member

@ggreenway ggreenway left a comment


Awesome!

@mattklein123 mattklein123 merged commit ba474ac into main Jul 20, 2021
@mattklein123 mattklein123 deleted the reuse_port_enabled branch July 20, 2021 02:03
@rgs1
Member

rgs1 commented Jul 29, 2021

cc: @fishcakez who found an apparent case where the default isn't kicking in after hot restart.

leyao-daily pushed a commit to leyao-daily/envoy that referenced this pull request Sep 30, 2021
wrowe pushed a commit that referenced this pull request Dec 14, 2021
The SO_REUSEPORT socket option can lead bazel tests to use the same port across parallel jobs.

This causes test flakiness in http_quic_integration_test with --runs_per_test > 1. 

To mitigate this issue, the integration test should run with enable_reuse_port explicitly false unless it needs concurrency_ > 1. 

For those tests with concurrency_ > 1, running with --jobs=1 would eliminate the flakiness.

Additional Description: We used to disable SO_REUSEPORT for tests whose concurrency_ = 1 in #9495. However, this was undone in #17259.

Signed-off-by: Dan Zhang <danzh@google.com>

Co-authored-by: Dan Zhang <danzh@google.com>
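The collision mode this commit describes can be demonstrated directly. This standalone Python sketch (not from Envoy's test suite) shows the flip side of SO_REUSEPORT: without the option, a second bind to an in-use port fails loudly with EADDRINUSE instead of silently sharing the port, which is what lets parallel test jobs detect a conflict.

```python
import socket

# First "test job" takes a port the normal way, without SO_REUSEPORT.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))
first.listen(1)
port = first.getsockname()[1]

# Second "test job" tries the same port. Without SO_REUSEPORT the bind
# raises OSError (EADDRINUSE); with it set on both sockets, the bind
# would succeed and the two jobs would silently share the port.
second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("127.0.0.1", port))
    bind_failed = False
except OSError:
    bind_failed = True
```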
joshperry pushed a commit to joshperry/envoy that referenced this pull request Feb 13, 2022


Successfully merging this pull request may close these issues.

listener: switch default for SO_REUSEPORT to true