Skip to content

socket: IP_FREEBIND support for listeners and upstream connections.#2922

Merged
htuch merged 1 commit intoenvoyproxy:masterfrom
htuch:ip-freebind
Mar 29, 2018
Merged

socket: IP_FREEBIND support for listeners and upstream connections.#2922
htuch merged 1 commit intoenvoyproxy:masterfrom
htuch:ip-freebind

Conversation

@htuch
Copy link
Copy Markdown
Member

@htuch htuch commented Mar 28, 2018

This patch introduces support for setting IP_FREEBIND on both listener sockets and upstream
connection sockets prior to binding. This enables the use of IP addresses that are not currently
bound to the NIC for listening and initiating connections from. This is useful in environments with
virtualized networking.

There's also some related work on SocketOption that continues from #2734, which was needed to enable
this to work cleanly.

Risk Level: Low (no change unless enabled).
Testing: Unit tests for ListenerManager, ClusterManager and SocketOptionImpl. Manual end-to-end
validation with steps described in configs/freebind/README.md.
API Changes: envoyproxy/data-plane-api#536

Fixes #528.

Signed-off-by: Harvey Tuch htuch@google.com

@htuch
Copy link
Copy Markdown
Member Author

htuch commented Mar 28, 2018

@jrajahalme @rlenglet this is the SocketOption work I've been working on, we should figure out how to reconcile with #2719, which I based some parts of this on.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick drive by: Can we test this config (just that it loads) in our config tests? Along with the original_dst config? From a quick look it doesn't seem like they are being tested for config sanity.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll take this as a follow up action item, I have a WiP to fix this, but it's a bit complicated because we're using a MockListenerComponentFactory, which is not compatible with socket options. I'd like to unblock #2719.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK SGTM.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 - let's avoid text config because when it stops working we won't be alerted but a TODO is fine.

I think we do need to be able to test socket options, esp with all the other PRs in flight

htuch added a commit to htuch/envoy-api that referenced this pull request Mar 28, 2018
Corresponding PR envoyproxy/envoy#2922.

Signed-off-by: Harvey Tuch <htuch@google.com>
Copy link
Copy Markdown
Contributor

@rlenglet rlenglet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SocketOptionImpl design LGTM.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm, when adding support for another option, we'll only need to modify this library here and 2 subclasses (UpstreamSocketOption, and ListenerSocketOption). Is that correct?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. You only need to modify here if it's a shared option, and specifically in the subclasses if it's listener or upstream only. I haven't fully plumbed the hash stuff, I'll leave that for you folks.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also move the SOL_SOCKET, SO_REUSEADDR option from TcpListenSocket::TcpListenSocket, so that all socket options are set in this one library.
This will require either defining a new setTcpSocketOption method or passing setsockopt's level as a parameter to use SOL_SOCKET.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will leave a TODO, I think this is orthogonal and that setsockopt doesn't cause any #ifdef tangle. So, in the interest of unblocking #2719 will skip for now.

Copy link
Copy Markdown
Contributor

@rlenglet rlenglet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ggreenway
Copy link
Copy Markdown
Member

@rlenglet We're trying very hard to not add any more v1 config changes, as it is deprecated.

Copy link
Copy Markdown
Member

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know some of this work is to make it easier for the other socket options in the queue to be added. Can you give a brief overview of what those additions will look like? Specifically, I'm not understanding whether SocketOptionImpl is going to be extended with other options, or whether those should go in separate classes.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you comment on why this needs to be a shared_ptr (not unique)? I skimmed the code but didn't see where/why they need to be shared.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, why dos this need to be shared?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now an immutable object which benefits from shared const ownership; we do this already for address instances and some other cases. unique_ptr was problematic given how the OptionPtr vector is combined in https://github.com/envoyproxy/envoy/pull/2922/files#diff-cd280606785949980237b2a7f08b7885R95, you would need some way to copy/clone the objects across the lists otherwise.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see where you need the shared_ptr in the vector, the reason I didn't like was a shared_ptr of shared_ptr vector OptionsSharedPtr, if it is inevitable I'm fine.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a shared_ptr, this probably shouldn't be an rvalue.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, this was global search/replace snafu.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably already saw but this issue is in a bunch of other places.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be an optional, to match the type in data-plane-api?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I went back-and-forth on whether we should push the optional logic to the client or put it here. I think given my understanding of how SocketOptions are intended to be combined, we probably should make it optional here, so I'll make this change.

@rlenglet
Copy link
Copy Markdown
Contributor

If there's more work to be done here, could we try to get #2719 merged first?

@htuch
Copy link
Copy Markdown
Member Author

htuch commented Mar 29, 2018

@rlenglet I think in order to get through review for #2719 we'd need to fixup a bunch of stuff that is in this PR. I don't think there's more work to be done in this specific PR; I have two followups that I'm going to take to future PRs: (1) address @mattklein123 ask for freebind.yaml to be config sanity checked in tests and (2) think about how to test outbound socket freebind with packets flowing (I can manually validate the bind/bind fail there).

So, we should be able to move quickly here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean for this to be commented out?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it relates to my TODO on figuring out how to test this working with packets flowing end-to-end. I was mucking around with iptables today, but it seems that basic NATing isn't sufficient. I'll add an explicit TODO there as well.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is redundant, it is already met in line 94.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that was some forest-and-trees code, will remove.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see where you need the shared_ptr in the vector, the reason I didn't like was a shared_ptr of shared_ptr vector OptionsSharedPtr, if it is inevitable I'm fine.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When do we not have a localAddress() here? Can you add some more comments? (Just wondering why we can't just check if this is an IP socket and just return if not -- why do we need to try to get the address from the fd)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a bummer. I would probably add a TODO here to look into cleaning this up. Optimally, localAddress() would be available unconditionally before we call this code so we can avoid this exception stuff.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if it's possible to clean this up; we only know the full localAddress() after we have done the bind/connect AFAIK, since that determines the local port for outgoing connections. I'll add a TODO to provide the IP version earlier, which we can do.

@htuch
Copy link
Copy Markdown
Member Author

htuch commented Mar 29, 2018

@ggreenway the context is #2719, that's where additional options start to appear (specifically on the Listener side). I actually based this on #2719 patch and then added IP_FREEBIND, generalized/refactored, and then removed the IP_TRANSPARENT bits.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a "listener upstream option?" Clarify?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a copy-paste snafu, fixed.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this should probably be const OptionConstSharedPtr& here and elsewhere similar.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a non-actionable thought/comment: but I do feel like this series of changes is increasingly putting more UNIX/Linux stuff directly in the codebase and I wish we were thinking a bit more about x-platform. I would not worry about it now but just raising as food for thought.

Copy link
Copy Markdown
Member Author

@htuch htuch Mar 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR deals reasonably with the POSIXy side of the world. I agree that in general though, we would benefit from having something like source/platform that moves these specifics out of the Envoy core. It seems inevitable that any sufficiently advanced proxy will have features that are deeply tied to specific kernel features, so we can't be too abstract.

What are the other non-POSIXy platforms we care about besides Windows?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realistically I think only Windows right now. (I agree this is a very nice abstraction around POSIX socket options that might not exist on the host system.)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for platform work early while it's not too painful.
Harvey: consider pinging Randy when he joins the team next week - I wonder if we can steal best practices from the chrome network stack...

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool stuff. Generally LGTM at a high level. Left some drive by comments.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: converting from bool to int, I'd prefer "freebind_.value() ? 1 : 0". But feel free to disagree with my style preference.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: converting from bool to int, I'd prefer "freebind_.value() ? 1 : 0". But feel free to disagree with my style preference.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also naming: should_freebind? It makes the SetSocketOption() more clear

ggreenway
ggreenway previously approved these changes Mar 29, 2018
Copy link
Copy Markdown
Member

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from one minor nit (that you can choose to ignore if you want), this LGTM

Copy link
Copy Markdown
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for driving this through!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 - let's avoid text config because when it stops working we won't be alerted but a TODO is fine.

I think we do need to be able to test socket options, esp with all the other PRs in flight

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also naming: should_freebind? It makes the SetSocketOption() more clear

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> we either use the IPv6 variant if configured, otherwise we use the IPv6 variant.

We don't really try both.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I strongly encourage LOGs for each failure mode. "this failed" is less useful than "you're using an IPv4 socket and forgot and configured an IPv6 only option"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for platform work early while it's not too painful.
Harvey: consider pinging Randy when he joins the team next week - I wonder if we can steal best practices from the chrome network stack...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional: I'm a big fan of per test comments

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

allow mocking system call failure?

alyssawilk
alyssawilk previously approved these changes Mar 29, 2018
Copy link
Copy Markdown
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if invalid is the correct word in this case :-)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

invalid or unspecified?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsupported :D

This patch introduces support for setting IP_FREEBIND on both listener sockets and upstream
connection sockets prior to binding. This enables the use of IP addresses that are not currently
bound to the NIC for listening and initiating connections from. This is useful in environments with
virtualized networking.

There's also some related work on SocketOption that continues from envoyproxy#2734, which was needed to enable
this to work cleanly.

Risk Level: Low (no change unless enabled).
Testing: Unit tests for ListenerManager, ClusterManager and SocketOptionImpl. Manual end-to-end
validation with steps described in configs/freebind/README.md.
API Changes: envoyproxy/data-plane-api#536

Fixes envoyproxy#528.

Signed-off-by: Harvey Tuch <htuch@google.com>
@htuch htuch merged commit 725457c into envoyproxy:master Mar 29, 2018
@htuch htuch deleted the ip-freebind branch March 29, 2018 19:53
mattklein123 pushed a commit to envoyproxy/data-plane-api that referenced this pull request Mar 29, 2018
Corresponding PR envoyproxy/envoy#2922.

Signed-off-by: Harvey Tuch <htuch@google.com>
Elite1015 pushed a commit to Elite1015/data-plane-api that referenced this pull request Feb 23, 2025
Corresponding PR envoyproxy/envoy#2922.

Signed-off-by: Harvey Tuch <htuch@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants