Skip to content

Add ALPN/HTTP2 support to Connect #1485

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
seanmonstar opened this issue Apr 13, 2018 · 15 comments
Closed

Add ALPN/HTTP2 support to Connect #1485

seanmonstar opened this issue Apr 13, 2018 · 15 comments
Labels
A-client Area: client. A-http2 Area: HTTP/2 specific. C-feature Category: feature. This is adding a new feature. E-medium Effort: medium. Some knowledge of how hyper internal works would be useful.

Comments

@seanmonstar
Copy link
Member

The Connect trait allows to provide information to the connector in the Destination type. It should be augmented to allow requesting ALPN for HTTP2, and the returned Connected type should include a way for the connector to inform the Client whether HTTP2 was negotiated.

@seanmonstar seanmonstar added A-client Area: client. C-feature Category: feature. This is adding a new feature. E-medium Effort: medium. Some knowledge of how hyper internal works would be useful. labels Apr 13, 2018
@seanmonstar
Copy link
Member Author

Talking through this a bit on IRC, here's some clarification. A user likely has various ways they want to declare when to use HTTP/1 vs HTTP/2. A major way to do is with TLS' ALPN mechanism, but that seems best handled in the Connect implementation, since hyper has delegated all transport negotiation and establishment to that trait.

Some variants a user may need:

  1. HTTP/1 only
  2. HTTP/2 only
  3. HTTP/1 or HTTP/2 over ALPN
  4. HTTP/1 or HTTP/2, with HTTP/1 upgrades to 2

A key part of ALPN is that if a client chooses to use it, it must advertise the protocols it can speak, such as http1,h2. If the server responds back with a protocol identifier, then once the TLS handshake is completed, the connection MUST use that negotiated protocol. So, it ALPN agreed on h2, hyper must use HTTP/2 once the connection is returned from Connect.

This means the connector needs some way of telling the hyper::Client if a certain protocol is required. We can make use of the Connected type to include that information.


There is also a question around how a user configures what variant they need. There is the Destination type, which could gain additional getters so that connectors can know whether to use ALPN. However, a potential downside there is that it seems likely that a user would need to configure their connector anyways, and so why should they need to repeat that configuration on the hyper::Client.

It might be sufficient to not have extra configuration on the Client. Of the scenarios above, the only one that seems to require the hyper::Client to know about is No. 4 (HTTP/1 upgrade to HTTP/2).

@sfackler
Copy link
Contributor

sfackler commented Jun 5, 2018

I was initially thinking that the best route would be something like a Destination::negotiate_h2 method that would tell the connector to use ALPN, but I think it might actually be best to just have the connectors always use ALPN and just tell the Connected what protocol was selected. native-tls and openssl don't currently support configuration of ALPN on a per-handshake basis, and while we could add that it's kind of a weird thing to do.

This would require configuration of both the TLS connector and the client if you want to use only HTTP/1 but that seems fine since I think that'd be a pretty rare use case?

@sfackler
Copy link
Contributor

Here's what I think a reasonable route forward is:

Public API changes:

pub struct Protocol(...);

impl Protocol {
    const HTTP1_1: Protocol = Protocol(...);
    const HTTP2: Protocol = Protocol(...);
}

impl Connected {
    pub fn negotiated_protocol(self, protocol: Protocol) -> Self { ... }
}

If a connector indicates that it has negotiated a protocol, Hyper has to respect that. If HTTP/1.1 is negotiated but the client was configured with http2_only, it'll have to return an error.

The pool implementation will need to change, since it currently depends on knowing up-front which protocol will be used. The constraint that only a single HTTP/2 connection attempt can exist at any time since we don't know if the connection will be using HTTP/2. It seems like a reasonable approach is that the pool's key type is just the authority, and the value is an enum of either a single HTTP/2 connection, or a set of idle HTTP/1.1 connections. The HTTP/2 mode is preferred, so if a new connection comes in that negotiated HTTP/2, it'll replace any existing HTTP/1.1 connections.

Does this all seem plausible? I can start working on the implementation if so.

@seanmonstar
Copy link
Member Author

The caveats you mention sound about right.

An alternative to exposing a new type would be to just add negotiated_http1(self) and negotiated_http2(self) on Connected. It's not necessarily better, it just allows for the API to exist without a new Protocol type.

Does it make sense at all for the Client to tell a connector that a user configured http2_only or something (another option I've been thinking about is if to implicitly guess the preferred protocol by looking at Request::version())? I suppose if it does make sense eventually, it can always be added to Destination.

@sfackler
Copy link
Contributor

Yeah, negotiated_http1 and negotiated_http2 seem reasonable to me as well.

I'm not sure it'll be all that useful for the connector to know that http2_only was set. It seems pretty weird to have a server that supports h2 but prioritizes http/1.1 over it.

@seanmonstar
Copy link
Member Author

Yea, that'd be odd. There could conceptually be an http1_only option, and in that case, if ALPN passed both http1 and h2, and the server supported both, we'd likely have connections that could never work...

Perhaps that concern can just be postponed, and people can configure their HTTPS connector how to do ALPN in the first place.

@sfackler
Copy link
Contributor

Yeah I think you'll need to explicitly configure ALPN for hyper-openssl at least for it to be used at all initially.

@sfackler
Copy link
Contributor

The changes required in the connection pool are pretty significant, so I want to write them out and make sure it seems reasonable before making them.

There are a couple of constraints we want to enforce:

  1. We don't want to allow multiple live HTTP/2 connections to a single host.
  2. We want to allow concurrent creation of new HTTP/1 connections to a single host.

The pool currently just tracks the set of pending HTTP/2 connections to guarantee constraint 1, and you ask the pool for permission before making a new connection. This becomes more complicated now since we don't know up-front what protocol we're going to be using until we get part way through the connection process.

There are a couple of options:

  1. Defer the pending connection check until ALPN tells us what protocol we're using.
  2. Serialize the first connection to a host of any protocol.

Imagine we're spawning off N requests at the same time to a new host. Option 1 is non-ideal in the HTTP/2 case since you'll perform N TCP and TLS handshakes, then throw away N-1 connections and perform a single HTTP/2 handshake. Option 2 is non-ideal in the HTTP/1 case since you'll have N-1 requests queue up behind a single request that performs the first TCP and TLS handshake, realizes the server speaks HTTP/1, and then the other N-1 requests can create their own connections.

On the whole, option 2 seems preferable.

This does mean we now need to track what protocol a host speaks. In particular, there's a distinction between a host we've never connected to and a host that has all of its HTTP/1 connections currently checked out. For simplicity, I'd like to just track that information "forever", even after idle connections have been cleared out.

@avkonst
Copy link

avkonst commented Jun 25, 2018

"We don't want to allow multiple live HTTP/2 connections to a single host."
If single HTTP2 connection is congested, it might be reasonable to open the second connection. Not always, but sometimes it could be a required policy. I think it should be client's decision, not done by hyper.

@sfackler
Copy link
Contributor

Sure, there could be an option to open a second connection if the first has no remaining request slots.

@seanmonstar
Copy link
Member Author

I have seen a few other HTTP2 client libraries which easily allow opening multiple h2 connections, but I'm wary about doing that (at least, about making it "easy" to accidentally do so). Note the spec pleads that clients shouldn't do that:

Clients SHOULD NOT open more than one HTTP/2 connection to a given host and port pair [...]

Of course, since the connection pool is used by a Client, a user could by pass it in hyper by just making a new Client...

@avkonst
Copy link

avkonst commented Jun 25, 2018

"Of course, since the connection pool is used by a Client, a user could by pass it in hyper by just making a new Client..."
There is no a reason for a client to be naughty without a good reason :) And the only good reason to have the second connection is congestion, high latency or other factor indicating low performance. This information should be exposed to a client to make a decision, and a client should be able to instruct hyper whether to allow the second, third and so on connection... PS: I also assume that the second connection might easily target alternative IP address, in case DNS is resolved to multiple addresses.

@pimeys
Copy link

pimeys commented Jun 25, 2018

Let me just write this down here to remind me to read the issue if there's going to be any changes:

I'd really appreciate if the new Connect refactoring wouldn't open more than one connection. I'm using hyper in my Apple push notification client a2, and if I'd open too many connections to Apple's service, they'd consider it a denial-of-service attack and just ban our servers completely. That would be quite unfortunate.

@seanmonstar
Copy link
Member Author

Looking over what Golang's client does, it opts for your option 1, which is to just open a bunch of connections if there isn't an existing HTTP2 connection. If after ALPN, it is HTTP2, then the extras are all closed up and only the first connection is kept.

That seems simpler to me, and less likely to cause problems for people who were successfully using the client for HTTP/1 already. For @pimeys concern, there's a couple things we could do to prevent that:

  1. If the Client is configured for http2_only, then ALPN is never needed, and all connections are treated as HTTP2 (so no mass connect flood).
  2. If a Client isn't configured that way, we could consider looking to see if Request::version() == Version::HTTP_2, and if so, specifically treat just that request as HTTP2-only.

I also think the HTTP/1-and-HTTP/2-and-HTTP1Upgrade option can be removed from consideration. It turns out to be very seldom implemented, and browsers don't bother, since there exist some servers that naively hate upon seeing any Upgrade header, so it can't be sent proactively just to advertise.

@sfackler
Copy link
Contributor

That sounds reasonable to me.

dekellum added a commit to dekellum/body-image that referenced this issue Jan 21, 2019
Live test passing, at least on linux, so far.

github related: hyperium/hyper#1574 hyperium/hyper#1485
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-client Area: client. A-http2 Area: HTTP/2 specific. C-feature Category: feature. This is adding a new feature. E-medium Effort: medium. Some knowledge of how hyper internal works would be useful.
Projects
None yet
Development

No branches or pull requests

4 participants