Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(client): fix Host header when using a Proxy #775

Merged
merged 1 commit into from
May 4, 2016

Conversation

seanmonstar
Copy link
Member

Closes #774

@seanmonstar
Copy link
Member Author

@shaleh do you mind testing that this works with your proxy?

@shaleh
Copy link

shaleh commented Apr 27, 2016

This works for HTTP like www.example.com but fails for HTTPS like https://google.com. I get a response back from the proxy server telling me the request is invalid.

Response: 400 Bad Request
Headers:
Pragma: no-cache
Content-Length: 691
Cache-Control: no-cache
Content-Type: text/html; charset=utf-8
Proxy-Connection: close
Connection: close

@seanmonstar
Copy link
Member Author

To support https end points, I'll need to work in an initial CONNECT example.dom HTTP/1.1 dance before assuming the connection can be wrapped in SSL... Sigh, I was hoping this would just be a minor fix. Although, I think doing the CONNECT stuff may be doable outside of hyper...

@shaleh
Copy link

shaleh commented Apr 27, 2016

This is the sequence that works via telnet/nc

CONNECT www.google.com:80 HTTP/1.0
User-Agent: simple/1.1
Host: www.google.com\n\n
GET / HTTP/1.0\n\n

Note the use of a port number instead of "http://". When I tried to use http://www.example.com I would get back an invalid request response.

Since you have the is_proxied boolean now, calling the CONNECT as a preamble should be easy enough. This is what other C based http libraries are doing. curl is here https://github.com/curl/curl/blob/master/lib/http_proxy.c

@shaleh
Copy link

shaleh commented Apr 27, 2016

Exchanging the 80 for 443 and starting the SSL/TLS stack should be enough to get HTTPS working.

@seanmonstar
Copy link
Member Author

Yea, the issue is that by the time the code in Http11Message is running to write the data, the connection has already been made. However, once CONNECT has been passed, then the ssl handshake should take place to access the site over HTTPS.

I'm now thinking of composing NetworkConnectors to get this.

struct Tunnel<C> {
    connector: C,
    proxy: (String, String, u16),
}

type HttpToHttpsProxy = HttpsConnector<Openssl, Tunnel<HttpConnector>>;

let http_to_https_proxy = HttpsConnector::new(
    Openssl::default(),
    Tunnel::new(
        HttpConnector, "http", "my-proxy", 8080
    )
);

Steps that would occur:

  1. Client.get("https://www.google.com")
  2. let tcp = tunnel.connector.connect(tunnel.proxy)
  3. tcp.write_all("CONNECT www.google.com:443 HTTP/1.1\r\nHost: www.google.com:443\r\n\r\n")
  4. Check for HTTP/1.1 20x response on tcp
  5. https_connector.wrap_client(tcp, "www.google.com")
  6. Carry on like normal.

@shaleh
Copy link

shaleh commented Apr 27, 2016

Are you thinking for making a Client using something like Client.new_with_proxy(host, port)?

@shaleh
Copy link

shaleh commented Apr 27, 2016

Or, HttpsConnector (and HttpConnector) could have a set_proxy() method that the Client could invoke when its own set_proxy is called.

@seanmonstar
Copy link
Member Author

seanmonstar commented Apr 27, 2016

It does seem that this is entirely implementable with the tools hyper provides, but I suppose this can be provided to ease usage of proxies for the easy cases. Those cases seem to be:

  1. http to the proxy, and then https to the target
  2. http to the proxy, http to the target
  3. https to the proxy, and then http to the target
  4. https to the proxy, https to the target

As before, any attempt to connect to the proxy or the target over https without the ssl feature would just result in an unknown scheme error.

Next, to set it up on the Client, it could look like this:

// if target-uri is https, tunnels over http to proxy, otherwise does absolute-uri as in current PR
// if ssl feature is disabled, requests to https would return `Err(unknown scheme)`
let client = Client::with_proxy("http", "my-proxy", 8080);

@seanmonstar
Copy link
Member Author

And thanks to a mistake in designing the Ssl trait, I cannot easily compose these connectors because I cannot make an SslStream<SslStream<HttpStream>> using the Ssl trait, since HttpStream is hard coded in.

Though just think of that sounds horrible. Double encryption yum.

@seanmonstar
Copy link
Member Author

Ah, seems that is a job for NPN...

@shaleh
Copy link

shaleh commented Apr 28, 2016

NPN?

@seanmonstar
Copy link
Member Author

Next Protocol Negotiation, a recent addition to TLS. It allows converting a TLS connection into some other connection (it was originally created for SPDY/HTTP2 so clients can upgrade TLS to h2).

But that's a lot of work, and from my googling, the majority of people use HTTP proxies, not HTTPS, so I'm going to settle by only providing to HTTP out of the box (I want to get back to the async work).

Client::with_http_proxy("my-proxy", 8080)

It will always try the http scheme, being an http proxy.

@shaleh
Copy link

shaleh commented Apr 28, 2016

But it still proxies for connections to HTTPS sites, right? Lots and lots of websites are HTTPS these days.

@seanmonstar
Copy link
Member Author

Yes, it will tunnel to the HTTP proxy, and negotiate TLS after the proxy has tunneled tcp.

@shaleh
Copy link

shaleh commented Apr 28, 2016

Cool. Let me know when you need me to play QA again :-)

@seanmonstar
Copy link
Member Author

I'm not super pleased with the code, but it seems to work. Tests pass, and I edited the examples/client.rs to look for HTTP_PROXY env var, allowing me to to easily compare it's results with curl -x http://my-proxy:8080. My tests against real proxies seemed to work (or at least, if it worked with curl, it worked in hyper. Failures in hyper were also failures in curl, so... 🍌 ).

@seanmonstar
Copy link
Member Author

@shaleh does it work with your corporate proxy?

With this PR, you can do HTTP_PROXY=my-proxy:80 cargo run --example client -- https://www.google.com

@shaleh
Copy link

shaleh commented May 2, 2016

I will check it tomorrow. I had planned today but was busy.

@shaleh
Copy link

shaleh commented May 4, 2016

I get a failure:

thread '<main>' panicked at 'called `Result::unwrap()` on an `Err` value: Io(Error { repr: Custom(Custom { kind: Other, error: StringError("failed to lookup address information: Name or service not known") }) })', ../src/libcore/result.rs:746

wget, curl, etc. all work. I have not had the time to debug this. Sorry.

@seanmonstar
Copy link
Member Author

@shaleh The value provided does not include the scheme, right? The code in the example doesn't do as much parsing as it could have the HTTP_PROXY env var.

@shaleh
Copy link

shaleh commented May 4, 2016

Bingo. I have had problems with other software failing if the variable left off the http://.

@seanmonstar seanmonstar merged commit 283b1ca into master May 4, 2016
@seanmonstar seanmonstar deleted the 774-proxy-host branch May 4, 2016 17:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants