Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

after e9b1d7d commit, reverse proxy to the web page of GOGS will become garbled #3606

Closed
crwnet opened this issue Jul 26, 2020 · 22 comments · Fixed by #3620
Closed

after e9b1d7d commit, reverse proxy to the web page of GOGS will become garbled #3606

crwnet opened this issue Jul 26, 2020 · 22 comments · Fixed by #3620

Comments

@crwnet
Copy link

crwnet commented Jul 26, 2020

The commit of e9b1d7d solves the #3556 issue, but may lead to new issues.
This problem may be due to the content returned by GOGS in chunked mode and lack of the Content-Length field. I am not sure whether all chunked pages are affected by this commit.

@crwnet crwnet changed the title after e9b1d7dcb4cbf85da7fb4cf8c411a4f840a98cf1 commit, reverse proxy to the web page of GOGS will become garbled after e9b1d7d commit, reverse proxy to the web page of GOGS will become garbled Jul 26, 2020
@masknu
Copy link
Contributor

masknu commented Jul 28, 2020

Could you show me how to reproduce it please? Including all the config files of Caddy and GOGS. Thank you.

@masknu
Copy link
Contributor

masknu commented Jul 28, 2020

According to the http2 standard, http2 doesn't support http 1.1's chunked transfer encoding mechanism, as it provides its own, more efficient, mechanisms for data streaming.
As e9b1d7d only affects HTTP2 connection, not sure how to reproduce this issue.

@crwnet
Copy link
Author

crwnet commented Jul 28, 2020

According to the http2 standard, http2 doesn't support http 1.1's chunked transfer encoding mechanism, as it provides its own, more efficient, mechanisms for data streaming.
As e9b1d7d only affects HTTP2 connection, not sure how to reproduce this issue.

Yes, http2 has its own data streaming, but the main job of most people using caddy is to provide a reverse proxy for http1.1 sites and provide http2 to the browser. If caddy isn't supported to back-end sites which using http1.1 return content in chunked mode, the reverse proxy function will be restricted and disturbing.

The specific problem is to reverse proxy to the latest release version 0.11.91 of Gogs, and the configuration is the simplest reverse_proxy http://gogs_address

@masknu
Copy link
Contributor

masknu commented Jul 29, 2020

Could you show me your Caddyfile? Did you specify http transport version in config file?

@crwnet
Copy link
Author

crwnet commented Jul 29, 2020

Could you show me your Caddyfile? Did you specify http transport version in config file?

xxxx.com {
encode gzip
reverse_proxy http://127.0.0.1:8080
}

@masknu
Copy link
Contributor

masknu commented Jul 29, 2020

Thank you.
Sorry for the inconvenience. I'm trying to reproduce the issue.
Do you remember what's the last version(or commit) of caddy you used that worked normal with GOGS?

@crwnet
Copy link
Author

crwnet commented Jul 29, 2020

Thank you.
Sorry for the inconvenience. I'm trying to reproduce the issue.
Do you remember what's the last version(or commit) of caddy you used that worked normal with GOGS?

Thank you, I remember it should be #246a31a, it came to #e9b1d7d had only one commit.

@masknu
Copy link
Contributor

masknu commented Jul 30, 2020

Things become strange now. Did you change other factors(caddy config? GOGS config or version? other configs?) between those two commit?

I wrote a server as upstream to serve chunked data via http1.1 protocol, caddy(commit #e9b1d7d,with your config) has no problem to proxy it. Log shows caddy received chunked transfer-encoding response from upstream, and client successfully received correct data from caddy:
client<------HTTP2.0------->caddy<--------HTTP1.1 chunked Transfer-Encoding------->server

Next, I'm gonna test GOGS as upstream server.

@crwnet
Copy link
Author

crwnet commented Jul 30, 2020

if compiling with #e9b1d7d, Chrome response:
image
{
content-type: text/html; charset=UTF-8
date: Thu, 30 Jul 2020 07:28:16 GMT
server: Caddy
status: 200
}
else {
content-encoding: gzip
content-length: 2358
content-type: text/html; charset=UTF-8
date: Thu, 30 Jul 2020 07:34:16 GMT
server: Caddy
status: 200
vary: Accept-Encoding
}

@crwnet
Copy link
Author

crwnet commented Jul 30, 2020

I think I found the problem, and after I removed encode gzip, the problem was resolved.
So this bug is that the response header was forward immediately and content-encoding field was not added at that time, but the content uses gzip encoding.

@masknu
Copy link
Contributor

masknu commented Jul 30, 2020

@crwnet I hope this PR could solve the problem. Thank you for your help.

@crwnet
Copy link
Author

crwnet commented Jul 31, 2020

@crwnet I hope this PR could solve the problem. Thank you for your help.

Don't mention it. I think it would be better to coding that only h2c connections are flushed immediately to solve this problem. Other types of reverse proxy are basically and not needed it, and may be the same as encoding module, several up-level modules lose their original functions.

@masknu
Copy link
Contributor

masknu commented Jul 31, 2020

Yes, it might be better to restrict flushing action to h2 upstream, it only checks client protocol at the moment.

@masknu
Copy link
Contributor

masknu commented Jul 31, 2020

But even if it's in client<--h2-->caddy<--h2-->upstream mode, it seems impossible for upper level handles to add headers before first flushing, except upper handles also wrap flush method of responseWriter.
@francislavoie @mholt What do you think?

@mholt
Copy link
Member

mholt commented Jul 31, 2020

Great discussion here so far. Thanks for working it out while I've been busy catching up.

Several middleware handlers intercept the WriteHeader() function to inspect the response headers and potentially modify or defer them before they actually get written. This is how, as you've noticed, the encode handler works: it actually doesn't write the headers at WriteHeader(), it does so after the first call(s) to Write() because it needs to get a sense of the length of the content being written.

I think the change/fix for this should be as minimally invasive as possible. If it only needs to happen on h2c connections, then let's make sure it only happens on h2c connections; and even better, if the client doesn't accept any other encodings.

@crwnet
Copy link
Author

crwnet commented Jul 31, 2020

Yes, the minimal invasive to fix #3556:#e9b1d7d that it should be get transport version values from reverse proxy configuration, and only the values include "h2c" then #e9b1d7d will be executed.

@masknu
Copy link
Contributor

masknu commented Aug 3, 2020

Make sense.
What if configuration includes version "h2c 2 1.1", but the upstream is actually 1.1?
And bi-stream not just happens in h2c, it also happens with h2.

@crwnet
Copy link
Author

crwnet commented Aug 3, 2020

add a option in http like bi-stream?

@masknu
Copy link
Contributor

masknu commented Aug 3, 2020

As far as I know. Haproxy doesn't need to specify "bi-stream"-like options to support v2ray as h2 upstream, you just write h2 1.1 as supported versions in haproxy, then it works like a charm.
In last commit of PR( #3620 ), I added one more check on upstream's response to ensure both sides are talking h2.

@crwnet
Copy link
Author

crwnet commented Aug 3, 2020

As far as I know. Haproxy doesn't need to specify "bi-stream"-like options to support v2ray as h2 upstream, you just write h2 1.1 as supported versions in haproxy, then it works like a charm.
In last commit of PR( #3620 ), I added one more check on upstream's response to ensure both sides are talking h2.

These are two different softwares. Their implementation strategies and module priority methods are not the same, so they will have different effects. We are discussing how to better implement all the desired functions and try to be compatible with all modes.
What I mentioned is just a suggestion to achieve more compatibility with fewer modification, which is certainly not the best way to achieve it. I think the best way to implement this is to automatically transport the chunked mode to the h2 streaming mode, but that has more things to do.
PR( #3620 ) had solved the problem temporarily, thank you.

@masknu
Copy link
Contributor

masknu commented Aug 3, 2020

Understood.
I hope we could come up with an idea that keeps impact as minimal as possible while solving the problem permanently.

@mholt
Copy link
Member

mholt commented Aug 3, 2020

@crwnet Does the latest commit on #3620 resolve the problem satisfactorily? We're ready to merge it.

Edit: I see now that you said it resolves it "temporarily" -- why temporarily? is that sufficient?

mholt added a commit that referenced this issue Aug 4, 2020
…3620)

* reverse_proxy: fix bi-h2stream breaking gzip encode handle(#3606).

* reverse_proxy: check http version of both sides to avoid affecting non-h2 upstream.

* Minor cleanup; apply review suggestions

Co-authored-by: Matthew Holt <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants