HTTP/2 push support plugin (golang 1.8) #1215

Merged
merged 19 commits into master from 1.8_http2_push
Feb 17, 2017

Conversation

wendigo

@wendigo wendigo commented Oct 26, 2016

Hi all,

This change introduces a new push plugin (addressing #816).

This will be possible in golang 1.8 thanks to the newly introduced http.Pusher interface and http.PushOptions struct (golang/go@cf73bbf, tracking golang/go#13443).

As the interface was published yesterday, I could not resist and implemented a new push directive 😁

This middleware operates in two modes: rules-based and link-based.

Rules-based syntax

push /index.html /index.css

push /index2.html /index.css

push / /ga.js

push /index.html /pushedResource.js /pushedCss.css {
   method GET
   header That-Was-Pushed Wow
   header Server Caddy-Push
}

Link-based mode

Middleware intercepts Link headers from the Next middleware and tries to use Pusher to push resources to the client.

For more information about Link header: https://w3c.github.io/preload/
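
As a rough illustration of link-based mode, the sketch below extracts push candidates from Link header values. It is not the plugin's parser: the function name `preloadTargets` is hypothetical, the parsing is deliberately simplified (it splits on commas and does not handle commas inside quoted parameters), and skipping entries marked `nopush` follows the preload draft rather than anything stated in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// preloadTargets returns the URI references of all Link entries whose rel
// parameter is "preload" and that are not marked "nopush".
// Hypothetical helper for illustration; simplified parsing only.
func preloadTargets(linkHeaders []string) []string {
	var targets []string
	for _, header := range linkHeaders {
		// One header value may carry several comma-separated entries.
		for _, entry := range strings.Split(header, ",") {
			parts := strings.Split(entry, ";")
			uri := strings.TrimSpace(parts[0])
			if !strings.HasPrefix(uri, "<") || !strings.HasSuffix(uri, ">") {
				continue // not a well-formed <uri-reference>
			}
			isPreload, noPush := false, false
			for _, p := range parts[1:] {
				p = strings.ToLower(strings.TrimSpace(p))
				switch {
				case p == "nopush":
					noPush = true
				case p == "rel=preload" || p == `rel="preload"`:
					isPreload = true
				}
			}
			if isPreload && !noPush {
				targets = append(targets, strings.Trim(uri, "<>"))
			}
		}
	}
	return targets
}

func main() {
	headers := []string{
		"</index.css>; rel=preload; as=style, </app.js>; rel=preload; as=script; nopush",
		"</next.html>; rel=prefetch",
	}
	fmt.Println(preloadTargets(headers)) // prints [/index.css]
}
```

Each returned target could then be handed to `http.Pusher.Push`, provided the response writer implements that interface.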

Note for reviewers

This plugin is complete and tested, but non-functional on Go's current tip (the actual implementation in https://go-review.googlesource.com/#/c/29439/ has not been merged yet).

But... once Go 1.8 is released, this plugin will actually support HTTP/2 Push in Caddy.

Discussion

I'm not sure whether this plugin should be an integral part of Caddy's core or whether I should move it to its own repo - opinions on that will be appreciated.

https://www.youtube.com/watch?v=vCadcBR95oU 🕶

@wendigo wendigo changed the title from "HTTP/2 push support (golang 1.8)" to "HTTP/2 push support plugin (golang 1.8)" Oct 26, 2016
@mholt
Member

mholt commented Oct 26, 2016

Wow, fantastic @wendigo! I haven't looked at this yet (just woke up) but I am really surprised -- wasn't expecting to see this show up in my inbox. 😄

This definitely belongs in core, not a separate repo.

I have had some thoughts on server push and how Caddy should/might handle it -- let me elaborate them here briefly just to give you an idea.

  • Link headers, as described.
  • Enumerating rules by mapping URIs/paths to lists of resources to push, as described (almost every other server push implementation does these two things).
  • I would like to try to make server push automatic. Can we have Caddy parse the HTML, for instance, and determine which JS, CSS, and image files to push down? We could then cache those results for some time. This is kind of tricky and won't cover every use case for server push but probably will for most.
  • I put a lot of thought into using machine learning for server push (I study deep learning in grad school), since it seems reasonable that Caddy could learn a mapping between requests and subsequent requests by the same client. However, I don't think this is beneficial because each website is different. In other words, it doesn't generalize well. Caddy would have to be re-trained for each website and on an ongoing basis, which might defeat the purpose.
  • It would, perhaps, be more effective to just hardcode rules for inferring a map of requests to dependencies. In other words, Caddy just watches to see which resources a client requests "immediately after" (exact definition TBD) an initial request. It then builds a mapping of URI or path to []Path, for example. However: this doesn't work well for dynamic sites, unless Caddy invokes an HTTP request within itself to get the bytes of the response to push to the client.

In any case, I like the idea of the user being able to manually specify push rules and use the Link header, at least to start. But I'm really intent on making server push automatic in the future. What do you think?

Thanks to build tags, this PR is safe to use pre-Go 1.8 (nice work). I'm willing to accept this PR with the proposed set of functionality (without the "automatic push" features) -- but I and other reviewers should go over it in more detail first of course. Expect some iterations on this before it's ready to go.

Thanks, I'm excited for this!

@mholt mholt added the under review 🧐 label Oct 26, 2016
@wendigo
Author

wendigo commented Oct 26, 2016

I don't know if parsing HTML body is a good idea - it won't catch most of the resources (like images, fonts referenced in CSS files, javascripts/images loaded by javascripts and so on).

Maybe supporting something like a push manifest is a good idea? It could be dynamically reloaded / watched via fsnotify and constructed by any tool you want (e.g. one using machine learning based on logs).

Also pushing whole directories might be a good idea, i.e. push everything in css/ and img/ folders.

@wendigo
Author

wendigo commented Oct 26, 2016

I've added header/method validation and loaded the plugin. I've built Caddy on both 1.8 (current tip) and 1.7.3, and the build does not break on either.

@wendigo
Author

wendigo commented Oct 26, 2016

I've managed to build a custom go1.8 with the cherry-picked Pusher implementation and test Caddy with push support enabled. I have a couple of changes that I didn't anticipate, plus a buildsrv fix (it does not work in go 1.8).

But still...

yeah

@wendigo
Author

wendigo commented Oct 26, 2016

@wendigo
Author

wendigo commented Oct 26, 2016

I've also fixed the case when using:

push / /index.css

will cause a recursive push (the requested /index.css will match the rule and try to push /index.css, and so on).

@wendigo
Author

wendigo commented Oct 27, 2016

Updated H2 bundle landed in golang's tip (golang/go@07e7266) so this change can be easily tested from now on (and functional on tip 👍)

@mholt
Member

mholt commented Oct 28, 2016

I'll need some time before I can review this thoroughly -- any other collaborators are welcome and invited to review in the meantime! (Don't merge yet though. I don't normally say that, but this is a pretty big change.)

@elcore
Collaborator

elcore commented Oct 30, 2016

Hello @wendigo,

we want to implement X25519 and CHACHA20-POLY1305 😄

There will be no need to use // +build go1.8 after the release of Golang 1.8 (Q1 2017)

P.S.: I will review your code in the next weeks 😄, if I have enough time.

@wendigo
Author

wendigo commented Oct 30, 2016

@elcore: I know, I just wanted this change to not break current go version 1.7 - so it can be merged without breaking anything. After 1.8 release build tags can be removed.

@elcore
Collaborator

elcore commented Oct 30, 2016

@wendigo perfect 😄

@mholt
Member

mholt commented Nov 9, 2016

I am keeping my eye on this, just so you know ;)

pusher, hasPusher := w.(http.Pusher)

// No Pusher, no cry
if hasPusher {

You can get rid of a lot of nesting with

if !hasPusher {
    return h.Next.ServeHTTP(w, r)
}

Author

You're right - thx :) Fixed!

@mholt
Member

mholt commented Dec 1, 2016

I don't know if parsing HTML body is a good idea - it won't catch most of the resources (like images, fonts referenced in CSS files, javascripts/images loaded by javascripts and so on).

I only partly agree. 😉 We can parse CSS bodies too. JavaScript, a little harder; I wouldn't worry about JS for now. I bet parsing HTML and CSS files would get us most of the way there without much difficulty.

@earthboundkid

The machine "learning" could be pretty dumb and still work. What if one-in-N requests was served without using push, and then you noted which subsequent requests were serving files with a mime-type of CSS/JS/image?

@mholt
Member

mholt commented Dec 14, 2016

The machine "learning" could be pretty dumb and still work. What if one-in-N requests was served without using push, and then you noted which subsequent requests were serving files with a mime-type of CSS/JS/image?

Yes, indeed. It would basically look like a map of request path to a list of subsequent request paths. (There would be some more data attached to each path in a struct, like validity period or eviction time or something like that... in order to keep the map updated over the lifetime of the server.)

So those are the two approaches I'm deciding between to make push automatic: parsing HTML/CSS documents is one way, and the other way is observing request patterns over time. The first works instantly and is fairly mechanical, straightforward, but may miss some pushes. The second takes a little time to get optimal, but should work well in that it won't "miss" anything it couldn't parse.

@lenovouser

It would be nice if, when a backend server sends a preload header (e.g. using cloudflare/netjet), Caddy automatically pushed those resources too.

@wendigo
Author

wendigo commented Dec 14, 2016

@lenovouser Link preload headers are already supported by this change so https://github.com/cloudflare/netjet will work with it as it basically generates Link headers (https://github.com/cloudflare/netjet/blob/master/index.js#L43)

@lenovouser

Ah, okay. So that means Caddy sees the Link headers and pushes them already? Because the node module doesn't push. It just creates the header.

@lenovouser

Ah, yeah. Seems like it. Sorry - totally missed that.

@wendigo
Author

wendigo commented Dec 14, 2016

@lenovouser exactly, Caddy with push plugin will intercept Link headers and push resources for you

@mholt
Member

mholt commented Dec 17, 2016

@tdewolff just made https://github.com/tdewolff/push which does the automatic push based on parsing contents. If we can find a way to cache the parse results so the parsing doesn't have to happen on every request, I think we should be able to add this to Caddy. (But it doesn't need to be a part of this PR.)

@wendigo
Author

wendigo commented Dec 17, 2016

Looking into it @mholt :)

@brasilikum

@mholt
Member

mholt commented Dec 21, 2016

@brasilikum Yes, see my comment two above yours 😉

@abh

abh commented Dec 27, 2016

Regarding how to use server push, it turns out it's hard. These links might be of interest:
https://www.ietf.org/mail-archive/web/httpbisa/current/msg27742.html (and the ensuing discussion).
https://tools.ietf.org/html/draft-ietf-httpbis-cache-digest-01

@mholt mholt removed the under review 🧐 label Feb 17, 2017
@mholt mholt merged commit cdf7cf5 into master Feb 17, 2017
@mholt
Member

mholt commented Feb 17, 2017

@wendigo This should help get things started, take a look at this issue: tdewolff/push#1

Basically, we should be caching (for some configurable amount of time) the results of parsing the resources so that we don't have to do it on every single request. It would be interesting to first implement it very simply without caching, and just do the bare-bones auto-push, and then benchmark it using wrk or something. Then after that, implement the caching and see how it performs. Getting it set up for the first benchmarks should only be a few lines of code then. Does that make sense?

@wendigo wendigo deleted the 1.8_http2_push branch February 17, 2017 17:59
@nwidger

nwidger commented Feb 17, 2017

Should this feature implicitly copy certain request headers into the *http.PushOptions's Header field passed to the http.Pusher's Push method? The most important one in my mind would be the original request's Accept-Encoding header, if it exists. Otherwise, the pushed resources will not be gzip'd even if the client previously indicated support for compression.

@mholt
Member

mholt commented Feb 17, 2017

@nwidger Do you mean the auto-push (TODO) or the manually-configured push? I don't believe the Accept-Encoding header, which is on the request, needs to be pushed into the response anyway, though.

@wendigo
Author

wendigo commented Feb 17, 2017

@mholt as far as I understand @nwidger, it should. If the client request has "Accept-Encoding", it should be passed via PushOptions so that any backend (fastcgi or fileserver) can handle the request and push the (gzipped or whatever) resource to the client. I will think about a solution.

@mholt
Member

mholt commented Feb 17, 2017

But a push response goes down to the client, not to a backend.

@nwidger

nwidger commented Feb 17, 2017

Other important headers to copy might include Authorization, WWW-Authenticate and Cookie for authentication and session info.

@nwidger

nwidger commented Feb 17, 2017

@mholt The target parameter, opts.Method and opts.Header passed to the http.Pusher.Push call constitute a synthetic request that gets serviced by the HTTP server just like any other. The response to that synthetic request is what gets pushed to the client. That is my understanding at least.

@mholt
Member

mholt commented Feb 17, 2017

Ah, I see what you mean. Yes... perhaps, see what the proxy middleware sends upstream to get a decent list.

@nwidger

nwidger commented Feb 18, 2017

If I'm reading the code right, it looks like Apache's mod_http2 implementation copies the User-Agent, Accept, Accept-Encoding, Accept-Language and Cache-Control headers from the original request: https://github.com/icing/mod_h2/blob/master/mod_http2/h2_push.c#L280

@nwidger

nwidger commented Feb 18, 2017

Similarly, nghttp2 appears to copy the Accept-Encoding, Accept-Language, Cache-Control, Host and User-Agent headers: https://github.com/nghttp2/nghttp2/blob/master/src/shrpx_http2_upstream.cc#L1976

@wendigo
Author

wendigo commented Feb 18, 2017

@nwidger thx, working on it right now

return h.Next.ServeHTTP(w, r)
}

// Serve file first

The description for http.Pusher contains the following text:

        // Handlers that wish to push URL X should call Push before sending any
        // data that may trigger a request for URL X. This avoids a race where the
        // client issues requests for X before receiving the PUSH_PROMISE for X.

By serving the original request first, aren't we violating this rule? The client will receive the Link headers and may send out additional requests for those resources before we send out the push promises for them.

Author

I didn't see that comment - if it's a rule - it should not be violated :) Fixed in #1453
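
The ordering rule quoted from the http.Pusher documentation can be illustrated with the sketch below, which pushes first and only then lets the next handler write the response. It applies to rule-based pushes, where the targets are known before the response is produced; it is not the change made in #1453. `pushThenServe`, `recordingWriter`, and `run` are hypothetical names, and the fake writer exists only to make the order of events observable.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// recordingWriter is a test double that implements http.Pusher and records
// the order in which pushes and body writes happen.
type recordingWriter struct {
	*httptest.ResponseRecorder
	events []string
}

func (w *recordingWriter) Push(target string, opts *http.PushOptions) error {
	w.events = append(w.events, "push "+target)
	return nil
}

func (w *recordingWriter) Write(b []byte) (int, error) {
	w.events = append(w.events, "write")
	return w.ResponseRecorder.Write(b)
}

// pushThenServe pushes all targets before next sends any data that could
// make the client request those targets itself.
func pushThenServe(targets []string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if pusher, ok := w.(http.Pusher); ok {
			for _, target := range targets {
				// Push is an optimization, so errors are ignored here.
				_ = pusher.Push(target, nil)
			}
		}
		next.ServeHTTP(w, r)
	})
}

// run serves one request through the handler and returns the event order.
func run() []string {
	w := &recordingWriter{ResponseRecorder: httptest.NewRecorder()}
	next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte(`<link rel="stylesheet" href="/index.css">`))
	})
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	pushThenServe([]string{"/index.css"}, next).ServeHTTP(w, req)
	return w.events
}

func main() {
	fmt.Println(run()) // prints [push /index.css write]
}
```

In link-based mode the targets only become known once the next handler has set its Link headers, so that mode needs a different approach to satisfy the same rule.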

@vladbondarenko

Hi,

I just tried HTTP/2 push, but it's not working:

my.domain:443 {
  push / {
    /static/_cache/merged/d1da6bf60541e004556afd2b8358df2d.min.js
    /static/_cache/merged/911bac75bca05a0baeedac6b4ebef29a.min.css
  }
  gzip
  log /var/log/my.domain-access.log
  errors /var/log/my.domain-error.log
  proxy / 127.0.0.1:80 {
    header_upstream Host "my.domain"
    header_upstream X-Forwarded-Proto "https"
  }
  tls my.domain.crt my.domain.key
  root /var/www/html
}

But there are no Link headers for the pushed resources when GETting /, and the resources are not pushed.
Does push work in proxy mode?

@lishengzxc

@wendigo

www.lishengcn.cn {
  root /root/img
  push / https://img.lishengcn.cn/progress?percent=20
}

2017/04/12 21:15:30 Caddyfile:7 - Parse error: Unknown directive 'push'

Caddy 0.9.5

Am I doing something wrong?

@vladbondarenko

Use the latest code from git.

With it, push is a known directive, but it is not working at all.

BTW, I'm not sure, but I don't think you can push resources from a domain other than your own. You should push resources from www.lishengcn.cn and not from img.

@lishengzxc

lishengzxc commented Apr 12, 2017

@vladbondarenko

I installed caddy by curl https://getcaddy.com | bash and the version is 0.9.5, I think this is the latest version.

Not img? Do you mean only JavaScript and CSS?

@mholt
Member

mholt commented Apr 12, 2017

@lishengzxc That's the latest release but the version that has the push feature is only build-from-source for now.

@lishengzxc

@mholt Thx, I will try

@vladbondarenko

@mholt I'm using the latest from git. Please check #1573
