-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC 115: Enabling HTTP/2 on GOV.UK #115
Conversation
a0835eb
to
d269015
Compare
Thanks for the write up @Nooshu To manage expectations this line is not quite realistic (at least in the short term):
There are 3 types of things that are served off assets.publishing.service.gov.uk
Of these 3, the first one looks the easiest to move to the www hostname and should provide the main benefits you're looking to achieve with this since CSS, JS and core layout assets would all be served off the main assets. As these are one time resources we don't need to worry about old links or redirects. The second, asset manager, would be much hairier to move (as links are embedded within content) and would probably be something to consider as a later/distinct project. The third, static templates, would have no impact on the browser issues as those assets are loaded server side. Only thing is that it seems a bit strange that they're on an assets hostname and may make sense to be on a dedicated The fact that we have these 3 rather different types of things hosted on this one hostname has been a long term source of confusion. Some quick consequences I can think of: |
Thanks for clarification @kevindew. The priority is most definitely number 1. Serving the CSS, JS and fonts from the origin domain will be the biggest win for web performance. 2 & 3 could be a nice to have at a later date (depending on actual complexity). In theory these can be streamed over the same TCP connection using connection coalescing for many users. For others that don't support it they will still receive an improved performance from the changes made in point 1. I will reword the RFC accordingly. |
d269015
to
202ed04
Compare
202ed04
to
0a481b1
Compare
Drive-by comment as mentioned on slack, there's probably a security benefit to keeping the uploaded files (your 2) on a separate origin as defence in depth against uploaded malicious files. eg (and I suspect this it currently mitigated absent a bug) there are situations where files can be both something you might accept for upload (eg jpg) and html which could contain malicious js. demo: http://lcamtuf.coredump.cx/squirrel/ |
Whole RFC rewritten to incorporate #114 details, and current knowledge about CORS. |
4cbae70
to
31f6949
Compare
This sounds good Matt - thanks for the write up, and for delving into the complex world of CORS again. I think the next steps for this RFC should be to add more details of consequences to the proposal section and then to set a deadline. Since I've been working with you quite closely on this I'm probably in a good position to write about some of them. I'll pepper it with questions too.
Consequence of this would be that we would no longer be serving the following headers:
on every single response from the assets domain and anything that relies on this for a resource other than fonts could break. We don't know of anything that would break at this moment in time. We would move the setting of the Access-Control-Allow-Origin header from the assets nginx configuration to instead be moved to the nginx configuration for an applications individual assets, since that is where other asset pipeline nginx configuration is. We'll plan to add this header to fonts by matching the file extension with a rule somewhat like Would we continue service Access-Control-Allow-Methods and Access-Control-Allow-Headers with fonts too? and/or are they needed for other assets?
The RFC mostly covers the consequences of this. I guess the main one is the removal of the HTTP headers which is covered above. Any other impacts this may cause?
This one is probably the most consequential. To achieve this we would intend to move the serving of frontend applications static assets from https://assets.publishing.service.gov.uk/{app-name}/ to https://www.gov.uk/assets/{app-name}. For example: https://assets.publishing.service.gov.uk/static/govuk-template-print-1076519521c2fffbbf75ab3b0d3b32ee2d96ac7e9778f1cdfac1771eefd1a1c0.css -> https://www.gov.uk/assets/static/govuk-template-print-1076519521c2fffbbf75ab3b0d3b32ee2d96ac7e9778f1cdfac1771eefd1a1c0.css What this would involve is:
This would mean that all frontend apps except static would serve assets on a relative path (which simplifies a couple of things). We however would need static to continue serving assets on an absolute path in order to continue supporting heroku review apps (for example: government-frontend - to resolve this I'd propose setting an environment variable for static of STATIC_ASSET_ROOT which is set to appropriate gov.uk root for environment (e.g. www.gov.uk for prod, draft-orgin.gov.uk for draft prod, www.staging.publishing.service.gov.uk for staging). We'd then be left with two things served off the assets hostname: Asset Manager assets and static templates, where I think static feels an odd choice for this domain. It's already kind of strange because internal apps communicate with it on a static hostname whereas external apps (I presume exclusively heroku review apps) use assets.publishing. So I'd think as part of this we should try move static off the assets domain so that it can be exclusively for asset manager assets - this however may fall under nice to have as it just solves confusion problems about hosts purpose rather than distinct technical goals. Finally someone would need to do a bit of clean up of the ASSETS_ROOT env var in puppet and beyond to just check it's no longer used in the context of an apps static assets and only for asset manager. Consequences of this change would be:
Some of these should be alleviated by doing this gradually on an app by app basis rather than it being a full switch over one day. Other thing that might be useful here is to explain in the RFC is what the switch to HTTP2 looks like? is that just a change on fastly or will we need to make any changes within GOV.UK infrastructure or apps? Would it be done for both www and assets hostnames? Finally, it'd be good to flag a deadline on this either a week or 2 weeks is pretty normal and then we can do some shouting to try have this receive attention from people who might be able to identify any spots where things would break. |
EOT files are no longer going to be served to Internet Explorer once GOV.UK has migrated fully to the Design System. To EOT can be removed in the future. There are no plans that I am aware of to create another webfont format, so we won't be needing to add any more for the foreseeable future.
The lack of
We need feedback and ideas on this issue to understand if this is even possible to fix, and how it can be solved. The only viable solution I can think of at the moment is to remove Subresource Integrity (SRI) from the CSS / JS. That way we can keep the
I will add this. For HTTP/2 there are no changes that need to be made on any of the apps. We simply contact Fastly and they enable it on both production and integration (for the origin and assets domains). It isn't possible to enable for only production / only integration due to the Fastly CDN setup is currently configured. This discussion what had back in November 2018 when we first discussed HTTP/2. TODO:
|
Reworked the RFC after the latest information about |
rfc-115-enabling-http2-on-govuk.md
Outdated
|
||
We should look at assets served using this header. Note: Webfonts will need to be served with this header due to their [unique CORS requirements](https://www.w3.org/TR/css-fonts-3/#font-fetching-requirements). | ||
The `<origin>` value is very important. As this is where we would ideally like to specify: `*.gov.uk; *.herokuapp.com`. Unfortunately this header only accepts a **single** origin, and it **doesn't** recognise wildcard values. So we are in a situation where we need to use `*` (for SRI) and a specific origin value (e.g `https://government-frontend.herokuapp.com` for every unique Heroku app URL) at the same time. As far as I know this isn't possible to fulfil, so the `Access-Control-Allow-Origin: *` header must stay as is. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a Fastly blog post that suggests it might be possible to dynamically set the Access-Control-Allow-Origin
header based on req.http.Origin
at the CDN level – https://www.fastly.com/blog/caching-cors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent thanks @36degrees 👍
Depending on the complexity involved and the number of external domains the assets are used on, this should allow us to use SRI with crossorigin='use-credentials'
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be a bit tricky as we'd have to set-up dynamic hostname responses - as the heroku apps are generated based on pull requests. We'd also need similar configuration for users developing locally so we'd need the same again for dev.gov.uk hosts. I'm not sure if we need to set-up anything for localhost with a port though, does CORS treat localhost special?
I'd suggest we'd want to be really clear that we're benefiting from SRI before complicating the configuration to support it. It sounds like this RFC has determined we're not using it for the purpose it's intended - have we received any push back on this suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No pushback as of yet. Personally I'm in favor of keeping the Access-Control-Allow-Origin: *
on our fonts to allow Heroku (and any other services using the fonts to keep working), and remove SRI. As you say it isn't being used as intended (third-party scripts) and the path of least resistance / complications is removing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest we'd want to be really clear that we're benefiting from SRI before complicating the configuration to support it.
👍
It'd be a bit tricky as we'd have to set-up dynamic hostname responses - as the heroku apps are generated based on pull requests.
I'm not terribly familiar with VCL, but could you match against a regex to mimic wildcard support for e.g. *.herokuapp.com
? Something like…
if (req.http.Origin ~ "\.herokuapp.com$") {
set resp.http.Access-Control-Allow-Origin = req.http.Origin;
}
Changed made to the RFC following feedback:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me 👍 We should look about setting up a group of people to meet semi-regularly so we can implement this fully.
Agreed, Thanks @kevindew. I'm happy to raise the SRI / crossorigin changes this week. This initial step will allow us to enable H2 at the edge. Then we can look at prioritising the further optimisation points (Access-Control-Allow-Origin, assets domain changes). |
Thanks to the excellent work by @kevindew, all 13 of the GOV.UK apps now serve their CSS, fonts, JavaScript, and images from This comes with the following advantages:
For completeness I've included a set of performance differences before and after the change below. These tests were using a Samsung S4 on a 3G connection, at P95. This is a low specification device that has been used by approximately 55,000 'users' in the past 3 months. CalculatorsCollectionsEmail Alert FrontendFeedbackFinder FrontendFrontendGovernment FrontendInfo FrontendLicense FinderManuals FrontendService Manual FrontendSmart AnswersWhitehall |
TL;DR; HTTP/1.1 and HTTP/2 (currently disabled) performance on GOV.UK is sub-optimal due to our current setup. The RFC examines the reasons why this is and proposes changes to fix it.
👉 Rendered version
Deadline for comments: 27th January 2020 (2 weeks)