Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC 115: Enabling HTTP/2 on GOV.UK #115

Merged
merged 5 commits into from
Jan 27, 2020
Merged

RFC 115: Enabling HTTP/2 on GOV.UK #115

merged 5 commits into from
Jan 27, 2020

Conversation

Nooshu
Copy link

@Nooshu Nooshu commented Dec 18, 2019

TL;DR; HTTP/1.1 and HTTP/2 (currently disabled) performance on GOV.UK is sub-optimal due to our current setup. The RFC examines the reasons why this is and proposes changes to fix it.

👉 Rendered version

Deadline for comments: 27th January 2020 (2 weeks)

@Nooshu Nooshu force-pushed the remove-assets-domain branch from a0835eb to d269015 Compare December 19, 2019 06:57
@kevindew
Copy link
Member

Thanks for the write up @Nooshu

To manage expectations this line is not quite realistic (at least in the short term):

This RFC is a proposal to remove the asset domain (assets.publishing.service.gov.uk) completely and serve all GOV.UK assets off the main origin (www.gov.uk).

There are 3 types of things that are served off assets.publishing.service.gov.uk

  1. Static assets that are bundled and fingerprinted as part of an application, typically css/js/images/fonts - for example: https://assets.publishing.service.gov.uk/frontend/application-3a50640831cfc5b6056f9a26f0fe1c382711e09c758434756996c9fc18014d05.css served on https://www.gov.uk
  2. Files uploaded by government publishers that are served by asset manager - for example: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/image_data/file/94211/s300_Charles_Hay.jpg served on https://www.gov.uk/government/news/uk-hopes-to-further-strengthen-existing-ties-with-malaysia
  3. Compiled HTML templates served by static, which are pretty whacky - for example: https://assets.publishing.service.gov.uk/templates/core_layout.html.erb and these are typically consumed by applications in development e.g. https://government-frontend.herokuapp.com/

Of these 3, the first one looks the easiest to move to the www hostname and should provide the main benefits you're looking to achieve with this since CSS, JS and core layout assets would all be served off the main assets. As these are one time resources we don't need to worry about old links or redirects.

The second, asset manager, would be much hairier to move (as links are embedded within content) and would probably be something to consider as a later/distinct project.

The third, static templates, would have no impact on the browser issues as those assets are loaded server side. Only thing is that it seems a bit strange that they're on an assets hostname and may make sense to be on a dedicated static.publishing.service.gov.uk - though we've been trying to kill static off for years so might not warrant more effort put into it.

The fact that we have these 3 rather different types of things hosted on this one hostname has been a long term source of confusion.


Some quick consequences I can think of:

  • We'd see a large increase in requests to www hostname, which would impact CDN log sizes and thus any CDN log processing tasks we do
  • May need to tweak nginx rate limiting rules per hostname, as these vary between assets and www, and traffic would likely increase

@Nooshu
Copy link
Author

Nooshu commented Dec 19, 2019

Thanks for clarification @kevindew.

The priority is most definitely number 1. Serving the CSS, JS and fonts from the origin domain will be the biggest win for web performance.

2 & 3 could be a nice to have at a later date (depending on actual complexity). In theory these can be streamed over the same TCP connection using connection coalescing for many users. For others that don't support it they will still receive an improved performance from the changes made in point 1.

I will reword the RFC accordingly.

@Nooshu Nooshu changed the title RFC 115 remove the GOV.UK assets domain RFC 115 serve CSS,JS, and fonts from the origin (www.gov.uk) Dec 19, 2019
@Nooshu Nooshu changed the title RFC 115 serve CSS,JS, and fonts from the origin (www.gov.uk) RFC 115 serve CSS, JS, and fonts from the origin (www.gov.uk) Dec 19, 2019
@Nooshu Nooshu changed the title RFC 115 serve CSS, JS, and fonts from the origin (www.gov.uk) RFC 115 serve static asssets (CSS, JS, images, fonts) from the origin (www.gov.uk) Dec 19, 2019
@Nooshu Nooshu force-pushed the remove-assets-domain branch from d269015 to 202ed04 Compare December 19, 2019 11:35
@Nooshu Nooshu changed the title RFC 115 serve static asssets (CSS, JS, images, fonts) from the origin (www.gov.uk) RFC 115 serve static assets (CSS, JS, images, fonts) from the origin (www.gov.uk) Dec 19, 2019
@Nooshu Nooshu changed the title RFC 115 serve static assets (CSS, JS, images, fonts) from the origin (www.gov.uk) RFC 115: Serve static assets (CSS, JS, images, fonts) from the origin (www.gov.uk) Dec 19, 2019
@Nooshu Nooshu force-pushed the remove-assets-domain branch from 202ed04 to 0a481b1 Compare December 20, 2019 08:09
@david-ncsc
Copy link

Drive-by comment as mentioned on slack, there's probably a security benefit to keeping the uploaded files (your 2) on a separate origin as defence in depth against uploaded malicious files. eg (and I suspect this it currently mitigated absent a bug) there are situations where files can be both something you might accept for upload (eg jpg) and html which could contain malicious js. demo: http://lcamtuf.coredump.cx/squirrel/

@Nooshu Nooshu changed the title RFC 115: Serve static assets (CSS, JS, images, fonts) from the origin (www.gov.uk) RFC 115: Enabling HTTP/2 on GOV.UK Jan 8, 2020
@Nooshu
Copy link
Author

Nooshu commented Jan 8, 2020

Whole RFC rewritten to incorporate #114 details, and current knowledge about CORS.

@Nooshu Nooshu force-pushed the remove-assets-domain branch from 4cbae70 to 31f6949 Compare January 8, 2020 23:24
@kevindew
Copy link
Member

This sounds good Matt - thanks for the write up, and for delving into the complex world of CORS again.

I think the next steps for this RFC should be to add more details of consequences to the proposal section and then to set a deadline.

Since I've been working with you quite closely on this I'm probably in a good position to write about some of them. I'll pepper it with questions too.

  • Change the NGINX config to only serve font files (WOFF2, WOFF, EOT) with Access-Control-Allow-Origin: *.

Consequence of this would be that we would no longer be serving the following headers:

Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, OPTIONS
Access-Control-Allow-Headers: origin, authorization

on every single response from the assets domain and anything that relies on this for a resource other than fonts could break. We don't know of anything that would break at this moment in time.

We would move the setting of the Access-Control-Allow-Origin header from the assets nginx configuration to instead be moved to the nginx configuration for an applications individual assets, since that is where other asset pipeline nginx configuration is.

We'll plan to add this header to fonts by matching the file extension with a rule somewhat like location ~* \.(eot|otf|ttf|woff|woff2)$ thus a consequence would be that supporting new font formats with this header will require a change to this rule and if we were to serve fonts without an extension they'd also miss. Do we already know the list of extensions we'd support or do we need to dig into that?

Would we continue service Access-Control-Allow-Methods and Access-Control-Allow-Headers with fonts too? and/or are they needed for other assets?
Do we anticipate these changes causing any problems for Heroku review apps that use GOV.UK assets, for example: https://government-frontend.herokuapp.com/examples/answer/answer ?

  • Change crossorigin='anonymous' to crossorigin='use-credentials' for all CSS and JavaScript.

The RFC mostly covers the consequences of this. I guess the main one is the removal of the HTTP headers which is covered above. Any other impacts this may cause?

  • Serve static assets (CSS, JS, Fonts) from the origin (www.gov.uk).

This one is probably the most consequential. To achieve this we would intend to move the serving of frontend applications static assets from https://assets.publishing.service.gov.uk/{app-name}/ to https://www.gov.uk/assets/{app-name}. For example: https://assets.publishing.service.gov.uk/static/govuk-template-print-1076519521c2fffbbf75ab3b0d3b32ee2d96ac7e9778f1cdfac1771eefd1a1c0.css -> https://www.gov.uk/assets/static/govuk-template-print-1076519521c2fffbbf75ab3b0d3b32ee2d96ac7e9778f1cdfac1771eefd1a1c0.css

What this would involve is:

This would mean that all frontend apps except static would serve assets on a relative path (which simplifies a couple of things).

We however would need static to continue serving assets on an absolute path in order to continue supporting heroku review apps (for example: government-frontend - to resolve this I'd propose setting an environment variable for static of STATIC_ASSET_ROOT which is set to appropriate gov.uk root for environment (e.g. www.gov.uk for prod, draft-orgin.gov.uk for draft prod, www.staging.publishing.service.gov.uk for staging).

We'd then be left with two things served off the assets hostname: Asset Manager assets and static templates, where I think static feels an odd choice for this domain. It's already kind of strange because internal apps communicate with it on a static hostname whereas external apps (I presume exclusively heroku review apps) use assets.publishing. So I'd think as part of this we should try move static off the assets domain so that it can be exclusively for asset manager assets - this however may fall under nice to have as it just solves confusion problems about hosts purpose rather than distinct technical goals.

Finally someone would need to do a bit of clean up of the ASSETS_ROOT env var in puppet and beyond to just check it's no longer used in the context of an apps static assets and only for asset manager.

Consequences of this change would be:

  • Traffic to www hostname would increase whereas assets traffic would decrease, which would cause differences in aggregate traffic monitoring and could potentially trigger some alerts;
  • May need to adjust nginx rules about rates/connections as they currently differ between www and assets;
  • Any other apps that hotlink to our static assets would get 404s, which shouldn't be an actual concern as this is not supported or encouraged.

Some of these should be alleviated by doing this gradually on an app by app basis rather than it being a full switch over one day.


Other thing that might be useful here is to explain in the RFC is what the switch to HTTP2 looks like? is that just a change on fastly or will we need to make any changes within GOV.UK infrastructure or apps? Would it be done for both www and assets hostnames?


Finally, it'd be good to flag a deadline on this either a week or 2 weeks is pretty normal and then we can do some shouting to try have this receive attention from people who might be able to identify any spots where things would break.

@Nooshu
Copy link
Author

Nooshu commented Jan 10, 2020

Do we already know the list of extensions we'd support or do we need to dig into that?
We only need to support the following extensions:

  • EOT - For Internet Explorer only (6-11)
  • WOFF - For all modern versions of "evergreen" browsers and IE9-11.
  • WOFF2 - All modern browsers since ~2016. it's usage trumps the use of WOFF.

EOT files are no longer going to be served to Internet Explorer once GOV.UK has migrated fully to the Design System. To EOT can be removed in the future. There are no plans that I am aware of to create another webfont format, so we won't be needing to add any more for the foreseeable future.

Would we continue service Access-Control-Allow-Methods and Access-Control-Allow-Headers with fonts too? and/or are they needed for other assets?

Access-Control-Allow-Methods is only used for preflight request (checks to see if the CORS protocol is understood and a server is aware using specific methods and headers.). And as I believe we are only using "simple requests" (HEAD, GET, POST), I don't believe it will be needed.
Access-Control-Allow-Headers is another header that is used in preflight requests, so as above I don't think it will be needed.

Do we anticipate these changes causing any problems for Heroku review apps that use GOV.UK assets...

The lack of Access-Control-Allow-Origin: * on JS assets is going to cause a problem for Heroku previews. Since we are asking the browser to fetch JS on Heroku that comes from a totally different domain. The Access-Control-Allow-Origin only allows a single domain in this field, and each app is on a different Heroku app. So we'd need to have the header updated on "integration" to specify the Heroku app is being used in. e.g.:

Access-Control-Allow-Origin: https://government-frontend.herokuapp.com
Access-Control-Allow-Origin: https://govuk-frontend-app-pr-1234.herokuapp.com

We need feedback and ideas on this issue to understand if this is even possible to fix, and how it can be solved.

The only viable solution I can think of at the moment is to remove Subresource Integrity (SRI) from the CSS / JS. That way we can keep the Access-Control-Allow-Origin: * header included, and remove the need for crossorigin='anonymous' completely. This will then allow H2 connection coalescing as is the minimum web performance requirement for us to enable HTTP/2. If this is the case then RFC needs to be rewritten as the NGINX changes won't be (as) required (although I do think we should be more granular with these headers, so there's an advantage to doing the work)

Other thing that might be useful here is to explain in the RFC is what the switch to HTTP2 looks like? is that just a change on fastly or will we need to make any changes within GOV.UK infrastructure or apps? Would it be done for both www and assets hostnames?

I will add this. For HTTP/2 there are no changes that need to be made on any of the apps. We simply contact Fastly and they enable it on both production and integration (for the origin and assets domains). It isn't possible to enable for only production / only integration due to the Fastly CDN setup is currently configured. This discussion what had back in November 2018 when we first discussed HTTP/2.

TODO:

  • Add more details of consequences to the proposal section.
  • Set a deadline of 2 weeks.
  • Add details about enabling H2.
  • Details about the impact of SRI and how it causes more work.

@Nooshu
Copy link
Author

Nooshu commented Jan 13, 2020

Reworked the RFC after the latest information about Access-Control-Allow-Origin: * and Heroku usage. Note: the NGINX setup for fonts in the comments above is no longer required so can be dismissed.


We should look at assets served using this header. Note: Webfonts will need to be served with this header due to their [unique CORS requirements](https://www.w3.org/TR/css-fonts-3/#font-fetching-requirements).
The `<origin>` value is very important. As this is where we would ideally like to specify: `*.gov.uk; *.herokuapp.com`. Unfortunately this header only accepts a **single** origin, and it **doesn't** recognise wildcard values. So we are in a situation where we need to use `*` (for SRI) and a specific origin value (e.g `https://government-frontend.herokuapp.com` for every unique Heroku app URL) at the same time. As far as I know this isn't possible to fulfil, so the `Access-Control-Allow-Origin: *` header must stay as is.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a Fastly blog post that suggests it might be possible to dynamically set the Access-Control-Allow-Origin header based on req.http.Origin at the CDN level – https://www.fastly.com/blog/caching-cors

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent thanks @36degrees 👍

Depending on the complexity involved and the number of external domains the assets are used on, this should allow us to use SRI with crossorigin='use-credentials'.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be a bit tricky as we'd have to set-up dynamic hostname responses - as the heroku apps are generated based on pull requests. We'd also need similar configuration for users developing locally so we'd need the same again for dev.gov.uk hosts. I'm not sure if we need to set-up anything for localhost with a port though, does CORS treat localhost special?

I'd suggest we'd want to be really clear that we're benefiting from SRI before complicating the configuration to support it. It sounds like this RFC has determined we're not using it for the purpose it's intended - have we received any push back on this suggestion?

Copy link
Author

@Nooshu Nooshu Jan 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No pushback as of yet. Personally I'm in favor of keeping the Access-Control-Allow-Origin: * on our fonts to allow Heroku (and any other services using the fonts to keep working), and remove SRI. As you say it isn't being used as intended (third-party scripts) and the path of least resistance / complications is removing it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest we'd want to be really clear that we're benefiting from SRI before complicating the configuration to support it.

👍

It'd be a bit tricky as we'd have to set-up dynamic hostname responses - as the heroku apps are generated based on pull requests.

I'm not terribly familiar with VCL, but could you match against a regex to mimic wildcard support for e.g. *.herokuapp.com? Something like…

if (req.http.Origin ~ "\.herokuapp.com$") {
    set resp.http.Access-Control-Allow-Origin = req.http.Origin;
  }

@Nooshu
Copy link
Author

Nooshu commented Jan 15, 2020

Changed made to the RFC following feedback:

  • Added information about the fonts we are serving.
  • Clarified the need for crossorigin, Access-Control-Allow-Origin: *, only with SRI.
  • Clarified that we only need the Access-Control-Allow-Origin: * header sent with our font files.
  • Included information about the potential to set the Access-Control-Allow-Origin header dynamically on the CDN, and why it may be a complex change.
  • Updated the MUST, SHOULD, and Consequences to reflect changes.

Copy link
Member

@kevindew kevindew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me 👍 We should look about setting up a group of people to meet semi-regularly so we can implement this fully.

@Nooshu
Copy link
Author

Nooshu commented Jan 27, 2020

Agreed, Thanks @kevindew. I'm happy to raise the SRI / crossorigin changes this week. This initial step will allow us to enable H2 at the edge. Then we can look at prioritising the further optimisation points (Access-Control-Allow-Origin, assets domain changes).

@Nooshu Nooshu merged commit d8a77be into master Jan 27, 2020
@Nooshu
Copy link
Author

Nooshu commented Jun 5, 2020

Thanks to the excellent work by @kevindew, all 13 of the GOV.UK apps now serve their CSS, fonts, JavaScript, and images from www.gov.uk rather than assets.publishing.service.gov.uk.

This comes with the following advantages:

  • All asset downloads can utilise the TCP connection that has already been negotiated. This minimises the impact of TCP slow start
  • Since all assets are on the same domain, HTTP/2 prioritisation can work for all assets
  • There's no need to establish a 2nd TCP connection. A single connection is all that is required.
  • HTTP/2 connection coalescing is no longer a requirement to take advantage of HTTP/2 features. Some browsers still have issues with this.

For completeness I've included a set of performance differences before and after the change below. These tests were using a Samsung S4 on a 3G connection, at P95. This is a low specification device that has been used by approximately 55,000 'users' in the past 3 months.

Calculators

calculators

Collections

collections

Email Alert Frontend

email-alert-frontend

Feedback

feedback

Finder Frontend

finder-frontend

Frontend

frontend

Government Frontend

government-frontend

Info Frontend

info-frontend

License Finder

licence-finder

Manuals Frontend

manuals-frontend

Service Manual Frontend

service-manual-frontend

Smart Answers

smart-answers

Whitehall

whitehall

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants