Finalize assignments: Chapter 20. HTTP/2 #22

rviscomi · 2019-05-21T01:55:28Z

Section	Chapter	Authors	Reviewers
IV. Content Distribution	20. HTTP/2	@bazzadp	@bagder @rmarx @dotjs

Due date: To help us stay on schedule, please complete the action items in this issue by June 3.

To do:

Assign subject matter expert (author)
Assign peer reviewers
Finalize metrics

Current list of metrics:

Adoption rate of HTTP/2 by site (home page only) and by requests (all requests on the page) over the years. Trend graph over all available years.
Measure of HTTP version negotiated (0.9, 1.0, 1.1, 2, gQUIC) for main page of all sites, and for HTTPS sites. Table for last crawl. For example:

Version	All sites	HTTPS only sites
HTTP/0.9	0%	0%
HTTP/1.0	2%	0%
HTTP/1.1	48%	20%
HTTP/2	44%	70%
gQUIC	6%	10%

For gQUIC it will be sites that return Alt-Svc HTTP Header which starts with quic.

Average percentage of resources loaded over HTTP/2 (or gQUIC) versus HTTP/1.1 per site. Trend graph over all available years.
Number of HTTP (not HTTPS) sites which return upgrade HTTP header containing h2. Once off stat for last crawl.
Number of HTTPS sites using HTTP/2 which return upgrade HTTP header containing h2. Once off stat for last crawl.
Number of HTTPS sites not using HTTP/2 which return upgrade HTTP header containing h2. Once off stat for last crawl.
% of sites affected by CDN prioritization issues (H2 and served by CDN) - https://github.com/andydavies/http2-prioritization-issues#cdns--cloud-hosting-services. If not possible then maybe just list sites by CDN and can then manually vlookup from table in Andy's github issue? Once off stat for last crawl.
Count of HTTP/2 sites grouped by server HTTP header value but strip version numbers (e.g. Apache and Apache 2.4.28 and Apache 2.4.29 should all report as Apache, but Apache Tomcat should report as Tomcat. Probably need to massage the results to achieve this). Once off stat for last crawl.
Count of non-HTTP/2 sites grouped by server HTTP header value but strip version numbers. Once off stat for last crawl.
Count of HTTP/2 sites which use HTTP/2 Push. Trend graph over all available years.
Average number of HTTP/2 Pushed resources and average bytes. Once off stat for last crawl.
Count and number of bytes pushed by asset type (CSS, JS, Images...etc.). Once off stat for last crawl.
Count of preload HTTP Headers with nopush attribute set. Once off stat for last crawl.
Is it possible to see HTTP/2 Pushed resources which are not used on the page load?
Measure number of TCP Connections per site. Average number of domains per site still going down year on year as per HTTP Archive State of the Web report? Trend graph over all available years.
Measure average number of TCP Connections per site for HTTP/1.1 sites versus HTTP/2 sites. Once off stat for last crawl.
Count of HTTP/2 sites grouped by SETTINGS_MAX_CONCURRENT_STREAMS (including HTTP/2 sites which don't set this value). Note this was added recently as per Finalize assignments: Chapter 20. HTTP/2 #22 (comment). Once off stat for last crawl.
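The two server-header metrics in the list above ask for version numbers to be stripped before grouping. A minimal sketch of that clean-up, assuming the raw `Server` values have already been exported as plain strings. The function names `normalize_server` and `count_by_server` are hypothetical, and the Tomcat rule covers only the example given in the list:

```python
import re
from collections import Counter


def normalize_server(value):
    """Reduce a raw Server header value to a product family name."""
    cleaned = value.strip()
    if not cleaned:
        return "(not set)"
    # "Apache Tomcat" should report as Tomcat, so check for it before
    # falling back to the first product token.
    if "tomcat" in cleaned.lower():
        return "Tomcat"
    # Keep the text before the first space, slash or bracket, which drops
    # suffixes such as "/2.4.29", " 2.4.28" and "(Unix)".
    token = re.split(r"[\s/(]", cleaned, maxsplit=1)[0]
    return token or cleaned


def count_by_server(values):
    """Count sites per normalized Server value."""
    return Counter(normalize_server(v) for v in values)


if __name__ == "__main__":
    sample = ["Apache", "Apache 2.4.28", "Apache/2.4.29 (Unix)",
              "Apache Tomcat", "nginx/1.13.1", ""]
    for name, total in count_by_server(sample).most_common():
        print(name, total)
```

Real crawl data will contain many more spellings than this handles, so the results would probably still need manual review.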

👉 AI (@bazzadp): Finalize which metrics you might like to include in an annual "state of HTTP/2" report powered by HTTP Archive. Community contributors have initially sketched out a few ideas to get the ball rolling, but it's up to you, the subject matter experts, to know exactly which metrics we should be looking at. You can use the brainstorming doc to explore ideas.

The metrics should paint a holistic, data-driven picture of the HTTP/2 landscape. The HTTP Archive does have its limitations and blind spots, so if there are metrics out of scope it's still good to identify them now during the brainstorming phase. We can make a note of them in the final report so readers understand why they're not discussed and the HTTP Archive team can make an effort to improve our telemetry for next year's Almanac.

Next steps: Over the next couple of months analysts will write the queries and generate the results, then hand everything off to you to write up your interpretation of the data.

Additional resources:

rviscomi · 2019-05-21T02:11:38Z

Hey @mnot, Paul says he reached out to you and you may be interested in being the designated subject matter expert for the H2 chapter. You can learn more about the Almanac project here.

Let me know if you have any questions about the time/deliverable expectations and if you're able to commit.

tunetheweb · 2019-05-21T22:01:27Z

I’m happy to help out here if you need help in this section? Just published a book on HTTP/2 (https://www.manning.com/books/http2-in-action) so spent last couple of years digging into this topic and what it’s meant in real world since launch. I don’t know HTTP Archive or BigQuery though, but know SQL very well so sure I can learn it with a few pointers in the right direction. Or just help interpret the results others serve up, or review writing or whatever.

rviscomi · 2019-05-21T22:29:47Z

Hey @bazzadp thanks for reaching out, we'd love to have you!

Sounds like you're a great fit for the subject matter expert role, driving the direction of the metrics included in the chapter and writing your interpretations. I'll put your name down.

tunetheweb · 2019-05-22T16:44:09Z

Potential metrics (just rough thoughts for now but will update):

Adoption rate (by site =~ 35% and by traffic =~ 60 %)
Measure of all HTTP versions (0.9, 1.0, 1.1, 2). What about gQUIC - see below?
How are QUIC sites (e.g. www.google.com on Chrome) reported? HTTP/2? QUIC+HTTP/2? Other?
% of sites affected by CDN prioritization issues (H2 and served by CDN)
Which CDN / server software are people using for HTTP/2? Check server HTTP header value of sites that support HTTP/2.
Push usage? Expect it to be low but would be good to actually quantify.
Domain sharding - is it becoming less common with HTTP/2? Average number of domains per site going down? Measure number of domains used by HTTP/2 sites this year and in previous years when not HTTP/2 enabled?
QUIC and HTTP/3 support (either h3 or quic in TLS ALPN record, or h3 or quic in Alt-Svc HTTP header)? Or one for next year? gQUIC has been here for some time and seen an uptick in CDN support for that so think we should include this year.
Are HTTP/2 settings data exposed in HTTP archive? E.g. max number of concurrent streams? Header Table Size?
Any way of measuring HPACK? Maybe link to compression chapter?

Also some stats here for validation that in right ballpark: https://http2.netray.io/stats.html but obviously as an HTTP Archive report we should use HTTP Archive stats.

rviscomi · 2019-05-22T17:07:17Z

That's a great list, big +1 to everything. @pmeenan could you help answer some of Barry's questions? (also let us know if you'd be interested in reviewing this chapter)

Adoption rate (by site =~ 35% and by traffic =~ 60 %)

Did you have something in mind to weigh adoption by traffic? The HTTP Archive dataset doesn't currently include popularity signals.

tunetheweb · 2019-05-22T17:15:04Z

And there's where my lack of HTTP Archive knowledge comes into play! :-)

This shows sites: https://w3techs.com/technologies/details/ce-http2/all/all
This shows usage: https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&measure=HTTP_RESPONSE_VERSION

So you can say 60% of web traffic is HTTP/2 but that's dominated by the big boys. Or you can say 35% of sites are HTTP/2. Both are correct but depends what metric you want.

This is something that probably should be decided at a project level as will affect all other chapters too. And if, as you say, HTTP Archive only has one then that's an easy question to answer!

But that begs the next question (also a project level question): should we just use HTTP Archive stats? Or also include other stats like the two examples above? I can understand if want to just use HTTP Archive but thought I'd ask the question in case artificially limiting myself here!

rviscomi · 2019-05-22T17:34:05Z

Great questions.

So you can say 60% of web traffic is HTTP/2 but that's dominated by the big boys. Or you can say 35% of sites are HTTP/2. Both are correct but depends what metric you want.

Let's stick with "% of websites" or "% of all requests". We include a chart of the latter in our State of the Web report on the website.

should we just use HTTP Archive stats? Or also include other stats like the two examples above? I can understand if want to just use HTTP Archive but thought I'd ask the question in case artificially limiting myself here!

I'd say let's exhaust all of the stats we can extract from the HTTP Archive dataset first, and if we still can't paint a complete picture, then it makes sense that we should pull in outside research and cite it accordingly. Things like % of sites vs traffic are just matters of perspective, but if we're missing out on a key metric then that's worth outsourcing.

bagder · 2019-05-23T19:50:00Z

Just a note on the "say 60% of web traffic" @bazzadp: that's 60% of all browser traffic. It might be worth considering that HTTP/2 is only ever attempted when doing HTTPS, which is now on around 80% of the browser page loads, so about 75% of the HTTPS loads done by Firefox use HTTP/2.
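The arithmetic behind that 75% figure, as a small sketch. It assumes, as stated above, that browsers only attempt HTTP/2 over HTTPS, so every HTTP/2 load is also an HTTPS load. The function name is hypothetical and the inputs are the rounded figures quoted in this thread:

```python
def share_of_https_loads(h2_share_of_all_loads, https_share_of_all_loads):
    """Convert 'share of all loads' into 'share of HTTPS loads'.

    Valid only under the assumption that every HTTP/2 load is also an
    HTTPS load.
    """
    if not 0 < https_share_of_all_loads <= 1:
        raise ValueError("HTTPS share must be in (0, 1]")
    if not 0 <= h2_share_of_all_loads <= https_share_of_all_loads:
        raise ValueError("HTTP/2 share cannot exceed the HTTPS share")
    return h2_share_of_all_loads / https_share_of_all_loads


if __name__ == "__main__":
    # Rounded figures quoted in this thread: 60% HTTP/2, about 80% HTTPS.
    print(f"{share_of_https_loads(0.60, 0.80):.0%}")
```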

tunetheweb · 2019-05-23T20:09:52Z

Very valid point!

@bagder has also agreed to review this section @rviscomi so could you update the original comment here and the other matrix?

rviscomi · 2019-05-23T20:13:29Z

Sounds great, I've added @bagder and sent an invitation to the @HTTPArchive/reviewers team.

tunetheweb · 2019-05-23T20:16:02Z

You’ve a typo in his username: @bagder Am sure he gets this a lot - I know I’ve mistyped it like that before! :-)

bagder · 2019-05-23T20:16:37Z

I'm mr typo. 😁

rviscomi · 2019-05-23T20:17:09Z

Hah! That explains the autocomplete fail :)

andydavies · 2019-05-23T20:24:41Z

Back in Jan about 26% of the traffic over Akamai's network was H2 - https://developer.akamai.com/blog/2019/01/31/http2-discover-performance-impacts-effective-prioritization

bagder · 2019-05-23T20:27:00Z

Regarding "Average number of domains per site going down?", I know HTTP Archive already tracks the number of TCP connections needed, which is of course more a result of the number of domains used and/or HTTP/2-"unsharding" of them.

tunetheweb · 2019-05-23T20:34:23Z

Back in Jan about 26% of the traffic over Akamai's network was H2 - https://developer.akamai.com/blog/2019/01/31/http2-discover-performance-impacts-effective-prioritization

I was wondering why so low? I would have expected CDN traffic to be ahead of average, not behind, assuming it’s on by default for HTTPS users. But I see they only enabled that since March (https://blogs.akamai.com/2019/03/http2-will-be-automatically-enabled-by-default-on-the-akamai-intelligent-edge-platform.html). Wonder what that percentage is now since that change?

rmarx · 2019-05-23T21:21:59Z

I am also interested in acting as a reviewer for this chapter :)

pmeenan · 2019-05-24T13:53:08Z

How are QUIC sites (e.g. www.google.com on Chrome) reported? HTTP/2? QUIC+HTTP/2? Other?

QUIC and HTTP/3 support (either h3 or quic in TLS ALPN record, or h3 or quic in Alt-Svc HTTP header)? Or one for next year? gQUIC has been here for some time and seen an uptick in CDN support for that so think we should include this year.

I think the "alt-svc" response header is the only thing we capture that can help with this (at least without processing the raw trace files). The ALPN details for a connection aren't kept though it might be possible to add later.
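Since the Alt-Svc response header is the signal available here, a rough sketch of how it could be checked for QUIC. This assumes the header value has already been extracted as a string. The function names are hypothetical and the sample values in the demo are illustrative, not taken from the crawl:

```python
def split_outside_quotes(value, sep=","):
    """Split on sep, ignoring separators inside double-quoted strings."""
    parts, current, in_quotes = [], [], False
    for ch in value:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == sep and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def advertised_protocols(alt_svc):
    """Return the lower-cased protocol ids listed in an Alt-Svc value."""
    protocols = []
    for entry in split_outside_quotes(alt_svc):
        entry = entry.strip()
        if not entry or entry.lower() == "clear":
            continue
        # Each entry looks like protocol-id="host:port"; parameters
        protocols.append(entry.split("=", 1)[0].strip().lower())
    return protocols


def advertises(alt_svc, prefixes=("quic",)):
    """True if any advertised protocol id starts with one of the prefixes."""
    return any(p.startswith(tuple(prefixes))
               for p in advertised_protocols(alt_svc))


if __name__ == "__main__":
    print(advertises('quic=":443"; ma=2592000; v="46,43"'))
    print(advertises('h2=":443"; ma=3600'))
```

Passing `prefixes=("quic", "h3")` would also count HTTP/3 drafts, if the chapter decides to include them.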

Push usage? Expect it to be low but would be good to actually quantify.

You might be shocked and dismayed: because of the automatic translation from "preload" response header to PUSH, it happens WAY more often than I'd like. It may be a small overall % of sites but it would be interesting to deep dive into the distribution of number of pushed resources and bytes for those that do use push.

Are HTTP/2 settings data exposed in HTTP archive? E.g. max number of concurrent streams? Header Table Size?

No, not currently anyway.

Any way of measuring HPACK? Maybe link to compression chapter?

As in number of slots available or something else?

For the ALPN and H2 settings that might be of interest, if you file an issue with wptagent I may be able to add the connection-level protocol details (or at least whatever I can extract from the netlog).

pmeenan · 2019-05-24T18:31:28Z

I just added a few connection-level fields to the WebPageTest data collection. They will only be reported for the first request on a given connection (same request that has the connect timings):

"http2_server_settings":{
  "SETTINGS_MAX_HEADER_LIST_SIZE": 16384,
  "SETTINGS_MAX_CONCURRENT_STREAMS": 100,
  "SETTINGS_INITIAL_WINDOW_SIZE": 1048576
},
"tls_resumed": "False",
"tls_next_proto": "h2",
"tls_cipher_suite": 4865,
"tls_version": "TLS 1.3",

The June 1 crawl will include the data. I didn't see any other TLS or H2 session-level settings exposed in the netlog but hopefully this helps.

pmeenan · 2019-05-24T19:59:15Z

More thinking what's the saving thanks to HPACK? A bit of an update to this blog post. Can you think of any way to report on this?

If the netlog included the raw HTTP/2 frame events I could calculate the size of the HEADERS frame relative to the decoded headers but looking through the raw netlog events it doesn't look like it does. At best I could infer it from the socket bytes events right beside it but that may include other frames as well and feels too fragile.

mnot · 2019-05-28T07:27:43Z

It looks like this is in good hands; I'm going to put some suggestions in a few other places.

rviscomi · 2019-05-28T14:15:11Z

Thanks @mnot! You're still welcome to contribute to this chapter as a coauthor or reviewer if interested.

tunetheweb · 2019-05-28T14:53:45Z

Definitely! Or if you've any thoughts on what stats to measure then let us know.

dotjs · 2019-05-29T16:06:06Z

Happy to be added as a reviewer.

rviscomi · 2019-05-29T16:11:29Z

Thanks @dotjs! Happy to add you as a reviewer. Is your first/last name public anywhere?

dotjs · 2019-05-29T16:22:22Z

Yes, Andrew Galloni - Updated my profile

rviscomi · 2019-05-29T16:29:58Z

👍 👍 Thanks!

tunetheweb · 2019-05-30T22:01:51Z

I've updated the stats to the following:

Adoption rate (by site =~ 35% and by requests)
Measure of all HTTP versions (0.9, 1.0, 1.1, 2) for all sites.
Measure of all HTTP versions (0.9, 1.0, 1.1, 2) for HTTPS sites.
Number of HTTP (not HTTPS) sites which return upgrade HTTP header containing h2
Number of HTTPS sites using HTTP/2 which return upgrade HTTP header containing h2
Number of HTTPS sites not using HTTP/2 which return upgrade HTTP header containing h2
% of sites affected by CDN prioritization issues (H2 and served by CDN) - https://github.com/andydavies/http2-prioritization-issues#cdns--cloud-hosting-services
Count of HTTP/2 sites grouped by server HTTP header value but strip version numbers.
Count of non-HTTP/2 sites grouped by server HTTP header value but strip version numbers.
Count of HTTP/2 sites which use HTTP/2 Push.
Average number of HTTP/2 Pushed resources and average bytes. By Desktop and Mobile.
Is it possible to see HTTP/2 Pushed resources which are not used on the page load?
Domain sharding - is it becoming less common with HTTP/2? Measure number of TCP Connections per site. Average number of domains per site going down? Is it possible to measure the number of domains used by HTTP/2 sites this year and in previous years when not HTTP/2 enabled?
QUIC - % of sites which return Alt-Svc HTTP Header which starts with quic.
Count of HTTP/2 sites grouped by SETTINGS_MAX_CONCURRENT_STREAMS (including sites which don't set this value).

The current HTTP Archive State of the Web lists mobile and desktop but think only the number and bytes pushed should differ between mobile and desktop.

Any other suggestions from anyone? Particularly the reviewers (@bagder , @rmarx , @dotjs)?

@rviscomi , @pmeenan - can you see any of these being a problem? Not sure I'll use all of these, depending on whether they show interesting information or not, so if any are particularly hard to get, or there's too many stats, then let me know.

dotjs · 2019-05-31T15:34:51Z

Percentage of requests over h2 compared to h1
Number of requests per TCP connection by type and for h2 how many hosts.
What I would like to get a handle on is how many connections and what type are utilised for various performance measures.

Server push

bytes by content type. Interested in what is being pushed
number of client resets
number of no push headers sent

HPACK

+1 for a way to measure compression ratio for headers
table sizes

mnot · 2019-05-31T23:27:52Z

Capturing the mix of h2 and h1 on a single page load would also be interesting, as would the total number of connections per page load in relation to that.

rviscomi · 2019-06-01T00:25:11Z

Is it possible to see HTTP/2 Pushed resources which are not used on the page load?

Not familiar with server push or how it appears in WPT results. Two questions:

What are the distinguishing characteristics of a pushed resource?
When the client attempts to use a pushed resource, is there some kind of artifact left in the network logs, similar to how a 304 response tells the client to use what it has in cache?

Is it possible to measure the number of domains used by HTTP/2 sites this year and in previous years when not HTTP/2 enabled?

To complicate things, we've been increasing our sample size ~8x since last year, so many of the sites in today's dataset were not available last year. So this metric might not be reliable.

tunetheweb · 2019-06-01T11:35:48Z

Incorporated some of these.

@dotjs some comments on yours:

"number of client resets" not sure HTTP Archive would be the best measure of this as it's a crawler and presumably crawls with an empty cache? Think we'd need some sort of RUM measurement here (Mozilla Firefox Telemetry?) to get meaningful stats out of this.
"table sizes" not sure how to measure these, or what you're trying to get out of this? Nginx, for example, which doesn't use the dynamic table as I understand, doesn't send a table size of 0 in its SETTINGS frame.

@rviscomi, a pushed resource has a "SERVER PUSHED" attribute in WebPageTest as shown below.

When a client uses a pushed resource it sets the Initiator to "Push/Other" in Chrome Dev tools Network tab:

When an asset is pushed that is NOT actually needed by the page, it doesn't show in Dev Tools Network tab at all (but is hidden in the net-internals page). Though this is complicated if the preload header is used (which is often also used as a signal to push). In this case the very presence of the preload header means Chrome thinks it is needed by the page and so does show it in the Network tab. Sigh it's complicated...

WebPageTest seems to always show pushed resources (whether preload header is included or not), but can't see any way it indicates if an asset is pushed, but then not subsequently referenced on the page.

@pmeenan not sure if you've any thoughts on whether possible to measure unnecessarily pushed resources?

dotjs · 2019-06-03T14:47:05Z

I was considering the encoding table sizes. For nginx there is a patch that sets the default table size to 4096 https://github.com/cloudflare/sslconfig/blob/hpack_1.13.1/patches/nginx_1.13.1_http2_hpack.patch
for example.

rviscomi · 2019-06-03T17:35:30Z

@bazzadp thanks for the context. It sounds like detecting unused pushes should be possible using the resource metadata in HTTP Archive.

access to headers to detect SERVER PUSHED
access to request initiators to detect used pushes
access to headers to detect preload

So we can check for resources that have been pushed but not initiated or preloaded.
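That check could be sketched roughly as follows. The record shape is an assumption made for illustration: each request is a dict with hypothetical `url`, `pushed` and `initiator` keys, and the preloaded URLs are passed in separately. Whether the crawl data actually records unused pushes in a usable form is still an open question in this thread:

```python
def unused_pushes(requests, preloaded_urls):
    """Return URLs that were pushed but neither initiated nor preloaded.

    requests: iterable of dicts with hypothetical keys
        "url" (str), "pushed" (bool), "initiator" (str or None).
    preloaded_urls: iterable of URLs named in preload headers.
    """
    preloaded = set(preloaded_urls)
    return [
        req["url"]
        for req in requests
        if req.get("pushed")
        and not req.get("initiator")
        and req["url"] not in preloaded
    ]


if __name__ == "__main__":
    sample = [
        {"url": "/used.css", "pushed": True, "initiator": "/index.html"},
        {"url": "/preloaded.js", "pushed": True, "initiator": None},
        {"url": "/wasted.png", "pushed": True, "initiator": None},
        {"url": "/normal.js", "pushed": False, "initiator": "/index.html"},
    ]
    print(unused_pushes(sample, ["/preloaded.js"]))
```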

@pmeenan does this sound accurate?

tunetheweb · 2019-06-03T20:36:36Z

I was considering the encoding table sizes. For nginx there is a patch that sets the default table size to 4096 https://github.com/cloudflare/sslconfig/blob/hpack_1.13.1/patches/nginx_1.13.1_http2_hpack.patch
for example.

Yeah, as I say I was aware of that and so I tested that on an unpatched Nginx, expecting a table size of 0 in the initial connection settings but didn't see that. I presume therefore that, without this patch, Nginx handles indexed headers on incoming requests, but just doesn't use them on responses? If so there is no need to explicitly set a table size of 0 if it never uses "indexed header" type in responses, which would explain my observations: the table size is left at the default but just never used for responses. Which means it is not possible to measure this metric (though I agree it would be a good one if we could!).

There's a lot of assumptions in there, so happy to be proven wrong if someone actually knows this or can explain it better?

tunetheweb · 2019-06-03T20:43:37Z

@bazzadp thanks for the context. It sounds like detecting unused pushes should be possible using the resource metadata in HTTP Archive.

Excellent if that's the case! I've left that in there as one of the metrics. So that list in the first comment is all I can think of, so have marked the "Finalise metrics" tickbox as done.

If anyone has any other comments or suggestions in next few hours (or even if after!), then let us know.

@rviscomi should I close this issue?

rviscomi · 2019-06-03T20:48:14Z

Woohoo, I think you're the first author to finish your metrics (even before me! 😅). Yes, we're ready to close this issue.

Next step will be for the analysts to review your metrics more carefully. That process will happen on the HTTP Archive discussion forum at https://discuss.httparchive.org. For now, it would be great for you to create an account there if you haven't already so we can @ you in the discussion if needed. I'll also be creating a new tracking issue (and corresponding spreadsheet) to monitor the progress of each metric, which I'll share with you and tag you in when ready.

tunetheweb · 2019-06-03T20:58:57Z

For now, it would be great for you to create an account there if you haven't already so we can @ you in the discussion if needed. I'll also be creating a new tracking issue (and corresponding spreadsheet) to monitor the progress of each metric, which I'll share with you and tag you in when ready.

Done. My username in there is "tunetheweb". Probably should change my GitHub username too to match what I more commonly go by nowadays, but have it referenced in a few places so would prefer not to. Hope it doesn't cause too much confusion!

tunetheweb · 2019-06-24T22:26:02Z

Hi @paulcalvano, just had a nosey at how you were getting on triaging these metrics and wanted to clarify a few stats that you currently have down as Not feasible/Needs more info:

20.2 - Measure of all HTTP versions (0.9, 1.0, 1.1, 2, QUIC) for main page of all sites, and for HTTPS sites. Table for last crawl.
We can only see the negotiated protocol, not all of the versions supported.

Badly worded on my part so have reworded in first comment above: #22 (comment). I meant the negotiated version for all home pages crawled and not necessarily all the versions supported by that page/site (I presume we will negotiate maximum supported version and every site will support all versions beneath the negotiated version with the possible exception of 0.9). See the example table I created to show what I'm looking for:

Version	All sites	HTTPS only sites
HTTP/0.9	0%	0%
HTTP/1.0	2%	0%
HTTP/1.1	48%	20%
HTTP/2	44%	70%
gQUIC	6%	10%

As you can see I don't list that HTTP/1.0 is probably supported by 100% of sites but only the 2% of sites that negotiate using that (these are totally made up stats btw but don't think they will be too far off). It's somewhat similar to the first stat requested (the adoption of HTTP/2 over time) but also looks at sites on older versions of HTTP and newer version (QUIC), and I also wanted to look at HTTP/2 usage by HTTP versus HTTPS. I just didn't want to cloud the first stat graph with all that noise hence why I put these in a separate second stat.

As per above example table, I suspect most will be HTTP/1.1 or HTTP/2 with a smaller number on gQUIC. Mozilla telemetry suggests some sites still use HTTP/1.0 but they might be internal sites or assets rather than main page so wouldn't be surprised if they don't show in our stats at all. And I don’t expect any to use HTTP/0.9.

20.7 - % of sites affected by CDN prioritization issues (H2 and served by CDN).
Not sure if this is possible with HA data.

Yeah this one wasn't mine but would be interesting to know. Maybe just list HTTP/2 sites by top CDNs (similar stats to 17.1 and 17.2?) and then can manually vlookup based on the known bad ones from Andy's github listing? Not sure how we know if a site is served by a CDN (server header? IP address range?) but if you can get it for the CDN chapter in stats 17.1 and 17.2 then presume there is some way :-)

20.14 - Is it possible to see HTTP/2 Pushed resources which are not used on the page load?
We only see the resources that were used in HA data since rejected push promises are not logged in the network panel.

Fair enough, thought this one might be difficult. There were some comments above in #22 (comment) but it sounds tricky to be honest so happy to skip.

Count of HTTP/2 sites grouped by SETTINGS_MAX_CONCURRENT_STREAMS (including sites which don't set this value). Once off stat for last crawl.
We don't have H2 frame data

@pmeenan added this stat as per #22 (comment) above so stats should be in June crawl. Note if this value is not explicitly set at connection set up like in that example, then it defaults to unlimited so will need to account for that. Also this stat should only be captured for HTTP/2 sites.
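A rough sketch of that grouping, using the field names from the sample fragment earlier in this thread (`http2_server_settings`, `SETTINGS_MAX_CONCURRENT_STREAMS`, `tls_next_proto`). Two things are assumptions: that each site is represented by the record for the first request on its connection, and that `tls_next_proto == "h2"` is an adequate test for an HTTP/2 site. The function name and the `UNLIMITED` label are hypothetical:

```python
from collections import Counter

# Label for HTTP/2 connections that never sent the setting, which means
# no limit was announced.
UNLIMITED = "unlimited (not set)"


def group_by_max_streams(first_requests):
    """Count HTTP/2 sites per SETTINGS_MAX_CONCURRENT_STREAMS value."""
    counts = Counter()
    for req in first_requests:
        # Only HTTP/2 connections are in scope for this metric.
        if req.get("tls_next_proto") != "h2":
            continue
        settings = req.get("http2_server_settings") or {}
        value = settings.get("SETTINGS_MAX_CONCURRENT_STREAMS", UNLIMITED)
        counts[value] += 1
    return counts


if __name__ == "__main__":
    sample = [
        {"tls_next_proto": "h2",
         "http2_server_settings": {"SETTINGS_MAX_CONCURRENT_STREAMS": 100}},
        {"tls_next_proto": "h2",
         "http2_server_settings": {"SETTINGS_INITIAL_WINDOW_SIZE": 1048576}},
        {"tls_next_proto": "http/1.1"},
    ]
    for value, total in group_by_max_streams(sample).items():
        print(value, total)
```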

Hope that clarifies some things and allows us to get some more of these. Give me a shout if anything is not clear. And of course, if they are still too difficult to get then we can live without them.

Thanks,
Barry

pmeenan · 2019-06-24T22:39:04Z

For the prioritization issues, WebPageTest runs its CDN detection as part of the crawl and should get pretty good coverage. One possible issue will be what origin(s) to look at for a given page? Just checking the page's origin is probably safest but will miss the cases where the static content is served by a different CDN (like all of shopify for example).

tunetheweb · 2019-06-24T23:37:26Z

For the prioritization issues, WebPageTest runs its CDN detection as part of the crawl and should get pretty good coverage.

This is based on identifying the CDN via these server headers being set I presume? Was always curious how this worked!

One possible issue will be what origin(s) to look at for a given page? Just checking the page's origin is probably safest but will miss the cases where the static content is served by a different CDN (like all of shopify for example).

Yeah think best we can probably do is test the website home page and accept it's not 100% accurate. Trying to figure out the "most used CDN" for a web page to identify the shopify scenario is probably overly complicated.

pmeenan · 2019-06-24T23:48:25Z

The headers are a fallback. Main method is the CNAME mappings right above that (and reverse-IP lookup).
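The ordering described here (CNAME mapping first, response headers only as a fallback) might look something like the sketch below. The lookup tables and the `detect_cdn` name are hypothetical placeholders, not wptagent's real lists, and the reverse-IP lookup step is left out:

```python
# Hypothetical lookup tables for illustration only; the real mappings
# are maintained in wptagent and are much longer.
CNAME_SUFFIXES = {
    ".cdn.example.net": "Example CDN",
    ".edge.example.org": "Other CDN",
}
HEADER_HINTS = {
    "x-example-cdn": "Example CDN",
}


def detect_cdn(cname_chain, response_headers):
    """Return a CDN name, preferring CNAME evidence over header evidence."""
    # Primary method: match each name in the CNAME chain against the
    # known suffixes.
    for name in cname_chain:
        host = name.lower().rstrip(".")
        for suffix, provider in CNAME_SUFFIXES.items():
            if host.endswith(suffix):
                return provider
    # Fallback: look for a header name that is known to identify a CDN.
    header_names = {h.lower() for h in response_headers}
    for hint, provider in HEADER_HINTS.items():
        if hint in header_names:
            return provider
    return None


if __name__ == "__main__":
    print(detect_cdn(["www.example.com", "abc.edge.example.org."], {}))
    print(detect_cdn(["www.example.com"], {"X-Example-CDN": "hit"}))
    print(detect_cdn(["www.example.com"], {"Server": "nginx"}))
```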

rviscomi assigned rviscomi and paulcalvano May 21, 2019

rviscomi transferred this issue from HTTPArchive/httparchive.org May 21, 2019

rviscomi added this to the Chapter planning complete milestone May 21, 2019

rviscomi changed the title ~~[Web Almanac] Finalize assignments: Chapter 20. HTTP/2~~ Finalize assignments: Chapter 20. HTTP/2 May 21, 2019

rviscomi assigned tunetheweb and unassigned rviscomi and paulcalvano May 21, 2019

rviscomi mentioned this issue May 23, 2019

Assign subject matter experts and peer reviewers to each chapter #2

Closed

tunetheweb mentioned this issue Jun 1, 2019

Finalize assignments: Chapter 8. Security #10

Closed

3 tasks

tunetheweb closed this as completed Jun 3, 2019

rviscomi mentioned this issue Jul 23, 2019

Query metrics: Chapter 20. HTTP/2 #101

Closed

14 tasks

rviscomi mentioned this issue Sep 25, 2019

Write content: Chapter 20. HTTP/2 #173

Closed

3 tasks

tunetheweb mentioned this issue Jun 29, 2020

HTTP/2 2020 #921

Closed

10 tasks

tunetheweb mentioned this issue Jul 26, 2020

HTTP/2 2020 queries #1098

Merged

17 tasks
