Media 2024 #3596

nrllh · 2024-03-02T18:29:02Z

Media 2024

If you're interested in contributing to the Media chapter of the 2024 Web Almanac, please reply to this issue and indicate which role or roles best fit your interest and availability: author, reviewer, analyst, and/or editor. You might be interested in exploring the changes to this year's version here.

Content team

Lead	Authors	Reviewers	Analysts	Editors	Coordinator
@stefanjudis	@stefanjudis	@svgeesus, @nhoizey, @eeeps	@foolip, @nucliweb, @eeeps, @scottjehl	@MichaelLewittes	@turban1988

Expand for more information about each role 👀

The content team lead is the chapter owner and responsible for setting the scope of the chapter and managing contributors' day-to-day progress.
Authors are subject matter experts and lead the content direction for each chapter. Chapters typically have one or two authors. Authors are responsible for planning the outline of the chapter, analyzing stats and trends, and writing the annual report.
Reviewers are also subject matter experts and assist authors with technical reviews during the planning, analyzing, and writing phases.
Analysts are responsible for researching the stats and trends used throughout the Almanac. Analysts work closely with authors and reviewers during the planning phase to give direction on the types of stats that are possible from the dataset, and during the analyzing/writing phases to ensure that the stats are used correctly.
Editors are technical writers who have a penchant for both technical and non-technical content correctness. Editors have a mastery of the English language and work closely with authors to help wordsmith content and ensure that everything fits together as a cohesive unit.
The section coordinator is the overall owner for all chapters within a section like "User Experience" or "Page Content" and helps to keep each chapter on schedule.

Note: The time commitment for each role varies by the chapter's scope and complexity as well as the number of contributors.

For an overview of how the roles work together at each phase of the project, see the Chapter Lifecycle doc.

Milestone checklist

0. Form the content team

📆 April 15 Complete program and content committee - 🔑 Organizing committee
- The content team has at least one author, reviewer, and analyst.

1. Plan content

📆 May 1 First meeting to outline the chapter contents - 🔑 Content team
- The content team has completed the chapter outline.

2. Gather data

📆 June 1 Custom metrics completed - 🔑 Analysts
- Analysts have added all necessary custom metrics and drafted a PR (example) to track query progress.
📆 June 1 HTTP Archive Crawl - 🔑 HA Team
- HTTP Archive runs the June crawl.

3. Validate results

📆 August 15 Query Metrics & Save Results - 🔑 Analysts
- Analysts have queried all metrics and saved the output.

4. Draft content

📆 September 15 First Draft of Chapter - 🔑 Authors
- Authors have written the chapter.
📆 October 10 Review & Edit Chapter - 🔑 Reviewers & Editors
- Reviewers and Editors have processed the chapter.

5. Publication

📆 October 15 Chapter Publication (Markdown & PR) - 🔑 Authors
- Authors have converted the chapter to markdown and drafted a PR.
📆 November 1 Launch of 2024 Web Almanac 🚀 - 🔑 Organizing committee

6. Virtual conference

📆 November 20 Virtual Conference - 🔑 Content Team

Chapter resources

Refer to these 2024 Media resources throughout the content creation process:
📄 Google Docs for outlining and drafting content
🔍 SQL files for committing the queries used during analysis
📊 Google Sheets for saving the results of queries
📝 Markdown file for publishing content and managing public metadata
💻 Colab notebook for collaborative coding in Python - if needed
💬 #web-almanac-media on Slack for team coordination


MichaelLewittes · 2024-03-03T04:55:44Z

I'd be happy to be the editor of the media section again. Know the drill -- and ready to do it once more.

foolip · 2024-03-03T09:17:12Z

I'm interested in contributing as an analyst.

I've been poking around at the question "what sizes and qualities do images on the web tend to be?" I've been running some of the queries in https://almanac.httparchive.org/en/2022/media again and have identified what I think are a few interesting angles.

First, the distribution of image sizes is much more uneven and interesting than https://almanac.httparchive.org/en/2022/media#image-dimensions suggests. Here's a histogram from a quick experiment I did in February:

Instead of megapixels I'm using sqrt(megapixels), so the equivalent width of a square image. I found this much easier to reason about, since much of the interesting action is in the 0-0.25 megapixel range. 300x300 images are the most common.
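
As a minimal sketch of the sqrt(megapixels) axis described above (the function name is hypothetical and this is not the query behind the histogram), the value is the side length of a square with the same pixel count, expressed in thousands of pixels:

```python
import math


def sqrt_megapixels(width_px: int, height_px: int) -> float:
    """Square root of the image area in megapixels.

    Equivalent to the side of a same-area square image, in units of 1000 px.
    """
    return math.sqrt(width_px * height_px / 1_000_000)


# A 300x300 image has 0.09 megapixels, so it sits near 0.3 on this axis,
# and the whole 0-0.25 megapixel range maps to 0-0.5.
for width, height in [(300, 300), (500, 500), (1000, 250)]:
    print(width, height, round(sqrt_megapixels(width, height), 3))
```

Plotting on this axis spreads out the small images that a linear megapixel axis would squeeze into the first bin.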

Second, BPP (bits/pixel) strongly depends on image size, with smaller images having higher BPP. The reasons I can see are (1) container overhead (2) more incentive to compress large images and (3) less detail in large images, as many small images are downscaled versions of the large ones.
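
To illustrate reason (1), here is a small sketch of how BPP is computed and how a fixed container overhead weighs on small images. The 500-byte overhead is a made-up number for illustration, not a measured value for any format:

```python
def bits_per_pixel(byte_size: int, width_px: int, height_px: int) -> float:
    """Bits per pixel: total file bits divided by the pixel count."""
    pixels = width_px * height_px
    if pixels <= 0:
        raise ValueError("image must have a positive pixel count")
    return byte_size * 8 / pixels


# Hypothetical fixed overhead (headers, metadata) of 500 bytes.
OVERHEAD_BYTES = 500

# Share of BPP that comes from the overhead alone:
small = bits_per_pixel(OVERHEAD_BYTES, 100, 100)    # 0.4 bits per pixel
large = bits_per_pixel(OVERHEAD_BYTES, 2000, 2000)  # 0.001 bits per pixel
print(small, large)
```

Under that assumption the same overhead costs a 100x100 image 400 times more BPP than a 2000x2000 image, before any difference in encoder settings.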

I think it would be interesting to try to understand quality both through BPP, while taking these effects into account, and by estimating the encoder settings used. I suspect the latter varies less with size, and at least for JPEG an estimate is possible due to how the format works. A first attempt yielded this:

I also shared this in #3572 (comment) and there are some words of caution about using ImageMagick's detected quality, but I think something useful could be done here.

foolip · 2024-03-03T09:19:39Z

A colleague made this useful observation:

noting that 300x300 is the default (medium) image size in WordPress and 82 is the default quality. This lines up exactly with the "most common" size and quality. Since WordPress sites are a large part of the dataset (~30%), they may be influencing the results. It might be interesting to see what images on non-WordPress sites look like.

svgeesus · 2024-03-05T15:28:43Z

I noticed this in the 2022 Media report

One caveat: AVIF and PNG allow tagging images with wide-gamut color spaces using format-specific shorthands, without using ICC profiles. We started down the path of trying to detect wide-gamut AVIFs and PNGs that don’t use ICC profiles, but accounting for the various ways they are encoded—and the ways our tooling reported on them—proved a bit too complex to tackle this year. Maybe next year!

Coding-independent code points (CICP) are a method that is simple to understand and use. CICP comes from the broadcast and video world and is also applicable to still images and short animations.

Given that:

- AVIF is using (and seems to prefer) CICP rather than ICC profiles
- PNG now supports CICP (see also this explainer)
- JPEG XL supports CICP
- Chrome implements CICP in PNG and AVIF
- Firefox intends to implement CICP for PNG (they already do for AVIF)
- Safari implements CICP (but only on the most recent macOS, they depend on the platform)
- Labeling HDR images is easy with CICP and currently hard with ICC, at least with ICC.1

Then the "various ways they are encoded" becomes a much more tractable "look for CICP in images" and I suggest this metric for the 2024 Media survey.
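
For PNG specifically, "look for CICP" could be sketched as a scan for the `cICP` chunk, which carries four one-byte fields (colour primaries, transfer function, matrix coefficients, video full range flag). This is only an illustration in Python with a hypothetical function name. It does not verify CRCs, does not cover AVIF or JPEG XL, and is not how the HTTP Archive tooling works:

```python
import struct
from typing import Optional, Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def find_png_cicp(data: bytes) -> Optional[Tuple[int, ...]]:
    """Return the four cICP field values if the PNG has a cICP chunk, else None."""
    if not data.startswith(PNG_SIGNATURE):
        return None
    offset = len(PNG_SIGNATURE)
    # Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
    while offset + 8 <= len(data):
        length, chunk_type = struct.unpack_from(">I4s", data, offset)
        body_start = offset + 8
        if chunk_type == b"cICP" and length == 4 and body_start + 4 <= len(data):
            return tuple(data[body_start:body_start + 4])
        # cICP has to appear before the image data, so stop scanning there.
        if chunk_type in (b"IDAT", b"IEND"):
            return None
        offset = body_start + length + 4  # skip chunk data and CRC
    return None
```

A survey metric could then count PNGs where this returns a value, and separately tally the primaries and transfer function codes that appear.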

Originally raised in

Detecting non-ICC wide-gamut PNG images in the wild w3c/png#211

svgeesus · 2024-03-05T15:59:51Z

I volunteer ~~as tribute~~ as a reviewer

nucliweb · 2024-04-04T16:49:27Z

Hi, I would love to contribute as an analyst.

nrllh · 2024-04-09T22:38:26Z

Hey @eeeps @akshay-ranganath @nhoizey @yoavweiss @MichaelLewittes - awesome contributors from previous years 🙂 Are you interested in joining us again this year?

MichaelLewittes · 2024-04-10T02:53:43Z

Would be honored to join again as the editor.


nhoizey · 2024-04-10T08:53:50Z

Hi @nrllh, I can indeed join this year once again, as a reviewer.

eeeps · 2024-04-10T16:12:17Z

@nrllh I can join as an Analyst and Reviewer, but do not have the bandwidth to Lead or Author again this year.

nrllh · 2024-04-11T09:11:26Z

thank you, @MichaelLewittes, @nhoizey, @eeeps!

turban1988 · 2024-04-22T10:53:57Z

Hi @rey-dal,
Thank you very much for volunteering to lead the writing of this chapter! Could you please organize a kick-off meeting for this chapter (example: #3603 (comment)) to organize the writing of the chapter?

Furthermore, it would be helpful if you and all other contributors (@svgeesus, @nhoizey, @eeeps, @foolip, @nucliweb, @MichaelLewittes) could join the HTTP Archive Slack workspace (https://join.slack.com/t/httparchive/shared_invite/zt-2hfkn28ts-~uXN4UGS0mXsKpzzhtZcow)

Thanks!

scottjehl · 2024-05-21T20:50:02Z

I'd love to see included in this year's report if there's been any uptick in responsive video usage now that support has returned across browsers. That is, how many sites are using video source elements with media attributes and what sizes are they commonly serving? Happy to help if there's any way I can!
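
A rough sketch of what such a check could look like for a single page's HTML, using Python's standard `html.parser`. The class name is hypothetical, and the actual Almanac metrics are gathered differently (through custom metrics on the HTTP Archive crawl), so this only illustrates the idea:

```python
from html.parser import HTMLParser


class ResponsiveVideoCounter(HTMLParser):
    """Collects media attributes from <source> elements nested in <video>."""

    def __init__(self):
        super().__init__()
        self._video_depth = 0
        self.video_count = 0
        self.sources_with_media = []

    def handle_starttag(self, tag, attrs):
        if tag == "video":
            self._video_depth += 1
            self.video_count += 1
        elif tag == "source" and self._video_depth > 0:
            # Only <source> inside <video> counts; <picture> sources are ignored.
            media = dict(attrs).get("media")
            if media:
                self.sources_with_media.append(media)

    def handle_endtag(self, tag):
        if tag == "video" and self._video_depth > 0:
            self._video_depth -= 1


counter = ResponsiveVideoCounter()
counter.feed(
    '<video><source src="small.mp4" media="(max-width: 600px)">'
    '<source src="large.mp4"></video>'
)
print(counter.video_count, counter.sources_with_media)
```

The collected media queries could then be grouped by breakpoint to see which sizes sites commonly target.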

nrllh · 2024-06-14T09:53:15Z

Unfortunately, we currently have no authors for this chapter. Are any of you (@svgeesus, @nhoizey, @eeeps, @foolip, @nucliweb, @MichaelLewittes) interested in contributing to this chapter as an author?

svgeesus · 2024-06-17T16:36:09Z

Unfortunately I am overcommitted right now, so can't take this on.

stefanjudis · 2024-07-01T19:49:07Z

@svgeesus, @nhoizey, @eeeps, @foolip, @nucliweb, @MichaelLewittes, @turban1988

Hi friends! 👋 I am very excited about this and looking forward to collaborating with all of you! As far as I know, we're a bit behind the original schedule, so let's kick things off quickly (if possible).

To start off, I would like to schedule a 30-60 minute meeting to start the planning and brainstorming process. So please provide your availability here for the next two weeks: https://doodle.com/meeting/participate/id/erMwP4kd

I checked the time zones of everyone involved and chose options in the European evening that should work for the US.

Also, here is an agenda for what we might want to discuss on the kickoff call: https://docs.google.com/document/d/11lk8wSjs9PQXlWhv1FYeDynhrBxQNOUza85fjC5oB1k/edit (please request access). Feel free to add points, since I haven't led any Web Almanac-related activities so far. 😅

The goal of the meeting will be to quickly get to know each other, set new deadlines and define our preferred workflow.

Speaking of the chapter content: I'll summarize all statistics and data points from the previous year in the gdoc above. Ideally, you could give some thought on which metrics you'd like to drop / adjust but also what new things we should add. :)

Also a gentle reminder to join the #web-almanac-media channel on Slack (https://join.slack.com/t/httparchive/shared_invite/zt-2lx22qow3-pkcEJltSqtyP9_86V4uTZQ)

nhoizey · 2024-07-02T06:38:44Z

Thanks @stefanjudis for taking the lead! 🙏

stefanjudis · 2024-07-05T06:50:17Z

@nhoizey @eeeps @nucliweb @MichaelLewittes

Thank you for filling out the Doodle. We had a very clear winner: Thursday next week at 11 AM GMT-7 (SF) / 2 PM GMT-4 (NYC) / 8 PM GMT+2 (Berlin). 🎉

@svgeesus @foolip @turban1988

If you want to join, please let me know and I'll invite you too. :)

The folks joining the Kick-off call already have access to a living document that we'll use going forward. If someone wants to have access, too, just request it via Google. :)

Looking forward to catching up with you all!

svgeesus · 2024-07-05T17:03:10Z

I just filled out the doodle

stefanjudis · 2024-07-19T09:22:35Z

Hey friends, sorry for the slight delay. Here's a summary of what we discussed in our kickoff call and how we'll proceed. Thank you @nhoizey @eeeps @MichaelLewittes @svgeesus for attending the call.

And FYI @foolip @nucliweb.

1. PROJECT MANAGEMENT: We decided that we'll use this GitHub issue for project management. So you'll hear from me here about anything timeline and task related. Actual content discussions will happen in the existing Google Doc (request access if you don't have it yet).

2. DATA SOURCE: As we're late and the latest data set was already aggregated, we decided that we'll primarily work off the last Almanac's Media chapter. We'll try to keep it lean for now and will get into newly added data points if there's time and capacity.

Possible new data point ideas

Scott Jehl: I'd love to see included in this year's report if there's been any uptick in responsive video usage now that support has returned across browsers. That is, how many sites are using video source elements with media attributes and what sizes are they commonly serving?
Chris Lilley: Then the "various ways they are encoded" becomes a much more tractable "look for CICP in images" and I suggest this metric for the 2024 Media survey. CICP is used in PNG, JPEG-XL and AVIF (that I know of, maybe others)
Stefan Judis: is loading=lazy a thing yet. Are libraries adopting this convention already?
Nicolas Hoizey: Maybe we could add (next year?) details about which types are used in <source>s, in which order. (I've seen type switching with JPEG first… 😅)
Nicolas Hoizey: The number of <source>s for art direction?
Nicolas Hoizey: whether the latest is useful or would be enough, for both use cases (another frequent mistake)?

3. DATA ACCESS: If you want to crunch data, please get access to the HTTP Archive GCP project by reaching out to Christian. More info is in the HTTP Archive Slack.

I just did and @eeeps did so too.

4. NEW / SAME SCHEDULE: When looking at the set dates we're super late. We missed the first three points but I think we will catch up to hit the following.

Validate results
- 📆 August 15 Query Metrics & Save Results - 🔑 Analysts have queried all metrics and saved the output.
Draft content
- 📆 September 15 First Draft of Chapter - 🔑 Authors have written the chapter.
- 📆 October 10 Review & Edit Chapter - 🔑 Reviewers & Editors have processed the chapter.
Publication
- 📆 October 15 Chapter Publication (Markdown & PR) - 🔑 Authors have converted the chapter to markdown and drafted a PR.
- 📆 November 1 Launch of 2024 Web Almanac 🚀

As agreed, to get back on track and make Aug 15, the analysts (@eeeps @foolip @nucliweb) have to evaluate whether all the queries from two years ago still work. Eric offered to show me, and possibly others, how to access and crunch the data. Here's another Doodle. 👇

https://doodle.com/meeting/participate/id/erMPW6Kd

So far I think it'll be only Eric and me, so I'll pick some times that are a bit later in the European time zone.

If we then still have time, we'll evaluate if / how new metrics could be added on the June data set.

That's it for now, I'll report back once we have some numbers. And if you have any questions, please let me know. Have a great weekend all! 👋

stefanjudis · 2024-08-05T19:37:26Z

Hey friends, @eeeps and I just had our first big query session. The queries from two years ago seem to work 🎉, and we made a game plan.

Looking at the deadline of finishing data crunching by Aug 15, we probably won't make it, but I'm targeting Sep 1 to have run all the queries and put the results into the official HTTP Archive Google Sheet. We're primarily reusing the existing queries, and if time allows, Eric will look into adding more queries or adjusting existing ones.

After the data is done, I'll share the writing progress early to receive feedback early. You can expect me to hand over a fairly rough draft. 🫣

@nhoizey @eeeps @MichaelLewittes @svgeesus @foolip @nucliweb

foolip · 2024-08-08T15:55:36Z

I am going on parental leave, which is great, but unfortunately that means I won't be able to contribute to this effort at all.

y-guyon · 2024-08-12T08:36:38Z

Hi all, I would like to suggest a refinement to the "Media - Bits per pixel by format" section of the next Web Almanac. I have referred to that great resource a lot for meaningful image codec performance comparisons, but the current version aggregates all image sizes, which makes it hard to draw conclusions from the results.

For example .jpg is listed with a median 2.1bpp, whereas images between 1 and 2 megapixels should be closer to 1.3bpp.
Would it be possible to split the results into a few buckets based on image dimensions? Categories such as 0 to 1 MP, 1 to 2 MP, 2 to 4 MP and 4+ MP would already be amply useful for the Chrome team.
Please find Philip's demo SQL snippet at https://gist.github.com/foolip/34ed26159b17db30abbbe4890200da36.
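
To make the request concrete, here is a minimal Python sketch of the proposed bucketing. It is not the linked SQL snippet; the function names are hypothetical, and putting an image of exactly 1 MP into the "1-2 MP" bucket is an assumption about how the boundaries would be handled:

```python
from collections import defaultdict
from statistics import median

# (lower bound inclusive, upper bound exclusive, label), in megapixels
BUCKETS = [
    (0, 1, "0-1 MP"),
    (1, 2, "1-2 MP"),
    (2, 4, "2-4 MP"),
    (4, float("inf"), "4+ MP"),
]


def bucket_for(megapixels: float) -> str:
    """Label of the size bucket that contains the given megapixel value."""
    for low, high, label in BUCKETS:
        if low <= megapixels < high:
            return label
    raise ValueError("megapixels must be non-negative")


def median_bpp_by_bucket(images):
    """images: iterable of (byte_size, width_px, height_px) tuples."""
    grouped = defaultdict(list)
    for byte_size, width_px, height_px in images:
        pixels = width_px * height_px
        if pixels <= 0:
            continue  # skip images without usable dimensions
        grouped[bucket_for(pixels / 1_000_000)].append(byte_size * 8 / pixels)
    return {label: median(values) for label, values in grouped.items()}
```

Run per format, this would give one median per size bucket instead of a single aggregate figure.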

Sorry for not having more time to contribute on this project and thanks everyone for the efforts here.
Let me know if this request should be recorded in a different thread.

stefanjudis · 2024-08-13T06:52:32Z

@y-guyon Thanks for proposing another way to look at the data. We might look into it, but we're pretty stretched thin and decided to go with the existing metric set, primarily. :)

stefanjudis · 2024-08-17T17:43:07Z

@foolip @nucliweb @eeeps @scottjehl

I was able to reuse most of the queries and the majority of data points from 2022 are now available for 2024. With that we almost caught up with the data aggregation deadline of Aug 15.

You can find the queried data here.

And here's the opened PR: #3738

Some queries are still running, and I'll query / refine more once @eeeps is back from his well-deserved vacation.

@svgeesus @nhoizey @eeeps @MichaelLewittes

I'll start drafting the chapter and plan to have something ready by Sep 15.

nrllh added the labels help wanted: reviewers, help wanted: analysts, help wanted: coauthors, and 2024 chapter on Mar 2, 2024

svgeesus mentioned this issue Mar 5, 2024

Detecting non-ICC wide-gamut PNG images in the wild w3c/png#211

Closed

github-actions bot mentioned this issue Mar 26, 2024

2024 Content Progress Tracker 🪄 #3634

Open

cqueern removed the labels help wanted: analysts and help wanted: reviewers on Apr 16, 2024

stefanjudis mentioned this issue Aug 17, 2024

Media 2024 queries #3738

Draft
