Subset fonts to used character data by aduth · Pull Request #10655 · 18F/identity-idp

aduth · 2024-05-20T14:37:38Z

🛠 Summary of changes

Optimizes fonts to remove unused character data.

This is similar to #6094, but optimizes font files more aggressively. With #6094, we used a predefined Latin character set to reduce the size of fonts. With this approach, we specifically target only the character glyphs that exist in localized string data.
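As a rough illustration of "only the character glyphs that exist in localized string data", the following is a minimal sketch and not the script from this pull request. It assumes flattened locale data whose values are strings or arrays of strings; `SAMPLE_STRINGS` and `used_glyphs` are hypothetical names.

```ruby
# Hypothetical stand-in for flattened locale data: each key maps to a string
# or to an array of strings.
SAMPLE_STRINGS = {
  'en.sign_in.title' => 'Sign in',
  'es.sign_in.title' => 'Iniciar sesión',
  'fr.steps' => ['Créer un compte', 'Vérifier'],
}.freeze

# Collect every distinct character used across all values and return them as
# one sorted string, similar in shape to a glyph list file.
def used_glyphs(strings_by_key)
  strings_by_key.values.flatten.join.chars.uniq.sort.join
end

puts used_glyphs(SAMPLE_STRINGS)
```

Any character that never appears in a value is left out of the result, which is what lets the subset shrink below a predefined character range.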

This also removes unused .woff files. As of USWDS v3.4.0, only .woff2 files are used.

For future consideration: We could optimize this even more aggressively if we created per-locale font subsets, to avoid loading French and Spanish character data for English users (and vice-versa, as applicable).

Why?

After application.css, these fonts are the second-largest first-party asset on https://secure.login.gov, each coming in around 21.5kb (vs. 22.5kb for application.css), for a combined total of 42.9kb font data on the homepage.

Performance Impact

The numbers below reflect the combined total of PublicSans-Regular.woff2 and PublicSans-Bold.woff2, the most common font combination loaded on each page (and preloaded by default).

Before: 42.9kb (21.5kb + 21.4kb)
After: 31kb (15.5kb + 15.5kb)
Diff: -11.9kb (-27.7%)
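The totals above can be re-derived from the per-file sizes. This is only a worked check of the arithmetic; `savings_summary` is a hypothetical helper, and the sizes are the rounded figures quoted above.

```ruby
# Per-file sizes in kb, as quoted above (Regular, Bold).
BEFORE_KB = [21.5, 21.4].freeze
AFTER_KB = [15.5, 15.5].freeze

# Summarize the combined size before and after, plus the absolute and
# relative difference, rounded to one decimal place.
def savings_summary(before, after)
  before_total = before.sum
  after_total = after.sum
  diff = after_total - before_total
  percent = diff / before_total * 100
  format(
    'Before: %.1fkb, After: %.1fkb, Diff: %.1fkb (%.1f%%)',
    before_total, after_total, diff, percent
  )
end

puts savings_summary(BEFORE_KB, AFTER_KB)
```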

📜 Testing Plan

Verify that text is still rendered as Public Sans:

1. Go to http://localhost:3000 or any page of the application
2. Inspect some text using your browser's developer tools
3. Observe that the rendered font is using the Public Sans network resource
   - In Chrome, this is in the bottom "Rendered Fonts" section under "Computed" styles

zachmargolis · 2024-05-20T15:20:22Z

app/assets/fonts/glyphs.txt

@@ -0,0 +1 @@
+ !#%&(),-./0123456789:;?@ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz «»¿ÀÁÈÉÊÎÓÚàáâãçèéêëíîïñóôùúû ‑—‘’“”…‹中体文简


Is there a way we could do this dynamically at deploy time? This seems like an annoying thing to fail the build on

I do think it's burdensome, yes. But realistically I also think this is something that will almost never happen.

aduth · 2024-05-20T15:23:54Z

Chatting with @mitchellhenke offline about this, one consideration is if we could account for locale data from Rails and other third-party dependencies.

Some options to consider:

- Since additional characters are more likely in non-English locales, subset specifically for English only, and continue to use predefined set for Spanish & French
- (If possible) Initialize i18n backend and load all data, then dump string values from in-memory backend
- Scrape content from all files loaded via I18n.load_path
- Manually verify but dismiss as non-concern if these other sources are unlikely to include additional characters
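To make the "scrape content from all files" option concrete, here is a minimal sketch using only Ruby's standard library. It is an assumption-laden illustration: `paths` stands in for `I18n.load_path`, `each_string` and `characters_in_locale_files` are hypothetical names, and only YAML files are read (load paths can also contain `.rb` locale files, which this skips).

```ruby
require 'yaml'

# Recursively yield every string found in nested hashes and arrays.
# Non-string leaves (numbers, symbols, nil) are ignored.
def each_string(value, &block)
  case value
  when String then yield value
  when Hash then value.each_value { |child| each_string(child, &block) }
  when Array then value.each { |child| each_string(child, &block) }
  end
end

# `paths` stands in for I18n.load_path (an assumption for this sketch).
# Returns the sorted, distinct characters used by string values in the files.
def characters_in_locale_files(paths)
  yaml_paths = paths.select { |path| path.to_s.end_with?('.yml', '.yaml') }
  yaml_paths.each_with_object([]) do |path, chars|
    # Some locale files contain symbols or YAML aliases, so permit both
    # rather than letting safe_load raise on them.
    data = YAML.safe_load(File.read(path), permitted_classes: [Symbol], aliases: true)
    each_string(data) { |string| chars.concat(string.chars) }
  end.uniq.sort
end
```

Compared with dumping values from an initialized backend, this approach does not need the application to boot, but it would not see strings defined in Ruby locale files.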

aduth · 2024-05-20T17:09:52Z

Prompted by #10655 (comment), I was curious what characters aren't included anymore, to get a sense of how likely it'd be for one of them to be reintroduced in the future.

These are the characters:

"$\'*+<=>[\\]^`{|}~¡¢£¤¥¦§¨©ª¬®¯°±²³´µ¶·¸¹º¼½¾ÂÃÄÅÆÇËÌÍÏÐÑÒÔÕÖ×ØÙÛÜÝÞßäåæìðòõö÷øüýþÿ

Edit: After 1f5f81c, this is reduced further to just these characters:

*+<=[\\]^`{|}¡¢£¤¥¦§¨©ª¬®¯°±²³´µ¶·¸¹º¼½¾ÂÃÄÅÆÇËÌÍÏÐÑÒÔÕÖ×ØÙÛÜÝÞßäåæìðòõö÷øüýþÿ

One interesting observation I made based off this is that our current fonts likely don't include the smart-quotes “ ” ‘ ’ that we enforce in our content, since they're not part of the Latin character set. So these changes may improve the appearance of content which includes those.
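A comparison like this can be reproduced with a plain set difference. The sketch below assumes the earlier predefined subset covered printable Basic Latin and Latin-1 Supplement (U+0020 to U+007E and U+00A1 to U+00FF); that range is a guess based on the characters listed above, not something confirmed in this thread. The glyph string is copied from the glyphs.txt hunk shown earlier, so its result may differ from the lists in this comment if that file changed in later commits.

```ruby
# Assumption: the previous subset was printable Basic Latin plus Latin-1
# Supplement. The real range used in #6094 is not shown in this thread.
PREVIOUS_SUBSET = ((0x20..0x7E).to_a + (0xA1..0xFF).to_a).map do |codepoint|
  codepoint.chr(Encoding::UTF_8)
end.freeze

# Glyph list copied from the glyphs.txt hunk earlier in this conversation.
CURRENT_GLYPHS = ' !#%&(),-./0123456789:;?@ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz «»¿ÀÁÈÉÊÎÓÚàáâãçèéêëíîïñóôùúû ‑—‘’“”…‹中体文简'.chars.freeze

# Characters the previous subset had that the new glyph list leaves out.
def dropped_characters(previous, current)
  (previous - current).join
end

# Characters the new glyph list has that the previous subset lacked.
def gained_characters(previous, current)
  (current - previous).join
end

puts "Dropped: #{dropped_characters(PREVIOUS_SUBSET, CURRENT_GLYPHS)}"
puts "Gained: #{gained_characters(PREVIOUS_SUBSET, CURRENT_GLYPHS)}"
```

Under that assumed range, the "gained" side is where characters such as the smart quotes would show up, since they sit outside Latin-1.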

zachmargolis · 2024-05-20T17:20:05Z

docs/frontend.md

+1. [Download Public Sans](https://public-sans.digital.gov/) and extract it to your project's `tmp/` directory
+2. Install [glyphhanger](https://github.com/zachleat/glyphhanger) and its dependencies:
+   1. `npm install -g glyphhanger`
+   2. `pip install fonttools brotli zopfli`


should we add a script that wraps this with something like a virtualenv setup?

Similar to my last comment, at this stage I'm not sure it's worth investing too much into the apparatus around updating the fonts, assuming that this should be an exceedingly rare event. If it happens more often than I'm expecting (like more often than once every year or two), then I could see that as a future enhancement worth considering.

aduth · 2024-05-20T18:22:43Z

scripts/yaml_characters

+I18n.backend.eager_load!
+
+data = I18n.backend.translations.slice(*I18n.available_locales - excluded_locales)


IMO we should do something similar to this in spec/i18n_spec.rb. Interestingly i18n-tasks doesn't load all string data eagerly like this, so we miss some strings in those specs. Having a more complete set of strings to test against could help prevent issues like one we encountered a few weeks back.

zachmargolis · 2024-05-20T18:41:54Z

makes sense to me to add that there

zachmargolis

The code looks good and I can't think of anything else we've missed, so it LGTM

Overall, I am not convinced if we need this PR?

Pro: I totally understand the reduction in filesize is great for first-time loads, but I'd expect that fonts would get cached and so subsequent page loads would not be fresh.
Con: My gut is that this process is on the border of "is juice worth the squeeze" because if we ever do introduce a new character, it appears a fairly annoying manual process to fix.
Con: If a character slips through, I'm not sure that we have any automated feedback mechanisms to help us know (since browsers would just fall back to a different font that has the glyphs, right?)

aduth · 2024-05-20T19:27:13Z

@zachmargolis Those are all totally valid considerations, and I do think we're reaching the point of diminishing returns on some of the low-hanging fruit with frontend performance. But that's also part of the reason for some of my pushback in comments, since the juice-vs-squeeze here should require this to be minimal investment for it to be worthwhile.

> Pro: I totally understand the reduction in filesize is great for first-time loads, but I'd expect that fonts would get cached and so subsequent page loads would not be fresh.

True, but the same argument can be (and I assume is) used to justify most apps' resource bloat, and when loading does need to occur, font loading is one of the most visually-noticeable changes and contributes to largest contentful paint metrics.

This is also seeking to optimize assuming assets are loaded in parallel, where the page load is as long as its largest assets, and fonts are currently neck-and-neck with the main application stylesheet in competing for largest page asset.

> Con: My gut is that this process is on the border of "is juice worth the squeeze" because if we ever do introduce a new character, it appears a fairly annoying manual process to fix.

True, though besides what I'd already commented about not expecting this to be common, I don't think it's that tedious a process? I also think it's different from something like vulnerable dependency updates failing builds on unrelated pull requests, since at least in this case the build should only fail as a direct result of the changes of the pull request, assuming someone is adding content including a character we don't already include in our fonts.

> Con: If a character slips through, I'm not sure that we have any automated feedback mechanisms to help us know (since browsers would just fall back to a different font that has the glyphs, right?)

I think this is actually improved by this pull request. We were already subsetting fonts as of #6094, and as mentioned in #10655 (comment) we already had characters that had slipped through and weren't being included in the subset fonts. At least now we have some test coverage to compare our content with what's included in the fonts.

And I think the other point about fallbacks is a reassurance that even in the worst-case scenario of the character being unavailable, the system will still render a fallback from the system fonts. So the overall risk is relatively low.
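For readers wondering what "test coverage to compare our content with what's included in the fonts" could look like, here is a minimal sketch. It is not the spec from this pull request; `missing_glyphs` and the sample data are hypothetical, and a real check would read the glyph list from the generated glyphs file and the content from the loaded locale data.

```ruby
# Return the sorted characters used in the content that are absent from the
# glyph list, so a build can fail with an actionable message.
def missing_glyphs(content_strings, glyphs)
  (content_strings.join.chars.uniq - glyphs.chars).sort
end

# Hypothetical sample data for illustration only.
sample_glyphs = ' abcdefghijklmnopqrstuvwxyz'
sample_content = ['sign in', 'créer un compte']

missing = missing_glyphs(sample_content, sample_glyphs)
unless missing.empty?
  warn "Characters missing from subset fonts: #{missing.join(' ')}"
end
```

Because the failure names the specific characters, the person adding the content can tell that the fonts need to be regenerated rather than discovering a silent fallback to system fonts later.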

aduth added 2 commits May 20, 2024 10:18

Subset fonts to used character data

757d6db

Optimize fonts

49c1c90

aduth added the performance label May 20, 2024

Remove unnecessary hash_values

7e7fa4d

This was added when considering support for legacy formats with nested hashes. Since we now target only flattened keys (where values are either strings or arrays of strings), use `flatten` instead to simplify

zachmargolis reviewed May 20, 2024

aduth added 2 commits May 20, 2024 13:12

Drop Nokogiri dependency, remove any HTML-like

a65fef5

Add changelog

24620e7

changelog: Internal, Performance, Optimize size of fonts to include only content character data

aduth marked this pull request as draft May 20, 2024 17:14

zachmargolis reviewed May 20, 2024

aduth added 6 commits May 20, 2024 13:38

Use glyphs from loaded locale data

f498c7f

Simplify hash_values

aebf6ee

Support excluding data from gem load paths

ad920a8

Use case for type-checking value

253ec84

Limit formats to woff2

fcbc46b

1. Avoid extra dependency 2. Faster to run since we're not creating file formats we don't need

Regenerate fonts after updated glyphs

1f5f81c

aduth marked this pull request as ready for review May 20, 2024 18:08

aduth commented May 20, 2024

zachmargolis approved these changes May 20, 2024

aduth and others added 4 commits May 20, 2024 15:32

Try referencing Rails directly

8755a93

May work better in CI

Add banner with usage help for script

ec92027

See: #10655 (comment) Co-authored-by: Zach Margolis <zachmargolis@users.noreply.github.com>

Try bundle exec shebang

77daeda

Exec Rails by code instead of shebang

675e0de

See: - https://stackoverflow.com/a/41194935 - http://solutions.davesource.com/20161216.Shebang-That-Calls-Ruby-Rails-Script-With-Arguments.html - rails/rails#665

zachmargolis reviewed May 21, 2024

aduth and others added 2 commits May 21, 2024 09:41

Require environment.rb to load Rails

bf89353

See: 675e0de#r1608353189 Co-authored-by: Zach Margolis <zachmargolis@users.noreply.github.com>

Remove redundant file extension

a643ddc

aduth merged commit d753d55 into main May 22, 2024

aduth deleted the aduth-subset-content branch May 22, 2024 12:17

matthinz mentioned this pull request May 23, 2024

Deploy RC 382 to Production #10692

Merged

aduth merged 17 commits into main from aduth-subset-content
