Serve WOFF 1.0 font files compressed? #42

Closed
alrra opened this issue Sep 3, 2014 · 4 comments

Comments

@alrra
Member

alrra commented Sep 3, 2014

According to the people from Zoompf (see blog post), it might be worth serving WOFF 1.0 font files compressed (even though they are compressed by default):

Yes, WOFF files are natively compressed. However, their compression is not very good. I have WOFF files reduced 30-50% using HTTP compression. Clearly the native compression falls short. In fact, one of the primary reasons WOFF File Format 2 was even created was “to provide improved compression and thus lower use of network bandwidth, while still allowing fast decompression even on mobile devices. This is achieved by combining a content-aware preprocessing step and improved entropy coding, compared to the Flate compression used in WOFF 1.0.”

See also: http://www.w3.org/TR/WOFF/


Being curious about this, I wanted to see some more data, so I generated some (the results are posted in the following comments).

How the data was generated

  1. Downloaded the fonts (+ variations of the fonts) from Google Fonts in TrueType format - .ttf (I didn't want to use the WOFF versions, as Google recently switched to using Zopfli, which makes the .woff files smaller than normal).
  2. Converted the .ttf files to .woff (+ recorded the size of each .woff file - Size (Original))
  3. Served the .woff files from Apache/2.4.7 (Ubuntu) with compression enabled (+ made requests to get the size of the compressed .woff files - Size (Compressed))
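
The measurement in step 3 can be approximated offline. This is a minimal sketch, assuming Python's standard `gzip` module as a stand-in for the server's HTTP compression; the numbers in this thread came from real Apache responses, not from this code. The function name `gzip_savings`, the compression level of 6, and the `font.woff` path mentioned below are assumptions for illustration.

```python
import gzip


def gzip_savings(woff_bytes: bytes, level: int = 6) -> dict:
    """Compare a WOFF file's size before and after an extra gzip pass.

    `level` is an assumed setting; the server used for the original
    measurements may have been configured differently.
    """
    original = len(woff_bytes)
    compressed = len(gzip.compress(woff_bytes, compresslevel=level))
    difference = original - compressed  # bytes saved (negative if gzip grew the file)
    percent = 100.0 * difference / original if original else 0.0
    return {
        "original": original,      # corresponds to Size (Original)
        "compressed": compressed,  # corresponds to Size (Compressed)
        "difference": difference,
        "percent": percent,
    }
```

To try it on a real file, read the bytes first, for example `gzip_savings(open("font.woff", "rb").read())`, where `font.woff` is a hypothetical path. The result will not match the server's output byte for byte, but it should be close enough to tell whether a font is worth a closer look.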

My results

Graphs

Expanding the 0-10% from the above graph, it looks like this:

My verdict

  • In general, it doesn't seem to be worth it.

Thoughts?

@igrigorik

@alrra do you have a CSV of the results handy? I'd like to see a distribution of savings.

@alrra
Member Author

alrra commented Sep 6, 2014

do you have a CSV

@igrigorik Yes of course, here it is (the download link was included in my first comment, but it seems that it wasn't obvious enough, so I made some changes).

@igrigorik

@alrra thanks! Digging into the data...

 quantile(difference, c(.25, .50, .75, .85, .90, .95, .98)) 
     25%      50%      75%      85%      90%      95%      98% 
   30.00    50.00   233.25   937.75  2605.90  7335.20 11469.48 

So, the median saving is ~50 bytes, and ~1 KB at the 85th percentile. That said, I think it's useful to look at the outliers, since that reveals an interesting pattern:

fonts <- subset(data, data$Difference > 10*1024, select = c(Font, Difference))
fonts[order(-fonts$Difference),]

                                            Font Difference
                     Jura (Normal + All Subsets)      40675
                    PT Sans (Bold + All Subsets)      40587
            PT Sans Caption (Bold + All Subsets)      40246
                  Jura (Semi-Bold + All Subsets)      39440
                      Jura (Light + All Subsets)      38862
             PT Sans Narrow (Bold + All Subsets)      38244
             PT Sans (Bold Italic + All Subsets)      37513
           PT Sans (Normal Italic + All Subsets)      37032
                  PT Sans (Normal + All Subsets)      36482
           PT Sans Narrow (Normal + All Subsets)      34833
          PT Sans Caption (Normal + All Subsets)      32183
         PT Serif Caption (Normal + All Subsets)      30578
                   PT Serif (Bold + All Subsets)      30557
            PT Serif (Bold Italic + All Subsets)      30112
  PT Serif Caption (Normal Italic + All Subsets)      30034
                 PT Serif (Normal + All Subsets)      29218
          PT Serif (Normal Italic + All Subsets)      29096
                  Open Sans (Bold + All Subsets)      22495
                 Open Sans (Light + All Subsets)      22481
                Open Sans (Normal + All Subsets)      22479
             Open Sans (Semi-Bold + All Subsets)      22479
            Open Sans (Extra Bold + All Subsets)      22472
       Open Sans Condensed (Light + All Subsets)      22470
     Open Sans (Extra Bold Italic + All Subsets)      22466
          Open Sans (Light Italic + All Subsets)      22455
      Open Sans (Semi-Bold Italic + All Subsets)      22454
         Open Sans (Normal Italic + All Subsets)      22452
           Open Sans (Bold Italic + All Subsets)      22443
Open Sans Condensed (Light Italic + All Subsets)      22425
                 Ek Mukta (Normal + All Subsets)      17448
        Open Sans Condensed (Bold + All Subsets)      16765
                  Ek Mukta (Light + All Subsets)      15710
                   Ek Mukta (Bold + All Subsets)      13533
             Ek Mukta (Extra Bold + All Subsets)      13523
              Ek Mukta (Semi-Bold + All Subsets)      13481
                  Trocchi (Normal + All Subsets)      13387
                       Lora (Bold + All Subsets)      13221
                Lora (Bold Italic + All Subsets)      12346
                                  Vibur (Normal)      11700
                    Vibur (Normal + All Subsets)      11700
                                  Cabin (Normal)      11652
                    Cabin (Normal + All Subsets)      11652
                                  Alice (Normal)      11577
                    Alice (Normal + All Subsets)      11577
                        Cabin (Semi-Bold Italic)      11321
          Cabin (Semi-Bold Italic + All Subsets)      11321
              Lora (Normal Italic + All Subsets)      11152
                     Lora (Normal + All Subsets)      11085
                                    Cabin (Bold)      11065
                      Cabin (Bold + All Subsets)      11065
                           Cabin (Normal Italic)      11043
             Cabin (Normal Italic + All Subsets)      11043
                             Cabin (Bold Italic)      11037
               Cabin (Bold Italic + All Subsets)      11037
                                     Lora (Bold)      10825
                          Cabin Condensed (Bold)      10676
            Cabin Condensed (Bold + All Subsets)      10676
                     Cabin Condensed (Semi-Bold)      10669
       Cabin Condensed (Semi-Bold + All Subsets)      10669
                        Cabin Condensed (Normal)      10647
          Cabin Condensed (Normal + All Subsets)      10647
                              Lora (Bold Italic)      10503
                                    Buda (Light)      10497
                      Buda (Light + All Subsets)      10497
                               Cabin (Semi-Bold)      10338
                 Cabin (Semi-Bold + All Subsets)      10338

Note that the major savings are all on "All subsets" variants - aka, full, unsubsetted font files. Intuitively, this makes sense, since gzip is likely finding some redundancies across the embedded font tables (WOFF 1.0 compresses each font table separately) and hence the filesize win.
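
The mechanism described above can be illustrated with toy data. This is a contrived sketch, not a WOFF parser: the two "tables" are byte-identical random blobs, which real font tables never are, and the name `cross_table_demo` is made up for this example. It only shows that redundancy shared between separately compressed chunks is invisible to per-chunk compression but visible to a second pass over the whole file.

```python
import gzip
import random
import zlib


def cross_table_demo(table_size: int = 4096) -> tuple:
    """Return (size with per-table compression, size after an extra gzip pass)."""
    rng = random.Random(0)  # fixed seed so the toy data is reproducible
    shared = bytes(rng.getrandbits(8) for _ in range(table_size))
    tables = [shared, shared]  # toy stand-ins for two tables with shared content

    # Per-table compression, as WOFF 1.0 does: each table is compressed on
    # its own, so the second copy cannot refer back to the first.
    per_table = b"".join(zlib.compress(t) for t in tables)

    # A second pass over the whole payload can spot the repeated stream.
    recompressed = gzip.compress(per_table)
    return len(per_table), len(recompressed)


if __name__ == "__main__":
    before, after = cross_table_demo()
    print("per-table:", before, "bytes; after extra gzip:", after, "bytes")
```

With identical tables the second pass removes close to half of the payload. Real fonts share far less between tables, which is consistent with the much smaller savings seen in the data.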

Does this warrant a general "gzip WOFF 1.0" recommendation? Personally, I think it's a good to know (tm), but not worth the double compress/decompress overhead for majority of fonts:

  • If you're subsetting your fonts then this is a moot discussion, modulo tens of bytes. And if you care about performance, you should be subsetting your fonts...
  • If you have multiple font subsets, there are two cases:
    1. For browsers that support unicode-range you can keep each subset as a separate resource and declare which codepoints each is responsible for. As a bonus, using separate resources also improves caching. Finally, browsers that support unicode-range also support WOFF 2.0, so that's another ~30% win on compression. In short, use unicode-range + WOFF 2.0.
    2. For older browsers that don't support unicode-range, you'll need to embed multiple subsets into a single resource, which means there may be some benefit to second layer of compression. The actual win will vary based on number of subsets, and glyphs in each subset: test it on your resource, see if the win is worth the extra layer of compression (CPU) overhead, and selectively apply gzip where appropriate.
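
The "test it on your resource and selectively apply gzip" advice could be turned into a simple rule. This is only a sketch: the function names and both thresholds (`min_bytes_saved`, `min_percent_saved`) are hypothetical values chosen for illustration, not recommendations from this thread, and the sizes in the usage note are made up.

```python
def should_gzip(original_size: int, compressed_size: int,
                min_bytes_saved: int = 1024,
                min_percent_saved: float = 5.0) -> bool:
    """Decide whether an extra gzip layer is worth its CPU cost for one font.

    Both thresholds are assumptions; tune them against your own traffic.
    """
    saved = original_size - compressed_size
    if original_size <= 0 or saved <= 0:
        return False
    percent = 100.0 * saved / original_size
    return saved >= min_bytes_saved and percent >= min_percent_saved


def select_for_gzip(sizes: dict) -> list:
    """Given {font name: (original, compressed)}, return the names worth gzipping."""
    return sorted(name for name, (orig, comp) in sizes.items()
                  if should_gzip(orig, comp))
```

For example, with made-up sizes, `select_for_gzip({"full.woff": (100000, 70000), "subset.woff": (20000, 19950)})` keeps only `full.woff`: the subsetted file saves just 50 bytes, which falls below both thresholds.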

alrra changed the title from "Serve WOFF font files compressed?" to "Serve WOFF 1.0 font files compressed?" on Sep 9, 2014
@alrra
Member Author

alrra commented Sep 9, 2014

Digging into the data...

@igrigorik Thank you, I sincerely appreciate that you took the time to look over this!

the major savings are all on "All subsets" variants - aka, full, unsubsetted font files

As expected, but in my humble opinion, that isn't the usual case, and I would be quite surprised to see that a significant number of people use something like that in production (I included the "+ All subsets" variants mainly to see if in those "extreme" cases I can get the 30-50% reduction, but the biggest was 33.44%, also being the only one over 30%).

Personally, I think it's a good to know (tm), but not worth the double compress/decompress overhead for majority of fonts

I agree.

For older browsers that don't support unicode-range, you'll need to embed multiple subsets into a single resource, which means there may be some benefit to second layer of compression. The actual win will vary based on number of subsets, and glyphs in each subset: test it on your resource, see if the win is worth the extra layer of compression (CPU) overhead, and selectively apply gzip where appropriate.

Plus, I think people should also consider using Zopfli before doing anything else.

Closing this issue as, in general, (re)compressing the .woff files doesn't seem to be worth it.
