Since the Zopfli output (for the gzip option) is valid gzip content, there doesn't seem to be a straightforward and foolproof way to identify files compressed with Zopfli.
From an email discussion with @lvandeve:
There is no way to tell for sure. Adding information to the output to indicate zopfli would actually add bits to the output, so such a thing is not done :) Any compressor can set the FLG, MTIME, and so on to anything it wants, and users of zopfli can also change the MTIME bytes that zopfli output to an actual time.
One heuristic to tell whether a file was compressed with zopfli or another dense deflate compressor is to recompress its contents with regular gzip -9 (which is fast) and check whether the file under test is, for example, more than 3% smaller than the gzip -9 output.
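The size heuristic above could be sketched roughly as follows. This is only an illustration: the function names and the 3% default are made up for the example, and Python's `gzip.compress(..., compresslevel=9)` is used as a stand-in for the `gzip -9` command line, which may not produce byte-identical output.

```python
import gzip


def size_saving_vs_gzip9(gz_bytes: bytes) -> float:
    """Return how much smaller gz_bytes is than a level-9 gzip of the same data.

    A positive result means the file under test is smaller than the
    reference; 0.03 corresponds to "3% smaller".
    """
    raw = gzip.decompress(gz_bytes)
    # Reference compression; stand-in for `gzip -9` (assumption).
    reference = gzip.compress(raw, compresslevel=9, mtime=0)
    return (len(reference) - len(gz_bytes)) / len(reference)


def looks_densely_compressed(gz_bytes: bytes, threshold: float = 0.03) -> bool:
    """Heuristic only: True if the file beats the level-9 reference by more than threshold."""
    return size_saving_vs_gzip9(gz_bytes) > threshold
```

A `True` result only suggests a denser deflate compressor than gzip; it does not prove that Zopfli specifically was used. Optional header fields such as a stored file name also shift the sizes by a few bytes, which matters for very small files.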
Other notes:
gzip
A gzip member header has the following structure: ID1, ID2, CM, FLG, MTIME (4 bytes), XFL, OS, where:
ID1 = 1f and ID2 = 8b - these are the magic numbers that identify the content as being gzip.
CM = 8 - the compression method; 8 denotes deflate, which is the value customarily used by gzip.
FLG and MTIME are usually non-zero values.
XFL will be either 0, 2, or 4:
0 - default, the compressor used intermediate levels of compression (when any of the -2 ... -8 options are used).
2 - the compressor used maximum compression, slowest algorithm (when the -9 or --best option is used).
4 - the compressor used the fastest algorithm (when the -1 or --fast option is used).
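To inspect these fields on a real file, a minimal sketch along the following lines might help. It assumes the fixed 10-byte header layout from RFC 1952 (one byte each for ID1, ID2, CM, FLG, XFL, and OS, plus a 4-byte little-endian MTIME); `GzipHeader` and `parse_gzip_header` are hypothetical names chosen for this example.

```python
import struct
from typing import NamedTuple


class GzipHeader(NamedTuple):
    id1: int
    id2: int
    cm: int
    flg: int
    mtime: int
    xfl: int
    os: int


def parse_gzip_header(data: bytes) -> GzipHeader:
    """Parse the fixed 10-byte header at the start of a gzip member."""
    if len(data) < 10:
        raise ValueError("need at least 10 bytes for a gzip header")
    # "<" = little-endian, no padding; B = 1 byte, I = 4 bytes (MTIME).
    header = GzipHeader(*struct.unpack("<BBBBIBB", data[:10]))
    if (header.id1, header.id2) != (0x1F, 0x8B):
        raise ValueError("not a gzip member (bad magic numbers)")
    return header
```

For example, `parse_gzip_header(open(path, "rb").read(10))` would give access to `xfl`, `mtime`, and the other fields of a file at a given `path`.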
Zopfli
One thing that Zopfli does is set FLG and MTIME to zero, XFL to 2, and OS to 3, so files compressed with Zopfli will most likely start with 1f8b 0800 0000 0000 0203, unless the user changes these bytes (which in general doesn't seem very likely to happen).
Now, regular gzip output might also start with that, even though the chance of it doing so is smaller:
Most web servers (e.g. Apache, NGINX) do not use the best compression level by default; therefore, their output shouldn't have XFL set to 2.
Most utilities that output regular gzip will have non-zero values for MTIME and FLG.
So, if a file doesn't start with 1f8b 0800 0000 0000 0203, it's a good (not perfect) indication that Zopfli wasn't used, and it's a fast check compared to compressing files and comparing file sizes. However, if a file does start with that, it can be either Zopfli or gzip, and we cannot really make assumptions here.
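The quick prefix check could look something like this sketch. The names `ZOPFLI_GZIP_PREFIX` and `may_be_zopfli` are invented for the example; the ten bytes are the ones quoted above.

```python
# The ten header bytes discussed above: FLG = 0, MTIME = 0, XFL = 2, OS = 3.
ZOPFLI_GZIP_PREFIX = bytes.fromhex("1f8b0800000000000203")


def may_be_zopfli(data: bytes) -> bool:
    """Return False if the header rules Zopfli out (barring user edits).

    True is inconclusive: regular gzip output can carry the same header.
    """
    return data.startswith(ZOPFLI_GZIP_PREFIX)
```

Only the first 10 bytes of a file need to be read for this, so it can serve as a cheap filter before falling back to the slower size comparison.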