Commit a83cac6

zip: Fix incorrect time/date, add extended timestamp and refactor
MSDOS time/date was read in the wrong order and also did not take into account that the bit ranges in the shorts are little-endian. Remodel modification_time/date to be one struct with fat_time, fat_date LE shorts and then synthetic values for day, hours, minute etc and also a unix field with the timestamp as unix time. Also refactor and clean up extra fields/extended code a bit. Fixes #792
1 parent 1a3823f commit a83cac6
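
The decoding described in the commit message can be checked against the four bytes `c8 78 84 45` that appear in the `bigzero-zip.zip` test output of this commit. The Python sketch below is only an illustration, not the code in this commit: the function name `decode_dos_datetime` and the returned dictionary are hypothetical, and the Unix value assumes, like `unix_guess`, that the stored local time was UTC.

```python
import calendar
import struct


def decode_dos_datetime(raw: bytes) -> dict:
    """Decode 4 bytes holding an MS-DOS time followed by an MS-DOS date."""
    # Both values are 16-bit little-endian integers, time first, then date.
    fat_time, fat_date = struct.unpack("<HH", raw)

    # Bit ranges are counted on the integer value, not on the raw byte stream.
    second = (fat_time & 0x1F) * 2          # bits 0-4, stored in 2-second units
    minute = (fat_time >> 5) & 0x3F         # bits 5-10
    hour = (fat_time >> 11) & 0x1F          # bits 11-15
    day = fat_date & 0x1F                   # bits 0-4
    month = (fat_date >> 5) & 0x0F          # bits 5-8
    year = 1980 + ((fat_date >> 9) & 0x7F)  # bits 9-15, offset from 1980

    # Assumption: treat the wall-clock fields as UTC, since no zone is stored.
    unix_guess = calendar.timegm((year, month, day, hour, minute, second))

    return {
        "fat_time": fat_time, "fat_date": fat_date,
        "year": year, "month": month, "day": day,
        "hour": hour, "minute": minute, "second": second,
        "unix_guess": unix_guess,
    }


fields = decode_dos_datetime(bytes.fromhex("c8788445"))
print(fields)
```

For these bytes the sketch yields 2014-12-04 15:06:16 and a Unix value of 1417705576, matching the added lines in the test output. Reading the same bytes MSB-first as consecutive 5/6/5-bit fields instead gives 25, 3 and 24, which is the out-of-range `hours: 25` seen in the removed lines.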

11 files changed

+1350
-825
lines changed

doc/formats.md

+6
@@ -1395,9 +1395,15 @@ Decode value as zip
 
 Supports ZIP64.
 
+## Timestamp and time zones
+
+The timestamp accessed via `.local_files[].last_modification` is encoded in ZIP files using [MS-DOS representation](https://learn.microsoft.com/en-us/windows/win32/api/oleauto/nf-oleauto-dosdatetimetovarianttime) which lacks a known time zone. Probably the local time/date was used at creation. The `unix_guess` field in `last_modification` is a guess assuming the local time zone was UTC at creation.
+
 ### References
 - https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT
 - https://opensource.apple.com/source/zip/zip-6/unzip/unzip/proginfo/extra.fld
+- https://formats.kaitai.io/dos_datetime/
+- https://learn.microsoft.com/en-us/windows/win32/api/oleauto/nf-oleauto-dosdatetimetovarianttime
 
 
 [#]: sh-end

format/zip/testdata/bigzero-zip.zip.fqtest

+42-27
@@ -20,14 +20,16 @@ $ fq -o uncompress=false dv bigzero-zip.zip
 0x0000| 00 | . | language_encoding: false 0x7.4-0x7.5 (0.1)
 0x0000| 00 | . | unused1: 0 0x7.5-0x8 (0.3)
 0x0000| 08 00 | .. | compression_method: "deflated" (8) 0x8-0xa (2)
-| | | last_modification_date{}: 0xa-0xc (2)
-0x0000| c8 | . | hours: 25 0xa-0xa.5 (0.5)
-0x0000| c8 78 | .x | minutes: 3 0xa.5-0xb.3 (0.6)
-0x0000| 78 | x | seconds: 24 0xb.3-0xc (0.5)
-| | | last_modification_time{}: 0xc-0xe (2)
-0x0000| 84 | . | year: 66 0xc-0xc.7 (0.7)
-0x0000| 84 45 | .E | month: 2 0xc.7-0xd.3 (0.4)
-0x0000| 45 | E | day: 5 0xd.3-0xe (0.5)
+| | | last_modification{}: 0xa-0xe (4)
+0x0000| c8 78 | .x | fat_time: 0x78c8 0xa-0xc (2)
+| | | second: 16 (8)
+| | | minute: 6
+| | | hour: 15
+0x0000| 84 45 | .E | fat_date: 0x4584 0xc-0xe (2)
+| | | day: 4
+| | | month: 12
+| | | year: 2014 (34)
+| | | unix_guess: 1417705576 (2014-12-04T15:06:16)
 0x0000| 54 81| T.| crc32_uncompressed: 0xae158154 0xe-0x12 (4)
 0x0010|15 ae |.. |
 0x0010| 4e 28 00 00 | N(.. | compressed_size: 10318 0x12-0x16 (4)
@@ -38,13 +40,19 @@ $ fq -o uncompress=false dv bigzero-zip.zip
 0x0020|67 7a 65 72 6f 2e 7a 69 70 |gzero.zip |
 | | | extra_fields[0:2]: 0x29-0x45 (28)
 | | | [0]{}: extra_field 0x29-0x36 (13)
-0x0020| 55 54 | UT | header_id: 0x5455 (extended timestamp) 0x29-0x2b (2)
-0x0020| 09 00 | .. | data_size: 9 0x2b-0x2d (2)
-0x0020| 03 57 6a| .Wj| data: raw bits 0x2d-0x36 (9)
-0x0030|80 54 7e 6a 80 54 |.T~j.T |
+0x0020| 55 54 | UT | tag: 0x5455 (extended timestamp) 0x29-0x2b (2)
+0x0020| 09 00 | .. | size: 9 0x2b-0x2d (2)
+| | | flags{}: 0x2d-0x2e (1)
+0x0020| 03 | . | unused: 0 0x2d-0x2d.5 (0.5)
+0x0020| 03 | . | creation_time_present: false 0x2d.5-0x2d.6 (0.1)
+0x0020| 03 | . | access_time_present: true 0x2d.6-0x2d.7 (0.1)
+0x0020| 03 | . | modification_time_present: true 0x2d.7-0x2e (0.1)
+0x0020| 57 6a| Wj| modification_time: 1417701975 (2014-12-04T14:06:15Z) 0x2e-0x32 (4)
+0x0030|80 54 |.T |
+0x0030| 7e 6a 80 54 | ~j.T | access_time: 1417702014 (2014-12-04T14:06:54Z) 0x32-0x36 (4)
 | | | [1]{}: extra_field 0x36-0x45 (15)
-0x0030| 75 78 | ux | header_id: 0x7875 (UNIX UID/GID) 0x36-0x38 (2)
-0x0030| 0b 00 | .. | data_size: 11 0x38-0x3a (2)
+0x0030| 75 78 | ux | tag: 0x7875 (UNIX UID/GID) 0x36-0x38 (2)
+0x0030| 0b 00 | .. | size: 11 0x38-0x3a (2)
 0x0030| 01 04 74 00 00 00| ..t...| data: raw bits 0x3a-0x45 (11)
 0x0040|04 14 00 00 00 |..... |
 0x0040| ed dd bf aa 03 df bf df e7 ef 9c| ...........| compressed: raw bits 0x45-0x2893 (10318)
@@ -70,15 +78,17 @@ $ fq -o uncompress=false dv bigzero-zip.zip
 0x2890| 00 | . | language_encoding: false 0x289c.4-0x289c.5 (0.1)
 0x2890| 00 | . | unused1: 0 0x289c.5-0x289d (0.3)
 0x2890| 08 00 | .. | compression_method: "deflated" (8) 0x289d-0x289f (2)
-| | | last_modification_date{}: 0x289f-0x28a1 (2)
-0x2890| c8| .| hours: 25 0x289f-0x289f.5 (0.5)
-0x2890| c8| .| minutes: 3 0x289f.5-0x28a0.3 (0.6)
+| | | last_modification{}: 0x289f-0x28a3 (4)
+0x2890| c8| .| fat_time: 0x78c8 0x289f-0x28a1 (2)
 0x28a0|78 |x |
-0x28a0|78 |x | seconds: 24 0x28a0.3-0x28a1 (0.5)
-| | | last_modification_time{}: 0x28a1-0x28a3 (2)
-0x28a0| 84 | . | year: 66 0x28a1-0x28a1.7 (0.7)
-0x28a0| 84 45 | .E | month: 2 0x28a1.7-0x28a2.3 (0.4)
-0x28a0| 45 | E | day: 5 0x28a2.3-0x28a3 (0.5)
+| | | second: 16 (8)
+| | | minute: 6
+| | | hour: 15
+0x28a0| 84 45 | .E | fat_date: 0x4584 0x28a1-0x28a3 (2)
+| | | day: 4
+| | | month: 12
+| | | year: 2014 (34)
+| | | unix_guess: 1417705576 (2014-12-04T15:06:16)
 0x28a0| 54 81 15 ae | T... | crc32_uncompressed: 0xae158154 0x28a3-0x28a7 (4)
 0x28a0| 4e 28 00 00 | N(.. | compressed_size: 10318 0x28a7-0x28ab (4)
 0x28a0| b9 9a 3f 00 | ..?. | uncompressed_size: 4168377 0x28ab-0x28af (4)
@@ -94,12 +104,17 @@ $ fq -o uncompress=false dv bigzero-zip.zip
 0x28c0| 62 69 67 7a 65 72 6f 2e 7a 69 70 | bigzero.zip | file_name: "bigzero.zip" 0x28c1-0x28cc (11)
 | | | extra_fields[0:2]: 0x28cc-0x28e4 (24)
 | | | [0]{}: extra_field 0x28cc-0x28d5 (9)
-0x28c0| 55 54 | UT | header_id: 0x5455 (extended timestamp) 0x28cc-0x28ce (2)
-0x28c0| 05 00| ..| data_size: 5 0x28ce-0x28d0 (2)
-0x28d0|03 57 6a 80 54 |.Wj.T | data: raw bits 0x28d0-0x28d5 (5)
+0x28c0| 55 54 | UT | tag: 0x5455 (extended timestamp) 0x28cc-0x28ce (2)
+0x28c0| 05 00| ..| size: 5 0x28ce-0x28d0 (2)
+| | | flags{}: 0x28d0-0x28d1 (1)
+0x28d0|03 |. | unused: 0 0x28d0-0x28d0.5 (0.5)
+0x28d0|03 |. | creation_time_present: false 0x28d0.5-0x28d0.6 (0.1)
+0x28d0|03 |. | access_time_present: true 0x28d0.6-0x28d0.7 (0.1)
+0x28d0|03 |. | modification_time_present: true 0x28d0.7-0x28d1 (0.1)
+0x28d0| 57 6a 80 54 | Wj.T | modification_time: 1417701975 (2014-12-04T14:06:15Z) 0x28d1-0x28d5 (4)
 | | | [1]{}: extra_field 0x28d5-0x28e4 (15)
-0x28d0| 75 78 | ux | header_id: 0x7875 (UNIX UID/GID) 0x28d5-0x28d7 (2)
-0x28d0| 0b 00 | .. | data_size: 11 0x28d7-0x28d9 (2)
+0x28d0| 75 78 | ux | tag: 0x7875 (UNIX UID/GID) 0x28d5-0x28d7 (2)
+0x28d0| 0b 00 | .. | size: 11 0x28d7-0x28d9 (2)
 0x28d0| 01 04 74 00 00 00 04| ..t....| data: raw bits 0x28d9-0x28e4 (11)
 0x28e0|14 00 00 00 |.... |
 | | | file_comment: "" 0x28e4-0x28e4 (0)
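
The extended timestamp (`0x5455`) payloads in the output above can be reproduced with a short Python sketch. It is an illustration only, not the decoder in this commit: the function name `parse_extended_timestamp` is hypothetical, the 4-byte values are read as unsigned little-endian integers (an assumption), and a timestamp is skipped when the payload is too short to hold it, which is how the 5-byte payload at `0x28d0` can set the access time flag yet carry only the modification time.

```python
import struct


def parse_extended_timestamp(data: bytes) -> dict:
    """Parse the payload of a 0x5455 extra field into Unix timestamps (UTC)."""
    flags = data[0]
    # Flag bit 0: modification time, bit 1: access time, bit 2: creation time.
    names = ["modification_time", "access_time", "creation_time"]

    result = {}
    offset = 1
    for bit, name in enumerate(names):
        present = bool(flags & (1 << bit))
        # Stop reading once the payload has no room for another 4-byte value.
        if present and offset + 4 <= len(data):
            (result[name],) = struct.unpack_from("<I", data, offset)
            offset += 4
    return result


# Payload bytes copied from the local file header and the central directory.
local = parse_extended_timestamp(bytes.fromhex("03576a80547e6a8054"))
central = parse_extended_timestamp(bytes.fromhex("03576a8054"))
print(local, central)
```

The modification time from this field is 1417701975 (14:06:15Z), while `unix_guess` from the MS-DOS fields is 1417705576 (15:06:16). The difference of 3601 seconds is consistent with an archive written in a UTC+1 local time zone plus the 2-second resolution of the MS-DOS time, though the file itself does not record the zone.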

format/zip/testdata/help_zip.fqtest

+9
@@ -20,7 +20,16 @@ Decode examples
 
 Supports ZIP64.
 
+Timestamp and time zones
+========================
+The timestamp accessed via .local_files[].last_modification is encoded in ZIP files using MS-DOS representation
+(https://learn.microsoft.com/en-us/windows/win32/api/oleauto/nf-oleauto-dosdatetimetovarianttime) which lacks a known time zone.
+Probably the local time/date was used at creation. The unix_guess field in last_modification is a guess assuming the local time zone
+was UTC at creation.
+
 References
 ==========
 - https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT
 - https://opensource.apple.com/source/zip/zip-6/unzip/unzip/proginfo/extra.fld
+- https://formats.kaitai.io/dos_datetime/
+- https://learn.microsoft.com/en-us/windows/win32/api/oleauto/nf-oleauto-dosdatetimetovarianttime
