Commit ec13b51

feat(handler): fix issues with MSI handler
Seems to work on both vanilla and padded MSI files.

This could be migrated to a fully Python-based implementation in the future using:

* https://github.com/nightlark/pymsi
* https://github.com/decalage2/olefile

As of v0.47, olefile does not handle padded MSIs properly, so we re-implement CFBF header parsing and compute the archive size ourselves.
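For readers unfamiliar with the CFBF header mentioned above, here is a minimal sketch of parsing the fields needed to size an archive. It is not the handler's actual code: `CfbfHeader`, `parse_cfbf_header`, and the field names are hypothetical, and the byte offsets follow the published compound file header layout. It reads only the 109 DIFAT entries stored in the header, so files whose FAT needs extra DIFAT sectors are out of scope here.

```python
import struct
from typing import NamedTuple, Tuple

CFBF_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
FREESECT = 0xFFFFFFFF  # marks an unused entry
HEADER_SIZE = 512


class CfbfHeader(NamedTuple):  # hypothetical container, not unblob's type
    major_version: int
    sector_size: int
    num_fat_sectors: int
    difat: Tuple[int, ...]  # sector ids of the FAT sectors, FREESECT if unused


def parse_cfbf_header(data: bytes) -> CfbfHeader:
    """Parse the fixed 512-byte header at the start of a compound file."""
    if len(data) < HEADER_SIZE or data[:8] != CFBF_MAGIC:
        raise ValueError("not a CFBF header")

    # Offset 24: minor version, major version, byte-order mark, sector shift.
    _minor, major, byte_order, sector_shift = struct.unpack_from("<4H", data, 24)
    if byte_order != 0xFFFE:
        raise ValueError("unexpected byte-order mark")

    # Offset 44: number of FAT sectors. Offset 76: first 109 DIFAT entries.
    (num_fat_sectors,) = struct.unpack_from("<I", data, 44)
    difat = struct.unpack_from("<109I", data, 76)

    return CfbfHeader(major, 1 << sector_shift, num_fat_sectors, difat)
```

The sector size is derived from the sector shift (9 gives 512 bytes, 12 gives 4096 bytes), which is what later turns a sector id into a byte offset.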
1 parent 551bc49 commit ec13b51

1 file changed, 7 additions and 13 deletions
python/unblob/handlers/archive/msi.py

Lines changed: 7 additions & 13 deletions
@@ -1,12 +1,7 @@
 """MSI Handler
 
-Extracts uses 7z but could migrate to a fully Python-based implementation:
-
-https://github.com/nightlark/pymsi
-https://github.com/decalage2/olefile
-
-As of v0.47, olefile does not handle padded MSIs properly so we re-implement
-CFBF header parsing and compute the archive size ourselves.
+Extracts MSIs using 7z with custom CFBF header parsing to compute the full
+archive size.
 """
 
 import io
@@ -106,8 +101,7 @@ def calculate_chunk(self, file: File, start_offset: int) -> Optional[ValidChunk]
 
         max_used_sector = 0
 
-        full_fat = []
-        for i, sect in enumerate(header.sectFat):
+        for sector_id, sect in enumerate(header.sectFat):
             # skip empty
             if sect == 0xFFFFFFFF:
                 continue
@@ -116,13 +110,13 @@ def calculate_chunk(self, file: File, start_offset: int) -> Optional[ValidChunk]
             raw_sector = file.read(sector_size)
             entries = struct.unpack(f'<{entries_per_sector}I', raw_sector)
 
-            base_sector_id = i * entries_per_sector
-            for i in range(len(entries) - 1, -1, -1):
-                if entries[i] == 0xFFFFFFFF:
+            base_sector_id = sector_id * entries_per_sector
+            for entry_id in range(len(entries) - 1, -1, -1):
+                if entries[entry_id] == 0xFFFFFFFF:
                     continue
 
                 # Found the highest id on this page
-                max_id = base_sector_id + i
+                max_id = base_sector_id + entry_id
 
                 if max_id > max_used_sector:
                     max_used_sector = max_id
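
The loop in the hunks above can be read as a standalone routine. The following is a rough sketch under stated assumptions rather than the handler's implementation: `highest_used_sector` and `archive_end_offset` are hypothetical names, the container is assumed to start at offset 0 of the file object, and the seek and the final size formula are not shown in this diff. They follow from the CFBF layout, where sector n begins at byte (n + 1) * sector_size because the header occupies the first sector-sized block.

```python
import io
import struct

FREESECT = 0xFFFFFFFF  # FAT entry value for an unallocated sector


def highest_used_sector(file, fat_sector_ids, sector_size):
    """Return the highest sector id that any FAT sector marks as allocated."""
    entries_per_sector = sector_size // 4
    max_used_sector = 0

    for fat_index, sect in enumerate(fat_sector_ids):
        if sect == FREESECT:  # unused slot in the header's FAT sector list
            continue

        # Assumption: sector n lives at (n + 1) * sector_size.
        file.seek((sect + 1) * sector_size)
        raw_sector = file.read(sector_size)
        if len(raw_sector) != sector_size:
            raise ValueError("truncated FAT sector")
        entries = struct.unpack(f"<{entries_per_sector}I", raw_sector)

        # Each FAT sector describes a contiguous run of sector ids.
        base_sector_id = fat_index * entries_per_sector

        # Walk backwards so the first allocated entry is the highest one.
        for entry_id in range(len(entries) - 1, -1, -1):
            if entries[entry_id] == FREESECT:
                continue
            max_used_sector = max(max_used_sector, base_sector_id + entry_id)
            break

    return max_used_sector


def archive_end_offset(max_used_sector, sector_size):
    """Byte offset just past the last allocated sector (header plus sectors)."""
    return (max_used_sector + 2) * sector_size
```

Because only allocated FAT entries count, trailing padding after the last sector does not affect the computed end offset, which is the property the commit message relies on for padded MSI files.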
