Encrypted ZIP File with .docx Documents #81

Closed
trustinveritas opened this issue Oct 14, 2022 · 4 comments

@trustinveritas

I found an old ZIP file on my computer with two .docx files in it.

Unfortunately, I set a password on it years ago...

Output of 7z l -slt:

`Path = Example_of_C_Code.docx
Folder = -
Size = 1101098
Packed Size = 1055092
Attributes = A
Encrypted = +
Comment =
CRC = 88DA98B2
Method = ZipCrypto Deflate
Host OS = FAT
Version = 20
Volume Index = 0

Path = How_I_made_it.docx
Folder = -
Size = 36693
Packed Size = 30468
Attributes = A
Encrypted = +
Comment =
CRC = 3FC01F92
Method = ZipCrypto Deflate
Host OS = FAT
Version = 20
Volume Index = 0`

These are the steps I took, as I understood them:

  1. I created a plain.txt file with the following content: 50 4B 03 04 14 00
  2. I created a ZIP file plain.zip from plain.txt (Deflate, Maximum)
  3. I ran bkcrack:

bkcrack -C path_to_encrypted.zip -c Example_of_C_Code.docx -P plain.zip -p plain.docx

bkcrack tells me it could not find the keys.

What am I doing wrong?

Thank you for the help, much appreciated in advance!

@kimci86
Owner

kimci86 commented Oct 14, 2022

Hi, the problem you face is guessing some compressed plaintext. Your files were compressed and then encrypted when they were added to the zip file. The known plaintext attack uses a portion of the data as it was right before encryption, that is to say compressed data in your case.

Now, you may wonder how to guess such compressed data.
Compressing the first few bytes of a file, as you tried, generally does not give the first few bytes of the compressed file. For files of a size like yours, deflate compressed data most probably starts with a statistical model (a Huffman tree, itself encoded in a compact format) of a large chunk of data. You cannot guess such compressed data without knowing a big part of the file.
On the other hand, if you have a big part of a file or, even better, a complete file, it is more likely that compressing it will give suitable compressed data. Even when the whole file is known, getting the correct compressed plaintext can be a challenge because deflate compression can be tweaked with several parameters.
Another approach is to study the docx file format to see whether there is some pattern that would make compression predictable, but I doubt it.
For more information about deflate compression, you can check the article "An Explanation of the Deflate Algorithm" on zlib's website, and RFC 1951, "DEFLATE Compressed Data Format Specification", for all the details.

If you cannot guess plaintext, then you might consider using a password cracker such as John the Ripper or hashcat.
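
To illustrate the point above about compressing only the first few bytes, here is a minimal sketch using Python's zlib module in raw deflate mode. The helper name `raw_deflate` and the sample data are made up for illustration, and using zlib at level 9 is only an assumption about how such an archive might have been produced.

```python
import zlib


def raw_deflate(data: bytes, level: int = 9) -> bytes:
    """Compress with raw deflate (no zlib header), the form stored in ZIP entries."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


# Hypothetical stand-in for a real file: a known 6-byte prefix followed by
# text-like content.
prefix = bytes.fromhex("504B03041400")
whole_file = prefix + b"".join(
    b"line %d: some text with value %d\n" % (i, i * i % 9973) for i in range(5000)
)

from_prefix = raw_deflate(prefix)
from_whole = raw_deflate(whole_file)

# Compare the stream built from the prefix alone with the start of the
# stream built from the whole file.
print(from_prefix.hex())
print(from_whole[: len(from_prefix)].hex())
print(from_whole.startswith(from_prefix))
```

The stream built from the prefix alone is not the beginning of the stream built from the whole file, so it cannot serve as known plaintext for a deflated entry.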

@trustinveritas
Author

First of all, @kimci86, what kind of response time is that?!
Wow, thank you very much for the detailed answer and for looking into it.

I figured that without a complete file this would be difficult or impossible.
Today I tried bkcrack with a demo.zip (compression method: Store), and it recovered the keys for the docx without problems.

Would it be theoretically possible for bkcrack to take an original copy of one of the two docx files and compress it itself, using Deflate Maximum?
Can I tell from the CRC value whether I zipped the file exactly the same way?

@kimci86
Owner

kimci86 commented Oct 15, 2022

> Can I tell from the CRC value whether I zipped the file exactly the same way?

No, the CRC value is computed on uncompressed data, so it does not give any information about compression.
However, a matching CRC is a useful check that your uncompressed file is very likely the right one.
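
As a rough sketch of that check, the CRC-32 of a candidate file can be compared with the value from the archive listing using Python's `zlib.crc32`. The function name `crc_matches` and the path `candidate.docx` are hypothetical; the CRC value in the usage comment is the one listed for How_I_made_it.docx.

```python
import zlib


def crc_matches(data: bytes, expected_hex: str) -> bool:
    """Return True if the CRC-32 of the uncompressed data equals the listed value."""
    return (zlib.crc32(data) & 0xFFFFFFFF) == int(expected_hex, 16)


# Usage sketch; "candidate.docx" is a hypothetical path to the file you
# believe is the original of How_I_made_it.docx.
# with open("candidate.docx", "rb") as f:
#     print(crc_matches(f.read(), "3FC01F92"))
```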

If the CRC matches, then you have to guess how the file was compressed: which compression program, with which parameters. Looking at the compressed size is a simple way to discard wrong parameters. Note that encryption adds a 12-byte encryption header to the ciphertext, so the compressed size you would try to get is the encrypted entry's packed size minus 12.
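
Applied to the two entries in the listing, the arithmetic looks like this. It is only a small sketch; the function name `target_compressed_size` is made up for illustration.

```python
ENCRYPTION_HEADER_SIZE = 12  # bytes of encryption header added to each encrypted entry


def target_compressed_size(packed_size: int) -> int:
    """Size the deflate stream should have once the encryption header is excluded."""
    return packed_size - ENCRYPTION_HEADER_SIZE


# Packed sizes taken from the 7z listing in the first comment.
for name, packed in [("Example_of_C_Code.docx", 1055092), ("How_I_made_it.docx", 30468)]:
    print(name, target_compressed_size(packed))
# Example_of_C_Code.docx 1055080
# How_I_made_it.docx 30456
```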

> Would it be theoretically possible for bkcrack to take an original copy of one of the two docx files and compress it itself, using Deflate Maximum?

Yes, this process of trying compression tools and parameters could be automated to some extent. It is not implemented at the moment, but it would be nice to have. It would not be exhaustive, but trying zlib's deflate implementation would be a good start.
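
Outside of bkcrack, such a search could be sketched in Python along these lines. This is not how bkcrack works and it only covers zlib; the function name `find_matching_parameters` and the choice of parameters to vary (level, memLevel, strategy) are assumptions for illustration. A matching size does not prove the bytes are identical, it only narrows down the candidates.

```python
import itertools
import zlib


def find_matching_parameters(data: bytes, target_size: int):
    """Yield (level, mem_level, strategy, stream) for every zlib setting
    whose raw deflate output is exactly target_size bytes long."""
    strategies = (zlib.Z_DEFAULT_STRATEGY, zlib.Z_FILTERED)
    for level, mem_level, strategy in itertools.product(
        range(1, 10), range(1, 10), strategies
    ):
        # wbits=-15 selects raw deflate without a zlib header or trailer.
        compressor = zlib.compressobj(level, zlib.DEFLATED, -15, mem_level, strategy)
        stream = compressor.compress(data) + compressor.flush()
        if len(stream) == target_size:
            yield level, mem_level, strategy, stream
```

Each stream that passes the size filter could then be written to a file and tried as compressed plaintext. If nothing matches, the archive was probably made by a tool whose deflate implementation differs from zlib's.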

@kimci86
Owner

kimci86 commented Dec 4, 2022

I am closing this as I believe your questions have been answered. Feel free to reopen if I am mistaken or open a new issue if you have other questions or feedback.
