fileinfo produces import-table hash of the empty string from a PE binary with imports #460

Closed
s3rvac opened this issue Jan 9, 2019 · 1 comment


s3rvac commented Jan 9, 2019

fileinfo produces import-table hash of the empty string from a PE binary with imports.

Input

Run

$ retdec-fileinfo -v FILE

where FILE is:

Output

[..]
Import table
------------
Number of imports: 127
MD5              : d41d8cd98f00b204e9800998ecf8427e
SHA256           : e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
[..]

Expected output

If there are 127 imports, why are the MD5 and SHA256 values hashes of the empty string?

$ echo -n '' | md5sum
d41d8cd98f00b204e9800998ecf8427e  -
$ echo -n '' | sha256sum
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  -

I would expect the hashes to be different from the ones of the empty string. Or, alternatively, I would expect the import table to be empty.

Configuration

  • Commit: c31c316 (current master)
  • 64-bit Arch Linux, GCC 8.2.1, Debug build of RetDec
@PeterMatula

127 import entries were parsed, but it looks like they are all junk. Hashes are computed from the imports' function names and library names. If these are not valid, the import is skipped, i.e. it is not used in the hash calculation. In this case, all imports were skipped, so the string used as input for the hash computation was empty.
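The behaviour described above can be reproduced with a minimal sketch. It assumes an imphash-style scheme (lowercased `library.function` pairs joined by commas, then hashed). The validity check `is_valid_name`, the helper `import_hash_source`, and the junk sample names are all hypothetical and are not RetDec's actual code.

```python
import hashlib


def is_valid_name(name):
    # Hypothetical validity check: non-empty and printable ASCII only.
    return bool(name) and all(0x20 < ord(c) < 0x7F for c in name)


def import_hash_source(imports):
    """Build the string that gets hashed; invalid imports are skipped."""
    parts = []
    for lib, func in imports:
        if not (is_valid_name(lib) and is_valid_name(func)):
            continue  # a skipped import contributes nothing to the hash input
        parts.append(f"{lib.lower()}.{func.lower()}")
    return ",".join(parts)


# Two junk entries standing in for the 127 corrupted imports (made-up data).
junk = [("\x00\x01", "\xff\xfe"), ("", "\x7f")]

source = import_hash_source(junk)
md5 = hashlib.md5(source.encode()).hexdigest()
sha256 = hashlib.sha256(source.encode()).hexdigest()

print(repr(source))  # ''
print(md5)           # d41d8cd98f00b204e9800998ecf8427e
print(sha256)        # e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
```

Because every entry fails the check, the hash input is the empty string, which yields exactly the two digests from the report.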

We don't want to be too clever and try to remove what looks like corrupted imports.

Our hash computation algorithm should be the same as the one in YARA, so we looked into what YARA comes up with. It computes different hashes because it works with different junk data. The imports are also corrupted in objdump and LIEF. It would be very hard to implement the same import parsing as YARA just to get the same data to work with.

Therefore, we decided to detect this case (an empty string used as the hash input) and not produce hashes at all. There is also a new method in fileformat called ImportTable::invalidImpHash(). Fileinfo will not produce hash entries if they are empty, so they will not appear in the output JSON and consumers cannot work with them.
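A rough sketch of this fix, written in Python for brevity rather than RetDec's C++. The function names and JSON keys (`numberOfImports`, `md5`, `sha256`) are assumptions made for illustration and do not reflect fileinfo's real output schema.

```python
import hashlib
import json


def compute_import_hashes(source):
    """Return the hashes, or None when the hash input is empty."""
    if not source:
        return None  # nothing valid to hash; do not emit empty-string hashes
    data = source.encode()
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def import_table_json(number_of_imports, source):
    """Build the import-table record; hash keys are omitted when unavailable."""
    out = {"numberOfImports": number_of_imports}
    hashes = compute_import_hashes(source)
    if hashes is not None:
        out.update(hashes)
    return out


# All 127 imports were skipped, so the hash input is empty.
print(json.dumps(import_table_json(127, "")))  # {"numberOfImports": 127}
```

Omitting the keys, rather than emitting empty or placeholder values, means a consumer cannot mistake the empty-string digest for a real import hash.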

Possible future problems: Since it is not guaranteed that we parse the same imports as YARA, the source data for the hash computation may differ. If there is at least one valid import entry, the hash gets computed (the source string is not empty), but it may differ from the hash YARA would compute. If this happens and becomes a problem, we should not try to duplicate YARA's import-parsing mechanism. Instead, we should try to detect that the imports are corrupted and mark the computed hashes as invalid.
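One way such marking could look is sketched below. This is only an illustration of the idea, not something RetDec implements: the corruption heuristic (any skipped import makes the result untrusted), the `looks_valid` check, and the `trusted` flag are all hypothetical.

```python
import hashlib


def looks_valid(name):
    # Hypothetical validity check: non-empty and printable ASCII only.
    return bool(name) and all(0x20 < ord(c) < 0x7F for c in name)


def hash_imports(imports):
    """Hash the valid imports and flag the result if any import was skipped."""
    valid = [(lib, func) for lib, func in imports
             if looks_valid(lib) and looks_valid(func)]
    if not valid:
        return None  # empty hash input: produce no hashes at all
    source = ",".join(f"{lib.lower()}.{func.lower()}" for lib, func in valid)
    data = source.encode()
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        # Assumed heuristic: a single skipped entry marks the hashes as invalid.
        "trusted": len(valid) == len(imports),
    }


clean = hash_imports([("KERNEL32", "ExitProcess")])
mixed = hash_imports([("KERNEL32", "ExitProcess"), ("\x01\x02", "junk")])
print(clean["trusted"], mixed["trusted"])  # True False
```

In this sketch the mixed table produces the same digests as the clean one, which is exactly why a separate validity flag would be needed: the digest alone does not reveal that part of the table was discarded.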
