Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix corrupted file_alignment #332

Closed
wants to merge 117 commits into from
Closed

Fix corrupted file_alignment #332

wants to merge 117 commits into from

Conversation

anfedotoff
Copy link
Contributor

Fix for #331

@m4b
Copy link
Owner

m4b commented Sep 17, 2022

@koutheir does this look ok to you?

src/pe/utils.rs Outdated Show resolved Hide resolved
src/pe/debug.rs Outdated Show resolved Hide resolved
src/pe/export.rs Outdated Show resolved Hide resolved
src/pe/utils.rs Outdated Show resolved Hide resolved
src/pe/utils.rs Outdated Show resolved Hide resolved
@koutheir
Copy link

Thank you for your efforts, @anfedotoff

@m4b
Copy link
Owner

m4b commented Sep 25, 2022

Some notes:

  1. We won't be adding anyhow/thiserror/error library to goblin, probably ever. So if the assumption of these result returning changes is they'll be more usable in a follow up PR that uses more context via anyhow/whatever, want to be clear this is not a good assumption to make :) I want to keep the dependencies as thin as possible.
  2. i think the signature of find_offset returning Option was fine, and more flexible (allowing users to convert that into a more specific error if need be), for a utils library function (that maybe/probably shouldn't be exposed, but it is useful). I'm 50/50 on the result signature, because that forces our error type onto user (and technically a string allocation) that uses this function directly. I also don't have any data on if anyone is even using this function so also maybe doesn't matter that much. I'm not convinced though that semantically it's an error returning function (just like HashMap::get doesn't return an error, or various find methods in stdlib generally don't return error).

So I'm wondering why couldn't all the places find_offset was used with the new Result signature not have used the function find_offset_or or left it with ok_or semantics? This is probably annoying question since all the work was done to make it use the result form now, but just curious :) I'm just wondering if this is not the best breaking change, specifically:

  1. It is an avoidable break as far as I can see
  2. It doesn't really address some necessary change like a field going from T -> Option
  3. It doesn't add much in the way of future proofing (this is also similar to 1., in that the user can just convert the option to error if they want)

Anyway, I think it's probably fine if this ends up being better maintainable code, but it is a breaking change, so it'll get merged in a little later (along with the other breaking change that's been pending).

On that note, may be a good time to think of any other breaking changes for PE stuff that would be good to get in (and it's fine if there are none!)

src/pe/utils.rs Outdated
}
}

pub fn find_offset_or(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just fyi: this function existed precisely to wrap the Option returning pub function so that callees that want an error could call this instead (like all the changes in export.rs, for example) without causing breaking changes by altering type signature of find_offset.
The changes are breaking now, however. I guess it's fine since we have another breaking change though.

src/pe/import.rs Outdated
&mut utils::find_offset(import_directory_table_rva, sections, file_alignment, opts)
.ok_or_else(|| {
error::Error::Malformed(format!(
"Cannot create ImportData; cannot map import_directory_table_rva {:#x} into offset",
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we've lost more specific error messages with the new find_offset generic error, although I'm not sure it matters much.

let offset =
utils::find_offset(rva, sections, self.file_alignment, opts).ok_or_else(|| {
error::Error::Malformed(format!(
"cannot map exception rva ({:#x}) into offset",
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto all these seem to have lost more specific error messages

@m4b
Copy link
Owner

m4b commented Oct 24, 2022

@anfedotoff i'd be ok with merging this but i think we have to roll back the breaking change to the find_offset function; it seems unnecessary to me. If you want a result in the callsites, I'd suggest using find_offset_or and restore the error messages. Then we can merge and it won't be breaking, otherwise I think this PR will just languish.

Thanks for your patience!

@anfedotoff
Copy link
Contributor Author

anfedotoff commented Oct 24, 2022

@anfedotoff i'd be ok with merging this but i think we have to roll back the breaking change to the find_offset function; it seems unnecessary to me. If you want a result in the callsites, I'd suggest using find_offset_or and restore the error messages. Then we can merge and it won't be breaking, otherwise I think this PR will just languish.

Thanks for your patience!

Sorry, for late reply. It's ok if you don't want breaking changes. If we don't want to use anyhow crate, then, from my point of view, all changes have no sense in this pull request.
I think, better is to roll back to one line fix, before calling function with bad arguments. Or we could find some better place to insert this one if. What do you think?

@m4b
Copy link
Owner

m4b commented Oct 31, 2022

@anfedotoff Ok that works for me!

dureuill and others added 28 commits November 3, 2022 13:06
* Add missing subtraction according to the referenced post
* Added detailed explanation for section_read_size function
* Breaking change: make Export.offset optional
* complete E_MACHINE constants
* complete E_TYPE constants
* defined SHF_EXCLUDE
because all earlier ones are not used anyway.
* Guard every with_capacity call with a bounds check
The crate does a lot of pre-allocation based on untrusted input, which
can lead to unbounded allocation.
* Introduce a new Error Variant for Buffer bounds checks
* avoid arithmetic overflow and panics
* avoid slice panic in load cmd
* avoid underflow in symbol table parsing
* validate export trie load command
* Warn on invalid offsets
* Fixes missing exports.
* Add `new_impl` to `ExportTrie`
* Expose more of the `ProgramHeader` implementation regardless
  of whether the `alloc` feature is enabled.  For example,
  `ProgramHeader::from_raw_parts` is a wrapper around
  `core::slice::from_raw_parts` that has `alloc` dependency.
* Run `cargo fmt`.

Based on an internal patch by Keith Wesolowski (@wesolows).

Signed-off-by: Dan Cross <[email protected]>
The well intentioned check on debug directory size unfortunately results
in an error when debug information is embedded in the image.

Tools like Digital Mars embed CV debug info in the image instead of
creating a PDB, and some older tools embed COFF debug info in the image.
* Fix issue #309 - Advance offset by string table field size

There's a bug in master where the code ends up misreading the symbol table because we don't advance the offset prior to reading the strings.

This change fixes the issue by adding the correct value to the offset and also includes a unit test that covers this case.
This happens when the number of sections in an ELF object reaches 2^16,
at which point the u16 shstrndx takes the special value SHN_XINDEX.

Fix based on algorithm from:
github.com/gimli-rs/object/blob/c476071/src/read/elf/file.rs#L582-L600
fixes #318)

* Adjust coff symbol offset to account for the strtable length field
There is little point in using non-loadable segments for the VM address
translation. Ignore non-loadable segments.
Multi-arch containers can be made up of archives or Mach-O
binaries. This adds support for archives. It is a breaking change
because previously the `MachO` struct was returned and now we're
returning a new enum: `SingleArch`.

This required some refactoring of the `lib.rs` file to share the
required functions and data structures for parsing the hint at the top
of files. `take_hint_bytes`, `Hint` and `HintData` dont't require any
special features but I think were inside the `if_everything!` because
that's the only case they were used in. I did expand the API by making
`take_byte_hints` public, this was mainly because I thought it was
better solution than trying to maintain the various combinations of
features required to stop the compiler warning about the function
being unused. It's also a function that may be useful for goblin users.

Also add tests for parsing fat binaries.

This tests parsing fat binaries made up of Mach-O binaries and
archives. I've checked in the binaries to make testing easier as
they're quite small (both are built from the hello_world.c file in the
same directory). Above the tests themselves are instructions for how
to compile the binaries they use.
@anfedotoff
Copy link
Contributor Author

@anfedotoff Ok that works for me!

I've done something strange, playing with git history and doing rebase master. I'll better reopen new PR from fresh master. Fix will be very short, sorry)

@anfedotoff anfedotoff closed this Nov 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.