Initial support for BCSymbolMap #336
symbolic-debuginfo/src/macho.rs
Outdated
/// let object_data =
/// std::fs::read("tests/fixtures/2d10c42f-591d-3265-b147-78ba0868073f.dwarf-hidden")
/// .unwrap();
This example is great! It could be confusing to see the "test/fixtures..." path in a doc comment, though, so what if we hide that line (and same for the symbolmap below)?
- /// let object_data =
- /// std::fs::read("tests/fixtures/2d10c42f-591d-3265-b147-78ba0868073f.dwarf-hidden")
- /// .unwrap();
+ /// # let object_data =
+ /// # std::fs::read("tests/fixtures/2d10c42f-591d-3265-b147-78ba0868073f.dwarf-hidden")
+ /// # .unwrap();
Hmm, it would make the `object_data` variable come out of nowhere with no idea on what its type would be. I prefer this as it's fully self-contained; I don't think it's a big leap to infer that's just a path name. Plus anyone who knows doctests, which will be most people reading this, will know exactly what's going on.
I could be convinced to do something ugly like:
/// let object_data = std::fs::read("some/path/object.dwarf");
/// # let object_data = <the full thing that actually works>
I guess, if you really prefer. But I don't see it as much better to be honest.
Turns out, in the standard library they annotate such examples with `no_run` and simply ensure it compiles, since there will be tests covering this anyway. Given that this is quite an odd path to read, we could do that, too?
See the `std::fs::File` source.
I would like to keep the example as a test-case too. I've gone with some compromise where I fake-read the data from a path like it appears in an .xcarchive in a comment while hiding where we actually load it from.
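A minimal sketch of that kind of compromise, not the code from this PR: the reader sees an illustrative path in a comment, while a hidden `# ` line supplies the data the doctest actually runs on. The helper `looks_like_macho64`, the crate name `my_crate`, and the `App.xcarchive` path are all hypothetical. The inner doc fence uses `~~~`, which rustdoc accepts as well as backticks.

```rust
/// Returns `true` if `data` starts with the 64-bit little-endian Mach-O magic.
///
/// # Example
///
/// ~~~
/// # use my_crate::looks_like_macho64; // hypothetical crate name
/// // Shown to readers as a comment only; the path is purely illustrative.
/// // let object_data = std::fs::read("App.xcarchive/dSYMs/App.dwarf").unwrap();
/// # // Lines starting with `# ` are hidden in rendered docs but still compiled and run.
/// # let object_data: Vec<u8> = vec![0xCF, 0xFA, 0xED, 0xFE, 0x07];
/// assert!(looks_like_macho64(&object_data));
/// ~~~
pub fn looks_like_macho64(data: &[u8]) -> bool {
    // MH_MAGIC_64 (0xFEEDFACF) as it appears on disk in little-endian order.
    data.starts_with(&[0xCF, 0xFA, 0xED, 0xFE])
}
```

Marking the fence `no_run` instead would keep the real `std::fs::read` call visible and compiled, at the cost of no longer exercising the example as a test.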
This is still copying data, so not great.
This is for object files, these are not object files.
We now actually look up the hidden symbols correctly
And fixup utf-8 errors
this test is weird
Doesn't belong here, certainly not in the public api
I kind of liked it, but whatever.
Since the iterator does not depend on the lifetime of the object we need to clone and there's no reasonable way around. This clones a vector of pointers, so it is not as bad as it could be.
this simplifies the macho symbol iterator as well now which always returns slices of the same lifetime.
This makes usage more convenient, see the huge simplification on the MachObject symbols iterator.
I don't understand why this only surfaced in this PR, but whatever.
///
/// If the name matches the `__hidden#NNN_` pattern that indicates a [`BCSymbolMap`]
/// lookup, it will be looked up and the resolved name will be returned. Otherwise the name
/// is returned unchanged.
nit: Bonus points for giving the lookup methods an illustrative `# Example`.
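To illustrate the lookup rule described in that doc comment, here is a rough standalone sketch. It is not the `BCSymbolMap` API from this PR: `hidden_index` and `resolve` are hypothetical helpers, the symbol map is modelled as a plain slice of names, and returning the name unchanged for an out-of-range index is an assumption.

```rust
/// Parses a name of the form `__hidden#NNN_` and returns `NNN`.
/// Returns `None` if the name does not match the pattern.
fn hidden_index(name: &str) -> Option<usize> {
    name.strip_prefix("__hidden#")?
        .strip_suffix('_')?
        .parse()
        .ok()
}

/// Resolves `name` against `names` (hypothetical stand-in for a symbol map).
/// Names that do not match the pattern, or whose index is out of range
/// (an assumption), are returned unchanged.
fn resolve<'a>(names: &[&'a str], name: &'a str) -> &'a str {
    hidden_index(name)
        .and_then(|index| names.get(index).copied())
        .unwrap_or(name)
}
```

With `names = ["-[Foo bar]", "_main"]`, `resolve(&names, "__hidden#1_")` yields `"_main"`, while `resolve(&names, "_plain")` yields `"_plain"`.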
👍🏻 This looks really straightforward so far.
Can you double check my question regarding filenames?
if pattern.len() > bytes.len() {
    pattern = &pattern[..bytes.len()];
}
This means that a truncated file would pass this check. Is this really what we want? Or is it a limitation in that `test` only gets called with a small slice of the file?
Good point, actually all you want to do is `bytes.starts_with(pattern)`, right?
Yes, the assumption is that this only gets called with a small slice. This is kind of mimicking what `ObjectLike::test` does: this test is sufficient to decide between "is this an object? (`Archive::test`)", "is this a BCSymbolMap? (`BCSymbolMap::test`)" or "is this a PList? (`PList::test`)". So in a way the API assumption here is very weak and `test` is only useful in relation to the other test functions.
On the other hand, to call `::parse()`, which is the only thing that actually gives you a full answer to this question, you need to read all the data from disk. So this is useful as a quick check of whether it's even worth reading the entire file from disk. Again, this only makes sense if a BCSymbolMap is regularly larger than the filesystem block size, but it seems that is the case.
Anyway, TL;DR is that this is symmetric with `ObjectLike` and makes sense when using it in some of the workflows we do.
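The difference between the two checks discussed in this thread can be sketched as follows. This is an illustration rather than the PR's code: the `HEADER` value, both function names, and the final `starts_with` call in the lenient variant are assumptions.

```rust
/// Assumed header line of a BCSymbolMap file (illustrative value).
const HEADER: &[u8] = b"BCSymbolMap Version: 2.0";

/// Lenient check: if fewer bytes than the header are available, only
/// compare the part that is there. A truncated file passes this check.
fn test_lenient(bytes: &[u8]) -> bool {
    let mut pattern = HEADER;
    if pattern.len() > bytes.len() {
        pattern = &pattern[..bytes.len()];
    }
    bytes.starts_with(pattern)
}

/// Strict check: the full header must be present at the start.
fn test_strict(bytes: &[u8]) -> bool {
    bytes.starts_with(HEADER)
}
```

For the input `b"BCSym"` the lenient variant returns `true` and the strict one `false`. Both agree once the caller passes at least `HEADER.len()` bytes, which is the "small slice of the file" scenario described above.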
}

#[test]
fn test_data_lifetime() {
nice!
/Users/philipphofmann/git-repos/sentry-cocoa/Sources/SentryCrash/Recording/Monitors/SentryCrashMonitor_Signal.c
Sources/SentryCrash/Recording/Monitors/SentryCrashMonitor_Signal.c
I’m a bit surprised to find (absolute) filenames here. Are those being obfuscated as well? Do we need to resolve them when we extract file/line info?
Huh, interesting. I must admit I did not check this file... Yes, I imagine that this is also obfuscated. I guess the question is whether we need to de-obfuscate other things than just symbols as well. We should look at this sometime.
If users want this they'll have to newtype it.
Co-authored-by: Arpad Borsos <[email protected]>
Use some indicators of the pathnames inside an xcarchive instead which is more commonly what people might be finding themselves navigating.
Otherwise we clone the entire vector again which could be a large number of entries, even if the entries are small. There is no real downside to using an Arc here as this is entirely internal.
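As a rough illustration of the trade-off in that commit message: wrapping the vector in an `Arc` makes handing it to an iterator a reference-count bump instead of a copy of every entry. The `SymbolMap` type and its methods here are hypothetical stand-ins, not the types from this PR.

```rust
use std::sync::Arc;

/// Hypothetical symbol map that borrows its names from the input data.
struct SymbolMap<'d> {
    names: Arc<Vec<&'d str>>,
}

impl<'d> SymbolMap<'d> {
    /// Collects one name per line; only string slices are stored, no copies.
    fn parse(data: &'d str) -> Self {
        SymbolMap {
            names: Arc::new(data.lines().collect()),
        }
    }

    /// Returns a shared handle to the names. This increments a reference
    /// count and does not clone the vector itself.
    fn names_handle(&self) -> Arc<Vec<&'d str>> {
        Arc::clone(&self.names)
    }
}
```

A handle obtained this way can outlive the borrow of the `SymbolMap` itself (though not the underlying data), which is what an iterator that "does not depend on the lifetime of the object" needs.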
There is one TODO left to decide, but otherwise this is pretty much done.