Plans for macOS Big Sur system libraries in the dyld shared cache? #268

Closed
mstange opened this issue Dec 17, 2020 · 4 comments · Fixed by #308

Comments

@mstange

mstange commented Dec 17, 2020

Hi,

Starting with macOS Big Sur, it has become a fair bit harder to obtain symbols from most macOS system libraries, e.g. for profilers or stack traces: the libraries are no longer present as files in the file system; instead, they live in a system-wide dyld shared cache file. See these Google results for more information.
So one way to get at the symbols would be to extract the libraries from the cache. Another option, when running inside the process that has the relevant libraries loaded, would be to find the mappings in the process memory and get the symbol table from there.
Do you know if somebody has already started working on Rust code to parse the dyld_shared_cache format and write an extractor? Or whether there are any other plans to make getting symbols for macOS system libraries easier in the Rust ecosystem?

I imagine this would also be needed to get proper symbols in Rust panic stack traces, for example. (Though those seem to be broken at the moment even before Big Sur, see rust-lang/rust#79831.)

@philipc

philipc commented Dec 17, 2020

> Do you know if somebody has already started working on rust code to parse the dyld_shared_cache format and write an extractor? Or whether there are any other plans to make getting symbols for macOS system libraries easier in the rust ecosystem?

No, I don't. If it's something that is needed for backtraces, then it might be appropriate to add parts of the parsing to this crate.

> (Though those seem to be broken at the moment even before Big Sur, see rust-lang/rust#79831)

I wasn't aware of that. It might be better to create that issue in the backtrace-rs crate. Sounds like a bad bias calculation somewhere.

@jrmuizel

It would be interesting to check what lldb does.

@philipc

philipc commented Jan 15, 2021

From a quick look, lldb gets symbols from the images loaded in memory of the target process, rather than parsing the shared cache itself. It also has the ability to use the images in the host process if the UUID matches, since this is faster. There is some dyld shared cache parsing code, but it is only for iOS. (I've mostly based this on https://reviews.llvm.org/D83023).

@mstange

mstange commented May 17, 2021

I've started making an attempt at this in https://github.com/mstange/object/commits/dyld_cache.
(Edit: This seems to require a lot more work before it can spit out reasonable symbols.)
(Edit: Never mind, it just needed a new API, `MachOFile::parse_at_offset`, so that absolute offsets like `symoff` are treated correctly; this cannot be done just by subsetting the input. Now it's working correctly.)

mstange added a commit to mstange/object that referenced this issue May 18, 2021
This allows parsing Mach-O images inside dyld shared cache files (gimli-rs#268):
The dyld shared cache contains multiple images at different offsets; all these
images share the same address space for absolute offsets such as symoff. Due to
these absolute offsets, one cannot just parse the images by subsetting the input
slice and parsing at header offset zero.

This patch is a breaking change because it adds a header_offset argument to the
MachHeader methods load_commands and uuid, and MachHeader is part of the public API.
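
To illustrate the problem this commit message describes, here is a minimal sketch using a made-up toy layout rather than real Mach-O structures or the object crate's API. The `ToyHeader` type, its `symoff` field, and all helper functions below are hypothetical. The only point is that an offset stored relative to the start of the whole cache file resolves incorrectly once the input is subsetted to begin at the image header.

```rust
/// Toy stand-in for an image header; this is NOT the real Mach-O layout.
/// `symoff` is an absolute offset from the start of the whole cache file.
struct ToyHeader {
    symoff: usize,
}

/// Reads a little-endian u32 `symoff` field located at `header_offset` in `data`.
fn read_header(data: &[u8], header_offset: usize) -> Option<ToyHeader> {
    let bytes = data.get(header_offset..header_offset.checked_add(4)?)?;
    let mut buf = [0u8; 4];
    buf.copy_from_slice(bytes);
    Some(ToyHeader {
        symoff: u32::from_le_bytes(buf) as usize,
    })
}

/// Keeps the full cache data and passes the header offset separately,
/// so `symoff` is resolved against the start of the cache.
fn symbols_at_offset(cache: &[u8], header_offset: usize, len: usize) -> Option<&[u8]> {
    let header = read_header(cache, header_offset)?;
    cache.get(header.symoff..header.symoff.checked_add(len)?)
}

/// Subsets the input so the header sits at offset zero. `symoff` is then
/// resolved against the start of the image, which is the wrong base.
fn symbols_from_subset(cache: &[u8], header_offset: usize, len: usize) -> Option<&[u8]> {
    let image = cache.get(header_offset..)?;
    let header = read_header(image, 0)?;
    image.get(header.symoff..header.symoff.checked_add(len)?)
}

/// Builds a 48-byte toy cache: a header at offset 16 whose `symoff` is 40,
/// and the bytes "SYMS" stored at absolute offset 40.
fn toy_cache() -> Vec<u8> {
    let mut cache = vec![0u8; 48];
    cache[16..20].copy_from_slice(&40u32.to_le_bytes());
    cache[40..44].copy_from_slice(b"SYMS");
    cache
}
```

With the toy cache, `symbols_at_offset(&cache, 16, 4)` finds the `SYMS` bytes, while `symbols_from_subset(&cache, 16, 4)` looks 40 bytes past the image start, which is beyond the end of the subsetted slice.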
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Sep 1, 2021
…from the dyld shared cache. r=canaltinova

This lets us obtain symbols for macOS system libraries on macOS 11+
even if these symbols are not present on the Mozilla symbol server.

Some background for this is described in gimli-rs/object#268.

This patch makes use of the syntax `dyldcache:<dyldcachepath>:<librarypath>`.
There is some code in the profiler-get-symbols wasm integration which parses this syntax
and turns it into a `CandidatePathInfo::InDyldCache` enum value.
And profiler-get-symbols itself will then check the dyld shared cache for the requested
library, and parse it from there.
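
As a rough sketch of how the `dyldcache:<dyldcachepath>:<librarypath>` syntax could be split apart, consider the following. The enum here only borrows the `CandidatePathInfo::InDyldCache` name from the text above; its fields, the `SingleFile` variant, and the `parse_candidate_path` function are assumptions made for illustration, and the actual profiler-get-symbols code may look different. The sketch also assumes that the cache path itself contains no colon.

```rust
/// Hypothetical mirror of the enum named in the commit message; the real
/// type may have different variants and field names.
#[derive(Debug, PartialEq)]
enum CandidatePathInfo {
    /// A library stored as an ordinary file on disk.
    SingleFile(String),
    /// A library that only exists inside a dyld shared cache file.
    InDyldCache {
        dyld_cache_path: String,
        dylib_path: String,
    },
}

/// Parses `dyldcache:<dyldcachepath>:<librarypath>`; anything that does not
/// match that shape is treated as a plain file path.
fn parse_candidate_path(s: &str) -> CandidatePathInfo {
    if let Some(rest) = s.strip_prefix("dyldcache:") {
        // Split at the first colon: cache path on the left, library path on the right.
        if let Some((cache, dylib)) = rest.split_once(':') {
            if !cache.is_empty() && !dylib.is_empty() {
                return CandidatePathInfo::InDyldCache {
                    dyld_cache_path: cache.to_string(),
                    dylib_path: dylib.to_string(),
                };
            }
        }
    }
    CandidatePathInfo::SingleFile(s.to_string())
}
```

Falling back to a plain file path for malformed input is a design choice of this sketch; returning an error instead would be equally reasonable.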

You can run the following pieces of code in the Firefox error console to
test whether this patch is working, on macOS 11 or above:

```
var { createLocalSymbolicationService } = ChromeUtils.import("resource://devtools/client/performance-new/symbolication.jsm.js");
var service = createLocalSymbolicationService(Services.profiler.sharedLibraries, []);
var appkit = Services.profiler.sharedLibraries.find(l => l.name == "AppKit");
var [addrs, index, buffer] = await service.getSymbolTable(appkit.debugName, appkit.breakpadId);
addrs.length
```

```
var { createLocalSymbolicationService } = ChromeUtils.import("resource://devtools/client/performance-new/symbolication.jsm.js");
var service = createLocalSymbolicationService(Services.profiler.sharedLibraries, []);
var appkit = Services.profiler.sharedLibraries.find(l => l.name == "AppKit");
var request = JSON.stringify({ memoryMap: [[appkit.name, appkit.breakpadId]], stacks: [[[0, 0x12f00d]]] });
var response = JSON.parse(await service.querySymbolicationApi("/symbolicate/v5", request));
response.results[0].stacks[0][0]
```

Before this patch, getSymbolTable would throw an error (file not found), and
querySymbolicationApi would return an object without a function name.
With this patch, getSymbolTable finds all the symbols in AppKit, and
querySymbolicationApi returns the correct function name.

Depends on D123815

Differential Revision: https://phabricator.services.mozilla.com/D123816
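
The commit message above destructures the result of `getSymbolTable` into `[addrs, index, buffer]`. A minimal sketch of how such a table could be queried for the function containing an address follows. The layout is an assumption, not something the thread states: `addrs` is taken to be sorted ascending, `index` to have one more entry than `addrs`, and entry `i` of `addrs` to be named by the UTF-8 bytes `buffer[index[i]..index[i + 1]]`. The `lookup_symbol` function is hypothetical.

```rust
/// Returns the name of the last symbol whose start address is <= `address`.
/// Assumes `addrs` is sorted ascending and `index` delimits names in `buffer`.
fn lookup_symbol<'a>(
    addrs: &[u32],
    index: &[u32],
    buffer: &'a [u8],
    address: u32,
) -> Option<&'a str> {
    // Number of symbols that start at or before `address` (binary search).
    let count = addrs.partition_point(|&start| start <= address);
    // No symbol starts at or before the address: nothing to report.
    let i = count.checked_sub(1)?;
    let name_start = *index.get(i)? as usize;
    let name_end = *index.get(i + 1)? as usize;
    std::str::from_utf8(buffer.get(name_start..name_end)?).ok()
}
```

For example, with start addresses `[0x1000, 0x2000, 0x3000]` and names `alpha`, `beta`, `gamma` packed back to back in `buffer`, an address of `0x2abc` falls inside the second symbol.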
spinda pushed a commit to PLSysSec/cachet-firefox that referenced this issue Sep 8, 2021
mcbegamerxx954 pushed a commit to mcbegamerxx954/object that referenced this issue Jun 15, 2024