Correctly handle dyld caches on macOS 13 and above #642
This allows successful parsing of dyld caches on macOS 13 and above on Intel Macs.

The main dyld cache file on macOS contains an array of subcache info structs, each of which specifies the UUID (and some other information) of each subcache. `DyldCache::parse` checks that the subcache UUIDs match these expected UUIDs. In macOS 13, the format of the subcache info struct changed: it gained an additional field after the UUID field. This means that as soon as you had more than one subcache, our UUID check would fail, because the second subcache UUID would be read from the wrong offset.

I didn't notice this on my Apple Silicon Mac, because the arm64e dyld cache only has one subcache: `dyld_shared_cache_arm64e.01`. But on Intel Macs, there are currently four subcaches: `dyld_shared_cache_x86_64.01`, `.02`, `.03`, and `.04`. In practice this means that my software hasn't been able to symbolicate macOS system libraries on Intel Macs since the release of macOS 13.

This commit adds the new struct definition and makes the UUID check work correctly.

This is a breaking change to the public API. I added a `DyldSubCacheSlice` enum, but I'm not particularly fond of it. I'm also not a big fan of the new allocation for the Vec of UUIDs, but it seemed better than the alternatives I tried, which all had a bunch of code duplication.
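To illustrate the offset problem described above, here is a minimal sketch, not the code from this PR. It assumes the old entry is 24 bytes (a 16-byte UUID plus an 8-byte VM offset) and the new entry is 56 bytes (the same fields plus a 32-byte file suffix); those sizes and all names below are assumptions for illustration and should be checked against the dyld headers.

```rust
/// Size of a UUID in bytes.
const UUID_LEN: usize = 16;
/// ASSUMPTION: pre-macOS 13 entry = 16-byte UUID + 8-byte VM offset.
const ENTRY_SIZE_V1: usize = 24;
/// ASSUMPTION: macOS 13+ entry = v1 fields + 32-byte file suffix.
const ENTRY_SIZE_V2: usize = 56;

/// Reads the UUID at the start of each entry, stepping `entry_size` bytes
/// between entries. Returns `None` if the buffer is too short.
/// (Hypothetical helper, not part of the `object` crate API.)
fn read_subcache_uuids(
    data: &[u8],
    count: usize,
    entry_size: usize,
) -> Option<Vec<[u8; UUID_LEN]>> {
    let mut uuids = Vec::with_capacity(count);
    for i in 0..count {
        let start = i.checked_mul(entry_size)?;
        let end = start.checked_add(UUID_LEN)?;
        let bytes = data.get(start..end)?;
        let mut uuid = [0u8; UUID_LEN];
        uuid.copy_from_slice(bytes);
        uuids.push(uuid);
    }
    Some(uuids)
}

/// Builds a synthetic array of two new-format entries whose UUIDs are
/// filled with 0xAA and 0xBB so they are easy to recognize.
fn two_v2_entries() -> Vec<u8> {
    let mut data = vec![0u8; 2 * ENTRY_SIZE_V2];
    data[..UUID_LEN].fill(0xAA);
    data[ENTRY_SIZE_V2..ENTRY_SIZE_V2 + UUID_LEN].fill(0xBB);
    data
}
```

Reading `two_v2_entries()` with `ENTRY_SIZE_V2` returns both UUIDs. Reading it with `ENTRY_SIZE_V1` still returns the first UUID correctly, which is why a cache with a single subcache never triggers the failure, but the second "UUID" comes from the middle of the first entry.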
Are you aware of any easy way to test this without owning Apple hardware (both Apple silicon and Intel)? The best I have currently is using whatever machines CI provides.
Currently the examples (objdump and dyldcachedump) only load subcaches with either an integer suffix or ".symbols" suffix, and
The two ways I can think of involve quite a bit of effort:
Oh, I had forgotten about these examples. You're right that they currently won't work - on macOS 13, the suffixes now have a leading zero. With the current state of this PR, users of
The dyldcachedump example was already taking the leading zero into account, so it was working, modulo the UUID verification issue. The objdump example was not using a leading zero so it failed to load the subcaches and said "Failed to parse file: Unsupported file format".
5901677 to accd7e5
dyldcachedump was working correctly on macOS 13+ because it was trying the "leading zero" suffix format as well as the "no leading zero" suffix format. This commit changes it to read the suffix from the main cache header. objdump was not able to parse dyld shared cache files on macOS 13+ because it was only using the "no leading zero" suffix format, and thus not finding the subcaches.
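The two suffix strategies mentioned above can be sketched roughly as follows. This is an illustrative sketch rather than the code in the examples: it assumes the suffix is stored as a fixed-size, NUL-padded byte field (32 bytes in the test data) and that the older format numbers subcaches from 1 without a leading zero. All function names are hypothetical.

```rust
/// Interprets a fixed-size, NUL-padded suffix field as a string slice.
/// Returns `None` if the bytes before the first NUL are not valid UTF-8.
/// (Hypothetical helper; the field layout is an assumption.)
fn suffix_from_field(field: &[u8]) -> Option<&str> {
    let len = field.iter().position(|&b| b == 0).unwrap_or(field.len());
    std::str::from_utf8(&field[..len]).ok()
}

/// ASSUMPTION: older caches name subcaches ".1", ".2", ... with a
/// 1-based index and no leading zero.
fn legacy_suffix(index: usize) -> String {
    format!(".{}", index + 1)
}

/// Appends a subcache suffix to the path of the main cache file.
fn subcache_path(main_path: &str, suffix: &str) -> String {
    format!("{}{}", main_path, suffix)
}
```

Taking the suffix from the cache data avoids guessing between `.1` and `.01`, which is the guess that made objdump miss the subcaches.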
accd7e5 to a24f21c
LGTM. I assume you want a release with this soon?
Thank you! A release would be appreciated.
dyld source code: the `dyld_cache_header` struct definition, and the `dyld_subcache_entry_v1` and `dyld_subcache_entry` struct definitions.

I've changed `MIN_HEADER_SIZE_SUBCACHES_V1` to be `0x1c8` instead of `0x1c4` because we use a >= check on it. dyld uses a > check with the offset of the `subCacheArrayCount` field (called `images_across_all_subcaches_count` in our definition), so what dyld is doing is equivalent to a `> 0x1c4` check. Rather than changing our check to be a > check, I've kept it as a >= because that fits better with the `MIN_HEADER_SIZE` name, and instead increased the offset to point to the end of the field instead of the beginning.

I didn't touch our definition of `DyldCacheHeader`. It's quite incomplete because the dyld source with the new header fields hadn't been released yet at the time I wrote the `object` code. Now that it's out, we could flesh out our type definition a bit, but I don't have an urgent need for that at the moment.
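The relationship between the two header size checks discussed above can be sketched like this. The two constants come from the description; the function names are hypothetical, and the sketch assumes the header size is a plain `u32`.

```rust
/// Offset of the start of the 4-byte count field (0x1c4 per the description).
const SUBCACHE_ARRAY_COUNT_OFFSET: u32 = 0x1c4;
/// Offset of the end of that field, used as the minimum header size.
const MIN_HEADER_SIZE_SUBCACHES_V1: u32 = 0x1c8;

/// dyld-style check: the header extends past the start of the field.
fn dyld_style_check(header_size: u32) -> bool {
    header_size > SUBCACHE_ARRAY_COUNT_OFFSET
}

/// Check as described in this PR: the header contains the whole field.
fn min_size_check(header_size: u32) -> bool {
    header_size >= MIN_HEADER_SIZE_SUBCACHES_V1
}
```

The two checks agree for every header size that is a multiple of 4. They differ only for sizes that would cut the field in half (0x1c5 to 0x1c7), where the `>=` form is the stricter one.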