steveisok
Member

@steveisok steveisok commented Sep 6, 2025

A fairly recent change in the Swift runtime fires the Swift backtrace handler by default when a process has an unhandled exception. This applies to most, if not all, of the macOS versions we support.

This produces messy, unclear backtrace output from the main host process. We can work around it by setting a no-op handler for SIGABRT before we call abort() in PROCAbort.

Contributes to #118823

For reasons yet to be confirmed, when an application throws an unhandled exception, the Swift backtrace handler appears to be on for various versions of macOS that we support.

This produces messy, unclear backtrace output from the main host process. It can be worked around by setting the environment variable SWIFT_BACKTRACE=enable=no.

Contributes to dotnet#118823
@Copilot Copilot AI review requested due to automatic review settings September 6, 2025 18:05
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR addresses an issue where the Swift backtrace handler interferes with .NET host process exception handling on macOS, causing unclear output when applications throw unhandled exceptions. The fix disables the Swift backtrace by setting the SWIFT_BACKTRACE=enable=no environment variable early in the host process startup.

Key Changes

  • Added macOS-specific environment variable setting to disable Swift backtrace functionality
  • Positioned the fix at the beginning of the main function to ensure it takes effect before any exception handling

@jkotas
Member

jkotas commented Sep 10, 2025

  • The code you have changed is not used on NativeAOT. Do we need the same change for NativeAOT as well?

On dummy signal handler vs. set env variable:

  • The environment variable won't kick in for hosted scenarios where the Swift runtime is loaded before the .NET runtime. Does the dummy signal solution have the same problem?
  • Environment variables are inherited, so any child processes will get the modified behavior as well. Dummy signal handler would be limited to the specific process only.

@vcsjones
Member

The environment variable won't kick in for hosted scenarios where the Swift runtime is loaded before the .NET runtime. Does the dummy signal solution have the same problem?

The signal handler gets installed here:

https://github.com/swiftlang/swift/blob/4fabc61c82991c26552c48d5176c939d03e803f1/stdlib/public/runtime/CrashHandlerMacOS.cpp#L113

By _swift_installCrashHandler

Which is called by BacktraceInitializer::BacktraceInitializer() https://github.com/swiftlang/swift/blob/4fabc61c82991c26552c48d5176c939d03e803f1/stdlib/public/runtime/Backtrace.cpp#L535

This is the same place that the environment variable is read.

It seems very likely to me, then, that the signal handler needs to be installed before the Swift runtime is loaded.

@MichalPetryka
Contributor

The environment variable won't kick in for hosted scenarios where the Swift runtime is loaded before the .NET runtime. Does the dummy signal solution have the same problem?

The signal handler gets installed here:

https://github.com/swiftlang/swift/blob/4fabc61c82991c26552c48d5176c939d03e803f1/stdlib/public/runtime/CrashHandlerMacOS.cpp#L113

By _swift_installCrashHandler

Which is called by BacktraceInitializer::BacktraceInitializer() https://github.com/swiftlang/swift/blob/4fabc61c82991c26552c48d5176c939d03e803f1/stdlib/public/runtime/Backtrace.cpp#L535

This is the same place that the environment variable is read.

It seems very likely to me, then, that the signal handler needs to be installed before the Swift runtime is loaded.

Can't the signal handler be overwritten with another one?

@steveisok
Member Author

Can't the signal handler be overwritten with another one?

My read of it is that the Swift signal handler won't install if one is already set.

@steveisok
Member Author

This is the same place that the environment variable is read.

It seems very likely to me, then, that the signal handler needs to be installed before the Swift runtime is loaded.

Fair to conclude we should go with the signal handler approach?

@MichalPetryka
Contributor

My read of it is that the Swift signal handler won't install if one is already set.

Yeah but I mean, couldn't dotnet replace the already installed Swift one?

@jkotas
Member

jkotas commented Sep 10, 2025

Fair to conclude we should go with the signal handler approach?

There is a range of options from small hammer to big hammer, with a number of options in between:

  • Small hammer: Override the SIGABRT signal handler with a no-op signal handler right before calling abort() from our code, after we have displayed our unhandled exception information. Everything else (other signals, abort called from other code) gets the default behavior.

  • Big hammer: Set the environment variable unconditionally to disable the Swift handler everywhere, for all signals, including child processes. (The current PR state is a notch below this - it sets the environment variable only if it is not set.)
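
For reference, the "set only if it is not set" behavior mentioned above maps onto the overwrite argument of POSIX setenv. A minimal sketch, where disable_swift_backtrace_if_unset is a hypothetical helper name and not the code from this PR:

```c
#include <stdlib.h>

/* Sets SWIFT_BACKTRACE=enable=no unless the variable already has a value.
 * Returns 0 on success and -1 on failure, as setenv does. */
int disable_swift_backtrace_if_unset(void)
{
    /* overwrite == 0 keeps any value the user has already provided. */
    return setenv("SWIFT_BACKTRACE", "enable=no", 0);
}
```

Because the environment is inherited across fork and exec, child processes started afterwards see the same value, which is the wider-scope concern raised earlier in the thread.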

There is no obvious winner. If I were to pick, I would go with the smallest hammer possible. It works best for runtimes to be as minimally invasive as possible and to avoid "owning the process", since that prevents multiple runtimes from peacefully co-existing in the same process.

@steveisok
Member Author

  • Small hammer: Override the SIGABRT signal handler with a no-op signal handler right before calling abort() from our code, after we have displayed our unhandled exception information. Everything else (other signals, abort called from other code) gets the default behavior.

I pushed a change that registers a noop handler in PROCAbort. Works nicely. Happy to make whatever adjustments are necessary.

@steveisok steveisok changed the title from "[host] Set SWIFT_BACKTRACE to enable=no on OSX" to "[host] Prevent swift backtrace handler from firing when the runtime aborts on OSX" Sep 11, 2025
@lateralusX lateralusX self-requested a review September 12, 2025 11:36
Member

@lateralusX lateralusX left a comment

LGTM

@steveisok
Member Author

@jkotas @elinor-fung are you two good with this change? I'd like to merge and get the servicing approvals going.

@jkotas
Member

jkotas commented Sep 12, 2025

I do not see this comment addressed:

The code you have changed is not used on NativeAOT. Do we need the same change for NativeAOT as well?

Member

@janvorli janvorli left a comment

LGTM, thank you!

Member

@jkotas jkotas left a comment

I am not seeing a good explanation of what makes NativeAOT immune to this problem. There seems to be something missing in the explanation of what's going on.

@steveisok
Member Author

I am not seeing a good explanation of what makes NativeAOT immune to this problem. There seems to be something missing in the explanation of what's going on.

After more review, I think the explanation is that the Swift backtrace handler gets linked out. I suspect that's always going to happen because the handler is never going to get hit by any of our code since it only runs in a global constructor.

To prove this, I modified my local sample to use AesGcm, since that and ChaCha20Poly1305 are the only crypto algorithms that will pull the Swift runtime in. I then added a breakpoint for sigaction and observed all of the callers. Only spots in the NativeAOT runtime hit it.

@jkotas
Member

jkotas commented Sep 16, 2025

After more review, I think the explanation is that the Swift backtrace handler gets linked out. I suspect that's always going to happen because the handler is never going to get hit by any of our code since it only runs in a global constructor.

I do not think that's right. I have set a breakpoint at Swift backtrace initialization. I see it being called in both regular CoreCLR and NativeAOT.

In NativeAOT, it is called during process startup at:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = instruction step over
  * frame #0: 0x0000000197754238 libswiftCore.dylib`_GLOBAL__sub_I_Backtrace.cpp + 128
    frame #1: 0x0000000184202cb0 dyld`invocation function for block in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 444
    frame #2: 0x0000000184240730 dyld`invocation function for block in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 324
    frame #3: 0x000000018425f540 dyld`invocation function for block in mach_o::Header::forEachSection(void (mach_o::Header::SectionInfo const&, bool&) block_pointer) const + 312
    frame #4: 0x000000018425c164 dyld`mach_o::Header::forEachLoadCommand(void (load_command const*, bool&) block_pointer) const + 208
    frame #5: 0x000000018425d9fc dyld`mach_o::Header::forEachSection(void (mach_o::Header::SectionInfo const&, bool&) block_pointer) const + 124
    frame #6: 0x0000000184240220 dyld`dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 516
    frame #7: 0x0000000184202a68 dyld`dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 172
    frame #8: 0x000000018420e6a8 dyld`dyld4::PrebuiltLoader::runInitializers(dyld4::RuntimeState&) const + 44
    frame #9: 0x0000000184203214 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&, dyld3::Array<dyld4::Loader const*>&) const + 308
    frame #10: 0x00000001842031b4 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&, dyld3::Array<dyld4::Loader const*>&) const + 212
    frame #11: 0x00000001842031b4 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&, dyld3::Array<dyld4::Loader const*>&) const + 212
    frame #12: 0x0000000184207e50 dyld`dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const::$_0::operator()() const + 180
    frame #13: 0x0000000184203530 dyld`dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const + 716
    frame #14: 0x000000018422504c dyld`dyld4::APIs::runAllInitializersForMain() + 400
    frame #15: 0x00000001841e7158 dyld`dyld4::prepare(dyld4::APIs&, mach_o::Header const*) + 3112
    frame #16: 0x00000001841e5d04 dyld`start + 7104

In regular CoreCLR, it gets called when libcoreclr.dylib is loaded at:

* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000184452194 libsystem_c.dylib`sigaction
    frame #1: 0x00000001977547f0 libswiftCore.dylib`_swift_installCrashHandler + 84
    frame #2: 0x0000000197754670 libswiftCore.dylib`_GLOBAL__sub_I_Backtrace.cpp + 1208
    frame #3: 0x0000000184202cb0 dyld`invocation function for block in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 444
    frame #4: 0x0000000184240730 dyld`invocation function for block in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 324
    frame #5: 0x000000018425f540 dyld`invocation function for block in mach_o::Header::forEachSection(void (mach_o::Header::SectionInfo const&, bool&) block_pointer) const + 312
    frame #6: 0x000000018425c164 dyld`mach_o::Header::forEachLoadCommand(void (load_command const*, bool&) block_pointer) const + 208
    frame #7: 0x000000018425d9fc dyld`mach_o::Header::forEachSection(void (mach_o::Header::SectionInfo const&, bool&) block_pointer) const + 124
    frame #8: 0x0000000184240220 dyld`dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 516
    frame #9: 0x0000000184202a68 dyld`dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 172
    frame #10: 0x000000018420e6a8 dyld`dyld4::PrebuiltLoader::runInitializers(dyld4::RuntimeState&) const + 44
    frame #11: 0x0000000184203214 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&, dyld3::Array<dyld4::Loader const*>&) const + 308
    frame #12: 0x00000001842031b4 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&, dyld3::Array<dyld4::Loader const*>&) const + 212
    frame #13: 0x00000001842031b4 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&, dyld3::Array<dyld4::Loader const*>&) const + 212
    frame #14: 0x00000001842031b4 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&, dyld3::Array<dyld4::Loader const*>&) const + 212
    frame #15: 0x0000000184207e50 dyld`dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const::$_0::operator()() const + 180
    frame #16: 0x0000000184203530 dyld`dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const + 716
    frame #17: 0x0000000184227ad0 dyld`dyld4::APIs::dlopen_from(char const*, int, void*)::$_0::operator()() const + 1856
    frame #18: 0x000000018421c36c dyld`dyld4::APIs::dlopen_from(char const*, int, void*) + 1104
    frame #19: 0x000000018421be70 dyld`dyld4::APIs::dlopen(char const*, int) + 128
    frame #20: 0x0000000104a552d0 libhostpolicy.dylib`pal::load_library(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const*, void**) + 72
    frame #21: 0x0000000104a25250 libhostpolicy.dylib`coreclr_resolver_t::resolve_coreclr(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, coreclr_resolver_contract_t&) + 128
    frame #22: 0x0000000104a262cc libhostpolicy.dylib`coreclr_t::create(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, char const*, char const*, coreclr_property_bag_t const&, std::__1::unique_ptr<coreclr_t, std::__1::default_delete<coreclr_t>>&) + 64
    frame #23: 0x0000000104a3edf8 libhostpolicy.dylib`(anonymous namespace)::create_coreclr() + 648
    frame #24: 0x0000000104a3e734 libhostpolicy.dylib`corehost_main + 160
    frame #25: 0x00000001049bed50 libhostfxr.dylib`fx_muxer_t::handle_exec_host_command(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, host_startup_info_t const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::unordered_map<known_options, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>, known_options_hash, std::__1::equal_to<known_options>, std::__1::allocator<std::__1::pair<known_options const, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>>>> const&, int, char const**, int, host_mode_t, bool, char*, int, int*) + 1152
    frame #26: 0x00000001049bdcac libhostfxr.dylib`fx_muxer_t::execute(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, int, char const**, host_startup_info_t const&, char*, int, int*) + 1048
    frame #27: 0x00000001049b939c libhostfxr.dylib`hostfxr_main_startupinfo + 128
    frame #28: 0x0000000100004d4c repro`exe_start(int, char const**) + 1368
    frame #29: 0x0000000100005018 repro`main + 184
    frame #30: 0x00000001841e5d54 dyld`start + 7184

Swift backtrace initialization fails in a NativeAOT HelloWorld binary. You can see the failure if you enable tracing via an environment variable:

jkotas@jkotas-m1 publish % export SWIFT_BACKTRACE=enable=yes,warnings=on
jkotas@jkotas-m1 publish % ./repro                                      
swift runtime: backtrace-on-crash is not supported for privileged executables.
Unhandled exception. System.Exception: Exception of type 'System.Exception' was thrown.
   at Program.<Main>$(String[] args) + 0x34
zsh: abort      ./repro

Here is the code that emits this warning: https://github.com/swiftlang/swift/blob/01d458b7a36c1fb971f32f4e6163b56d59b21d73/stdlib/public/runtime/Backtrace.cpp#L376-L392 . NativeAOT binaries are considered privileged since they are missing the CS_GET_TASK_ALLOW entitlement checked at https://github.com/swiftlang/swift/blob/01d458b7a36c1fb971f32f4e6163b56d59b21d73/stdlib/public/runtime/Backtrace.cpp#L300 .

(This is what I have found so far. I plan to look into how we deal with CS_GET_TASK_ALLOW a bit more.)

@jkotas
Member

jkotas commented Sep 17, 2025

The issue is a regression introduced between the .NET 10 Preview 6 and .NET 10 Preview 7 SDKs. The app hosts produced by .NET 10 Preview 7 and newer SDKs have this issue, irrespective of the target runtime. For example, a .NET 8 app published by the .NET 10 SDK is going to hit this issue as well. See the log from my experiments below.

@jtschuster @agocke This looks like a regression introduced by the .NET 10 managed signer (#108992 and follow-up PRs). Is it expected that the signatures produced by the managed signer result in behavior changes like this?

(It is certainly an option to work around the regression in the runtime. The workaround for the regression will have to be backported to .NET 8 and .NET 9 to make sure that the .NET 10 SDK can continue to target older runtimes. It would be preferable to fix the root cause of the regression instead.)

The issue does not reproduce for apps launched using dotnet myapp.dll. This makes sense since dotnet is not signed by the managed signer.
The issue does not reproduce for NativeAOT compiled apps (unless they are manually codesigned with debug entitlements using the codesign tool - I do not think it is a scenario we need to worry about).

.NET 9 app built using .NET SDK 9.0.305 - not affected:
jkotas@jkotas-m1 repro % cat global.json
{
  "sdk": {
    "version": "9.0.305",
    "rollForward": "disable"
  }
}
jkotas@jkotas-m1 repro % dotnet build   
Restore complete (0.2s)
  repro succeeded (0.1s) → bin/Debug/net9.0/repro.dll

Build succeeded in 0.5s
jkotas@jkotas-m1 repro % ./bin/Debug/net9.0/repro
Unhandled exception. System.Exception: Exception of type 'System.Exception' was thrown.
   at Program.<Main>$(String[] args) in /Users/jkotas/repro/Program.cs:line 3
zsh: abort      ./bin/Debug/net9.0/repro
jkotas@jkotas-m1 repro % 

.NET 9 app built using .NET 10 Preview 6 - not affected:

jkotas@jkotas-m1 repro % cat global.json 
{
  "sdk": {
    "version": "10.0.100-preview.6.25358.103",
    "rollForward": "disable"
  }
}
jkotas@jkotas-m1 repro % dotnet build    
Restore complete (0.3s)
You are using a preview version of .NET. See: https://aka.ms/dotnet-support-policy
  repro succeeded (0.2s) → bin/Debug/net9.0/repro.dll

Build succeeded in 0.6s
jkotas@jkotas-m1 repro % ./bin/Debug/net9.0/repro
Unhandled exception. System.Exception: Exception of type 'System.Exception' was thrown.
   at Program.<Main>$(String[] args) in /Users/jkotas/repro/Program.cs:line 3
zsh: abort      ./bin/Debug/net9.0/repro
jkotas@jkotas-m1 repro % 

.NET 9 app built using .NET 10 Preview 7 - affected:

jkotas@jkotas-m1 repro % cat global.json 
{
  "sdk": {
    "version": "10.0.100-preview.7.25380.108",
    "rollForward": "disable"
  }
}
jkotas@jkotas-m1 repro % dotnet build    
Restore complete (0.3s)
    info NETSDK1057: You are using a preview version of .NET. See: https://aka.ms/dotnet-support-policy
  repro succeeded (0.2s) → bin/Debug/net9.0/repro.dll

Build succeeded in 0.6s
jkotas@jkotas-m1 repro % ./bin/Debug/net9.0/repro
Unhandled exception. System.Exception: Exception of type 'System.Exception' was thrown.
   at Program.<Main>$(String[] args) in /Users/jkotas/repro/Program.cs:line 3

💣 Program crashed: Aborted at 0x0000000189dbe5b0

... swift crash reporter spew ...

Self-contained .NET 8 app published using .NET 10 Preview 7 - affected:

jkotas@jkotas-m1 repro % cat global.json 
{
  "sdk": {
    "version": "10.0.100-preview.7.25380.108",
    "rollForward": "disable"
  }
}
jkotas@jkotas-m1 repro % dotnet publish --self-contained
Restore complete (0.3s)
    info NETSDK1057: You are using a preview version of .NET. See: https://aka.ms/dotnet-support-policy
  repro succeeded (0.3s) → bin/Release/net8.0/osx-arm64/publish/

Build succeeded in 0.7s
jkotas@jkotas-m1 repro % bin/Release/net8.0/osx-arm64/publish/repro 
Unhandled exception. System.Exception: Exception of type 'System.Exception' was thrown.
   at Program.<Main>$(String[] args) in /Users/jkotas/repro/Program.cs:line 3

💣 Program crashed: Aborted at 0x0000000189dbe5b0

... swift crash reporter spew ...

@jtschuster
Member

This looks like a regression introduced by the .NET 10 managed signer (#108992 and follow-up PRs). Is it expected that the signatures produced by the managed signer result in behavior changes like this?

If this is related to entitlements, it could be a result of #116659 which preserves the entitlements of the apphost. This is the only expected different behavior between .NET 9 and 10 with the managed signer. If that's the issue, I'd expect it to repro if we re-sign dotnet with an ad-hoc signature while preserving the entitlements (codesign -s - --preserve-metadata=entitlements dotnet) and run the .dll with dotnet MyApp.dll.

@jkotas
Member

jkotas commented Sep 17, 2025

I'd expect it to repro if we re-sign dotnet with an ad-hoc signature while preserving the entitlements (codesign -s - --preserve-metadata=entitlements dotnet) and run the .dll with dotnet MyApp.dll.

Yes, this makes dotnet MyApp.dll display the swift debugger as well.

it could be a result of #116659 which preserves the entitlements of the apphost

What is the user-facing behavior improved by this PR? (In other words, what is going to break if this PR is reverted?)

@jtschuster
Member

What is the user-facing behavior improved by this PR? (In other words, what is going to break if this PR is reverted?)

The change was motivated by #113707. Apps that are signed with the hardened runtime would have to re-add the entitlements that were stripped from the apphost, which isn't obvious. There was at least one other internal team that is expecting the entitlements to be preserved after that PR, but I can notify them if we do need to revert the change.

@jkotas
Member

jkotas commented Sep 17, 2025

I think we should revert the change in the signer.

Apps that are signed with the hardened runtime would have to re-add the entitlements that were stripped from the apphost

If apps want to be signed with the hardened runtime, they should actively consider what entitlements they actually need and go with the minimal set possible. We should not be blindly copying everything for them. It is an insecure default.

which isn't obvious

This should be fixed by better documentation. The documentation should lead with recommending NativeAOT as the form-factor to use for hardened environments. I believe that NativeAOT runtime does not need any additional entitlements to function - is that right?

@jtschuster
Member

If apps want to be signed with the hardened runtime, they should actively consider what entitlements they actually need and go with the minimal set possible. We should not be blindly copying everything for them. It is an insecure default.

I agree copying all of the entitlements doesn't make sense, but it could make sense to add the JIT entitlement by default. I can't think of any way a framework-dependent or single-file application wouldn't need it. NativeAOT and interpreter-based applications are the only exceptions. NAOT doesn't use the managed signer; I don't know about interpreter apps.

@jkotas
Member

jkotas commented Sep 18, 2025

it could make sense to add the JIT entitlement by default

Yes, I agree that it would make sense to add the minimum entitlements that are required for the system to function at all, such as the JIT entitlement for runtimes with the JIT.

@agocke
Member

agocke commented Sep 18, 2025

If apps want to be signed with the hardened runtime

Note that this is effectively mandatory. All apps must enable the hardened runtime to be notarized. I think we should ensure that, even if we’re stripping entitlements, we’re adding back the minimum set needed for .NET to function.

However, I would be surprised if we’re adding more entitlements to the apphost than exactly that. I’m not sure which missing entitlement is blocking the swift backtrace.

@filipnavara
Member

However, I would be surprised if we’re adding more entitlements to the apphost than exactly that. I’m not sure which missing entitlement is blocking the swift backtrace.

Presumably the com.apple.security.get-task-allow entitlement. That specifies the application can be debugged and it's respected by both lldb and the Swift debugger.

@agocke
Member

agocke commented Sep 18, 2025

I guess we could keep the entitlement preservation change and remove that entitlement, but it wouldn’t help older frameworks. We could add a new set of entitlements during apphost publish. That may be a large change.

@vcsjones
Member

Is this "just" an entitlements problem? Presumably someone could apply the com.apple.security.get-task-allow entitlement to their NAOT binary themselves, and then the Swift debugger will kick in. Is that still desirable?

@steveisok
Member Author

Is this "just" an entitlements problem? Presumably someone could apply the com.apple.security.get-task-allow entitlement to their NAOT binary themselves, and then the Swift debugger will kick in. Is that still desirable?

I suspect not, but I'm waiting until we align on the entitlements before going there.

@agocke
Member

agocke commented Sep 18, 2025

I also thought ad hoc signing overwrote all these values. How is backtrace getting blocked when the apphost is ad hoc signed? Do some of the permissions still apply? Clearly JIT does because otherwise nothing from dotnet would work before the entitlements change.

@jkotas
Member

jkotas commented Sep 18, 2025

I guess we could keep the entitlement preservation change and remove that entitlement

It is not clear to me why all hardened apps need com.apple.security.cs.allow-dyld-environment-variables, com.apple.security.cs.disable-library-validation and com.apple.security.cs.debugger entitlements either.

(I understand why runtime w/ JIT needs com.apple.security.cs.allow-jit.)

Is this "just" an entitlements problem? Presumably someone could apply the com.apple.security.get-task-allow entitlement to their NAOT binary themselves, and then the Swift debugger will kick in. Is that still desirable?

There is no good answer here. If we go with the fix in this PR, somebody may want to get notified about all aborts by subscribing to the signal. This change will break them. I think we should shoot for a reasonable default experience (e.g. no Swift debugger by default) and stay out of the way as much as possible otherwise.

@jtschuster
Member

We could add a new set of entitlements

If we do more than strip all entitlements, I'd lean more towards this rather than tracking what to remove from the apphost entitlements. It also means we don't have to parse and edit the entitlements.

But I think the "right" way to do this would be to create an entitlements itemgroup in the sdk which is passed to the managed signer, and a cross-cutting change like that would be a fair bit of new code, even if we make it off by default. But we also could just keep the minimal set hardcoded in the managed signer and add the itemgroup for 11.

@jkotas
Member

jkotas commented Sep 18, 2025

How is backtrace getting blocked when the apphost is ad hoc signed? Do some of the permissions still apply?

It is not easy to reverse engineer all the details of how this works. (BacktraceInitializer::BacktraceInitializer that is on GitHub https://github.com/swiftlang/swift/blob/main/stdlib/public/runtime/Backtrace.cpp#L326C1-L326C43 does not seem to match exactly what's actually shipping in macOS. When I stepped through it, I ran into some extra logic that I was not able to match with the GitHub sources.)

@agocke
Member

agocke commented Sep 18, 2025

It also means we don't have to parse and edit the entitlements.

Overall, moving from preserving all entitlements to preserving a small, predetermined set makes sense to me. However, I'm worried about the lack of the get-task-allow entitlement blocking the Swift backtrace handler. It seems like it might have other effects. We should double check that you can still do things like attach a debugger to the process.

@janvorli
Member

We should double check that you can still do things like attach a debugger to the process.

I don't remember it clearly, but I think the get-task-allow entitlement was actually there to allow debugging.

@steveisok
Member Author

We should double check that you can still do things like attach a debugger to the process.

I don't remember it clearly, but I think the get-task-allow entitlement was actually there to allow debugging.

This may explain why our single file SOS tests hit the floor on osx-arm64.

@agocke
Member

agocke commented Sep 18, 2025

To be clear, I'm fine with blocking debugging by default for release bits as part of Mac's policy to use hardened runtime by default. What I don't want is for local builds to suddenly become undebuggable.

@steveisok
Member Author

Closing as we went with #119824 instead

@steveisok steveisok closed this Sep 25, 2025