-
Notifications
You must be signed in to change notification settings - Fork 11.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reland [asan][windows] Eliminate the static asan runtime on windows #107899
Reland [asan][windows] Eliminate the static asan runtime on windows #107899
Conversation
The profiling runtime is still built with /MT, as it does not work with /MD (and not well supported on windows) Co-authored-by: Amy Wishnousky <[email protected]>
Instead build only ONE asan runtime on windows. File Overview: * asan_malloc_win_thunk.cpp Contains interceptors for malloc/free for applications using the static CRT. These are intercepted with an oldnames style library that takes precedence over the CRT because of linker search order. This is used instead of the library interception routines used for other functions so that we can intercept mallocs that happen before our interceptors run. Some other CRT functions are also included here, because they are provided by the same objects as allocation functions in the debug CRT. * asan_win_common_runtime_thunk.cpp just the common init code for both static and dynamic CRTs * asan_win_static_runtime_thunk.cpp static CRT specific initialization * asan_win_dynamic_runtime_thunk.cpp dynamic crt initialization, most of the content that was here has been moved to the common runtime thunk * asan_win_dll_thunk.cpp This used to provide functionality to redirect calls from DLLs to asan instrumented functions in the main library, but this never worked that well and was a nightmare. It's gone now * sanitizer_common/sanitizer_common_interface.inc: The added functions are for the thunks to be able to delegate to the asan runtime DLL in order to override functions that live in the application executable at initialization. The ASAN dll can't do this itself because it doesn't have a way to get the addresses of these functions. * sanitizer_common/sanitizer_win_immortalize: This is just an implementation of call_once that doens't require the CRT or C++ stdlib. We need this because we need to do this sort of thing before the CRT has initialized. This infrastructure is kinda ugly, we're sorry. * sanitizer_common/sanitizer_win_interception.cpp: Provides the interface inside the sanitizer runtime DLL that instramented apps call to intercept stuff. * sanitizer_common/sanitizer_win_thunk_interception.cpp: this is the code to setup and run the static initializers and/or TLS initializers, implemented basically how any initializers are on windows, these ones are registered before all the CRT initializers. * sanitizer_common/sanitizer_win_thunk_interception.h INTERCEPT_LIBRARY_FUNCTION and REGISTER_WEAK_FUNCTION are the called initializers for each relevant function inside the instrumented binary. Note the optimizer is disabled for weak function registration routines because it detects that the two functions being compared have different names and deduces they must be the same, and no actual codegen for the if is required, causing an infinite loop. Clang does this in debug mode as well as release mode, and the cast to uintptr_t is required to suppress it in debug mode. Co-Authored-By: Amy Wishnousky <[email protected]>
Now that the static runtime is gone the required librares are different. Note that /MD or -D_DLL causes the dynamic runtime to get linked, this is a little gross but emulates the behavior of MSVC. "Real" msvc will link the debug build of asan if you specify /DEBUG or _DEBUG, but that's not really necessary and requires building multiple configurations from the same tree.
weak functions are registered after the asan runtime initializes. This means __asan_default_options isn't available during asan runtime initialization. Thus we split asan flag processing into two parts and register a callback for the second part that's executed after __asan_default_options is registered. Co-Authored-By: Amy Wishnousky <[email protected]>
ALLOCATION_FUNCTION_ATTRIBUTE wasn't used elsewhere, and was just one attribute, so abstracting it through a macro wasn't doing much good now that it's not conditional on runtime type. We're always in the dynamic runtime now so eliminate the preprocessor conditional. The new exported functions are the interface used by the intercepted malloc/free family in the instrumented binary to call the asan versions inside the dll runtime. Co-authored-by: Amy Wishnousky <[email protected]>
…runtime thunks and the fact we're "always" using the dynamic asan runtime. python formatting
… config After the windows static runtime is removed these static=static CRT and dynamic=dynamic CRT, both using the dynamic asan runtime.
This is required because now that there's only one asan runtime dll the dynamic/static CRT wholearchived runtime thunk "flavor" is determined by passing the /MD or /MDd options, or defining -D_DLL.
…which is requried for weak exports to function correctly
…ion, and asan. Keep the other sanitizers as /MT since they don't support OneDLL
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/128/builds/609 Here is the relevant piece of the build log for the reference
|
This PR seems to have broken ASAN in mingw configurations. The symptoms seem to be that the ASAN DLL just locks up, hard, when the process is loaded. (Or more precisely, has entered some infinite loop.) I tried attaching to such a hung process with windbg, and it's showing this:
And a backtrace that looks like this:
(Side note; the PR seems to have added a bunch of new files with CRLF newlines - it'd be nice to clean this up.) To repro the issue for yourself, you can do the following:
It's possible to reproduce the same by building and running the whole compiler-rt testsuite as well, but that's a bit trickier to set up for this configuration. Surprising side note; when I tried the repro procedure above, building compiler-rt with a slightly older Clang release, and 18.x version, the built ASAN seems to not hang. I'm not sure if this is a functional difference, or if it just so happens to link things slightly differently so whatever issue there is seems to just not happen. This commit is kinda complex, as it not only removes the static asan configuration (which never was involved in mingw use cases before), but I guess also sets things up so the dynamic asan can be used even when linking the CRT statically? This issue has caused my nightly builds to start failing: https://github.com/mstorsjo/llvm-mingw/actions/runs/10803005560 |
I think this is also somehow affecting asan tests on macos: https://green.lab.llvm.org/job/llvm.org/job/clang-stage1-RA/2034/ was able to reproduce locally |
previous passing output
failing output with this PR
|
@@ -136,7 +160,7 @@ append_list_if(MINGW "${MINGW_LIBRARIES}" ASAN_DYNAMIC_LIBS) | |||
add_compiler_rt_object_libraries(RTAsan_dynamic | |||
OS ${SANITIZER_COMMON_SUPPORTED_OS} | |||
ARCHS ${ASAN_SUPPORTED_ARCH} | |||
SOURCES ${ASAN_SOURCES} ${ASAN_CXX_SOURCES} | |||
SOURCES ${ASAN_SOURCES} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line seems to be the problem, reverting it fixes the tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be safe to revert, and I'll make a PR reverting it (and probably just commit it if it doesn't get any reviews before I go to sleep).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've opened #108329 which should revert this particular edit (which was arguably a driveby change anyway)
After investigating it is indeed mangled code due to the UB in register_weak_. I had to step through from the very beginning of process init, as it doesn't seem like the debuginfo generated by 00007ff6`90501440 4883ec28 sub rsp, 28h
00007ff6`90501444 488d0d752c0000 lea rcx, [hello!.refptr._newmode+0x38 (7ff6905040c0)]
00007ff6`9050144b 488d159e1f0000 lea rdx, [hello!__asan_default_options__dll (7ff6905033f0)]
00007ff6`90501452 e8e9060000 call hello!_ZN11__sanitizer13register_weakEPKcy (7ff690501b40)
00007ff6`90501457 89442424 mov dword ptr [rsp+24h], eax
00007ff6`9050145b 8b442424 mov eax, dword ptr [rsp+24h]
00007ff6`9050145f 4883c428 add rsp, 28h
00007ff6`90501463 c3 ret Note that there is no comparison, the call to __sanitizer::register_weak is made unconditionally, even though in this case the local function should be the same as the default implementation from the dll. Worse, it seems like it's not even loading the correct pointer for the local function, so it ends up "intercepting" some random memory! I've fixed this in #108327 |
the new/delete code was removed from RTAsan_dynamic in #107899, but that broke things on macos. This reverts the offending change.
…lvm#107899) This reapplies 8fa66c6 ([asan][windows] Eliminate the static asan runtime on windows) for a second time. That PR bounced off the tests because it caused failures in the other sanitizer runtimes, these have been fixed by only building interception, sanitizer_common, and asan with /MD, and continuing to build the rest of the runtimes with /MT. This does mean that any usage of the static ubsan/fuzzer/etc runtimes will mean you're mixing different runtime library linkages in the same app, the interception, sanitizer_common, and asan runtimes are designed for this, however it does result in some linker warnings. Additionally, it turns out when building in release-mode with LLVM_ENABLE_PDBs the build system forced /OPT:ICF. This totally breaks asan's "new" method of doing "weak" functions on windows, and so /OPT:NOICF was explicitly added to asan's link flags. --------- Co-authored-by: Amy Wishnousky <[email protected]>
the new/delete code was removed from RTAsan_dynamic in llvm#107899, but that broke things on macos. This reverts the offending change.
Windows doesn't have a static runtime after #107899.
With llvm/llvm-project#107899, we need a different set of libraries. Handle both the old and new list for now. Bug: 365980757, 343734021 Change-Id: I94e9e347698bed8040479bff91d25b40c801eeae Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5858358 Reviewed-by: Nico Weber <[email protected]> Commit-Queue: Arthur Eubanks <[email protected]> Reviewed-by: Hans Wennborg <[email protected]> Cr-Commit-Position: refs/heads/main@{#1355864}
After this change we're running into #34485 ("ASan strtol interceptor breaks errno on Windows"):
That's an old bug, but it wasn't a problem in Chromium until this runtime change. It sounds like Visual Studio users also hit this before: https://developercommunity.visualstudio.com/t/Cannot-detect-strtol-range-errors-with-A/10412869 Do you know what the fix for that was? |
(nfc) #107899 Added some files with CRLF line endings. Mixed line endings are somewhat gross, so I've converted them all to unix.
(nfc) llvm#107899 Added some files with CRLF line endings. Mixed line endings are somewhat gross, so I've converted them all to unix.
This fixes two problems with asan's interception of `strtol` on Windows: 1. In the dynamic runtime, the `strtol` interceptor calls out to ntdll's `strtol` to perform the string conversion. Unfortunately, that function doesn't set `errno`. This has been a long-standing problem (llvm#34485), but it was not an issue when using the static runtime. After the static runtime was removed recently (llvm#107899), the problem became more urgent. 2. A module linked against the static CRT will have a different instance of `errno` than the ASan runtime, since that's now always linked against the dynamic CRT. That means even if the ASan runtime sets `errno` correctly, the calling module will not see it. This patch fixes the first problem by making the `strtol` interceptor call out to `strtoll` instead, and do 32-bit range checks on the result. I can't think of any reasonable way to fix the second problem, so we should stop intercepting `strtol` in the static runtime thunk. I checked the list of functions in the thunk, and `strtol` and `strtoll` are the only ones that set `errno`. (`strtoll` was already missing, probably by mistake.)
This fixes two problems with asan's interception of `strtol` on Windows: 1. In the dynamic runtime, the `strtol` interceptor calls out to ntdll's `strtol` to perform the string conversion. Unfortunately, that function doesn't set `errno`. This has been a long-standing problem (#34485), but it was not an issue when using the static runtime. After the static runtime was removed recently (#107899), the problem became more urgent. 2. A module linked against the static CRT will have a different instance of `errno` than the ASan runtime, since that's now always linked against the dynamic CRT. That means even if the ASan runtime sets `errno` correctly, the calling module will not see it. This patch fixes the first problem by making the `strtol` interceptor call out to `strtoll` instead, and do 32-bit range checks on the result. I can't think of any reasonable way to fix the second problem, so we should stop intercepting `strtol` in the static runtime thunk. I checked the list of functions in the thunk, and `strtol` and `strtoll` are the only ones that set `errno`. (`strtoll` was already missing, probably by mistake.)
This fixes two problems with asan's interception of `strtol` on Windows: 1. In the dynamic runtime, the `strtol` interceptor calls out to ntdll's `strtol` to perform the string conversion. Unfortunately, that function doesn't set `errno`. This has been a long-standing problem (llvm#34485), but it was not an issue when using the static runtime. After the static runtime was removed recently (llvm#107899), the problem became more urgent. 2. A module linked against the static CRT will have a different instance of `errno` than the ASan runtime, since that's now always linked against the dynamic CRT. That means even if the ASan runtime sets `errno` correctly, the calling module will not see it. This patch fixes the first problem by making the `strtol` interceptor call out to `strtoll` instead, and do 32-bit range checks on the result. I can't think of any reasonable way to fix the second problem, so we should stop intercepting `strtol` in the static runtime thunk. I checked the list of functions in the thunk, and `strtol` and `strtoll` are the only ones that set `errno`. (`strtoll` was already missing, probably by mistake.)
This fixes two problems with asan's interception of `strtol` on Windows: 1. In the dynamic runtime, the `strtol` interceptor calls out to ntdll's `strtol` to perform the string conversion. Unfortunately, that function doesn't set `errno`. This has been a long-standing problem (llvm#34485), but it was not an issue when using the static runtime. After the static runtime was removed recently (llvm#107899), the problem became more urgent. 2. A module linked against the static CRT will have a different instance of `errno` than the ASan runtime, since that's now always linked against the dynamic CRT. That means even if the ASan runtime sets `errno` correctly, the calling module will not see it. This patch fixes the first problem by making the `strtol` interceptor call out to `strtoll` instead, and do 32-bit range checks on the result. I can't think of any reasonable way to fix the second problem, so we should stop intercepting `strtol` in the static runtime thunk. I checked the list of functions in the thunk, and `strtol` and `strtoll` are the only ones that set `errno`. (`strtoll` was already missing, probably by mistake.)
This reapplies 8fa66c6 ([asan][windows] Eliminate the static asan runtime on windows) for a second time.
That PR bounced off the tests because it caused failures in the other sanitizer runtimes, these have been fixed by only building interception, sanitizer_common, and asan with /MD, and continuing to build the rest of the runtimes with /MT. This does mean that any usage of the static ubsan/fuzzer/etc runtimes will mean you're mixing different runtime library linkages in the same app, the interception, sanitizer_common, and asan runtimes are designed for this, however it does result in some linker warnings.
Additionally, it turns out when building in release-mode with LLVM_ENABLE_PDBs the build system forced /OPT:ICF. This totally breaks asan's "new" method of doing "weak" functions on windows, and so /OPT:NOICF was explicitly added to asan's link flags.