Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

setjmp on linux mangles ebp leading to early collection #10625

Open
arnetheduck opened this issue Feb 10, 2019 · 7 comments
Open

setjmp on linux mangles ebp leading to early collection #10625

arnetheduck opened this issue Feb 10, 2019 · 7 comments

Comments

@arnetheduck
Copy link
Contributor

d503368

when the GC calls setjmp, the assumption is that it will put register values on the stack. some registers however, notably RBP, are mangled so as to protect against stack overwrite attacks. This leads to trouble, because mark-and-sweep will miss pointers and prematurely reclaim memory.

RBP is used as a general-purpose register whenever frame pointers are omitted: https://stackoverflow.com/questions/41912684/what-is-the-purpose-of-the-rbp-register-in-x86-64-assembler

Generally, -fomit-frame-pointer is enabled at all optimization levels, in gcc, meaning that even though the test suite might be passing, release compiles will be more likely to crash randomly.

https://github.com/lattera/glibc/blob/895ef79e04a953cac1493863bcae29ad85657ee1/sysdeps/x86_64/setjmp.S#L33 shows how RBP is mangled.

@arnetheduck
Copy link
Contributor Author

One workaround for the rbp issue is to force frame pointers - this removes RBP from the set of general-purpose registers available for optimization..

@Araq
Copy link
Member

Araq commented Feb 10, 2019

I'd prefer to use inline assembler for the "store registers on the stack" operation. We can base the implementations on setjmp's implementation. Hey, we only support about 18 different CPU archs, cannot be hard, right?

@arnetheduck
Copy link
Contributor Author

well, yeah, that would work as well, but strikes me as the less attractive option, at least to begin with - one could start with no-omit-fp in the standard options and then selectively use asm if it turns out to be beneficial for real..

I kind of doubt the impact is that great on x86_64 for example - it has what.. 16 registers already, so one more or less isn't that much of a difference (and probably dwarfed by other codegen bloat, like casts and stack reads/writes due to the many temporaries generated in general).

In addition, on x86 and arm, gcc has the -momit-leaf-frame-pointer option which should be safe and lessen the impace.

@jlokier
Copy link

jlokier commented Apr 14, 2021

This GC bug occurs on MacOS as well. Not sure which version at started at, but disassembling MacOS 10.15's _setjmp shows the same mangling as Glibc:

libsystem_platform.dylib`_setjmp:
->  0x7fff72f4cc9c <+0>:  movq   %rbx, (%rdi)
    0x7fff72f4cc9f <+3>:  movq   %rbp, %rax
    0x7fff72f4cca2 <+6>:  xorq   %gs:0x38, %rax    <- Mangle %rbp
    0x7fff72f4ccab <+15>: movq   %rax, 0x8(%rdi)
    0x7fff72f4ccaf <+19>: movq   %r12, 0x18(%rdi)
    0x7fff72f4ccb3 <+23>: movq   %r13, 0x20(%rdi)
    0x7fff72f4ccb7 <+27>: movq   %r14, 0x28(%rdi)
    0x7fff72f4ccbb <+31>: movq   %r15, 0x30(%rdi)
    0x7fff72f4ccbf <+35>: movq   (%rsp), %rax
    0x7fff72f4ccc3 <+39>: xorq   %gs:0x38, %rax    <- Mangle %rip
    0x7fff72f4cccc <+48>: movq   %rax, 0x38(%rdi)
    0x7fff72f4ccd0 <+52>: leaq   0x8(%rsp), %rax
    0x7fff72f4ccd5 <+57>: xorq   %gs:0x38, %rax    <- Mangle %rsp
    0x7fff72f4ccde <+66>: movq   %rax, 0x10(%rdi)
    0x7fff72f4cce2 <+70>: fnstcw 0x4c(%rdi)
    0x7fff72f4cce5 <+73>: stmxcsr 0x48(%rdi)
    0x7fff72f4cce9 <+77>: xorl   %eax, %eax
    0x7fff72f4cceb <+79>: retq
    ```

@jlokier
Copy link

jlokier commented Apr 14, 2021

On embedded Linux, Eglibc does the same as Glibc, as noticed by CVE-2013-4788 - Eglibc PTR MANGLE vulnerability.

I looked into using getcontext instead of setjmp, as done for SIOD in Fix the bug of garbage collection of siod. There are a number of interesting comments there. One suggests getcontext because it collects more registers, but that seems like guesswork.

In the SIOD discussion, a proposed patch checks for linux to use getcontext instead of setjmp, but as I found when I looked at MacOS, MacOS also needs the fix. getcontext is not a good choice on MacOS because it's very deprecated; it is also gone from POSIX now.

Another proposed patch (at the SIOD link above) checks for x86_64 and explicitly saves just %rbp. It's a lovely one-liner, that I'd consider. It's nice but has a bug: Glibc mangles the frame pointer on multiple architectures, not just x86_64. For example, ARM. Any target which defines the PTR_MANGLE macro in Glibc.

For x86, there actually is register-saving assembly in Nim already at vendor/nimbus-build-system/vendor/Nim/lib/arch/x86/{i386,amd64}.S. I'm not sure how much to trust that code for something this critical though: Something rarely observed but bad in the most subtle ways when it occurs. Also ARM isn't handled.

I don't think relying on register-saving assembly is a great idea, though. It is possible a compiler might store pointers in additional CPU registers that are part of a CPU extension, and not saved by any particular home-grown function. Again, because it's such a critical function for the GC, and missing something would lead to a very subtle bug.

That circles back to -fno-omit-frame-pointer with -momit-leaf-frame-pointer being a good solution as proposed by @arnetheduck.

@edubart
Copy link
Contributor

edubart commented Sep 14, 2021

I've hit this bug when working on my GC implementation using gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1), and then found this issue occasionally when researching for the problem, I've used the __builtin_frame_address available in GCC/Clang to get RBP address and scan it, using this builtin could fix the issue in GCC/Clang at least and looks like it should work with multiple architectures.

@Varriount
Copy link
Contributor

@edubart That sounds like a good solution! Just make sure to guard against old versions of GCC/Clang that may not have that intrinisc (if it's relatively recent) and situations where the backend compiler is VCC

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants