Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

glibc 2.27 stack guard pages are moving #47863

Closed
cuviper opened this issue Jan 30, 2018 · 5 comments · Fixed by #47912
Closed

glibc 2.27 stack guard pages are moving #47863

cuviper opened this issue Jan 30, 2018 · 5 comments · Fixed by #47912
Labels
A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows C-bug Category: This is a bug. O-linux Operating system: Linux T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@cuviper
Copy link
Member

cuviper commented Jan 30, 2018

From bug 22637, glibc 2.27 will be allocating the stack guard page for new threads just past the end of the stack, rather than within it. This has also reached Fedora 27 from rhbz 1527887.

With these versions of glibc, the location calculated in guard::current() will be incorrect. This causes run-pass/out-of-stack to fail, as our SIGSEGV handler thinks the fault is something other than a guard page. (rather than identifying it as a stack overflow and using rtabort!)

We could get out of this guessing game by allocating the thread stack and guard page ourselves. If there's no objection, I will attempt this approach. Other ideas are welcome!

(note: this doesn't affect the main thread, as that stack and guard page are managed by the kernel.)

@cuviper cuviper added A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows O-linux Operating system: Linux T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-bug Category: This is a bug. labels Jan 30, 2018
@nagisa
Copy link
Member

nagisa commented Jan 30, 2018

We should not change how one of the core tasks are done (stack allocation) for the purpose of giving users nicer diagnostics when the stack is overflowed (what the guard::current is used for IIRC).

If it is possible to adjust the guard::current logic to return a correct value – great. Alternatively, we should just live with the notion that this check may be false-negative.

@cuviper
Copy link
Member Author

cuviper commented Jan 30, 2018

I'm having trouble seeing how to cleanly free a manual stack allocation anyway -- when the thread may be detached from where it was spawned, no one but libpthread really knows about it anymore.

I was thinking of trying something like pthread_attr_setguardsize(attr, 0), leaving libpthread to allocate the stack, and then alter the last page with mprotect(PROT_NONE) ourselves. But that too may be fragile, if libpthread tries to do anything more than munmap with that area on cleanup, when it doesn't know it's protected there.

I'll take another look at guard::current. One thought is to have this return an address range, rather than just an edge, and then for Linux targets we can cover both within and just past the end of stack. Worst case is that this would misdiagnose an unrelated segv that just happened to be right below the thread stack.

@cuviper
Copy link
Member Author

cuviper commented Jan 30, 2018

BTW, android is ignored in this test due to #20004. #41618 tried to enable it again, and the log has the same assertion failed: `(left == right)` (left: `Some(11)`, right: `Some(6)`), which is SIGSEGV vs. SIGABRT. Android probably puts the guard below the stack, as that's really the proper way to do it.

Actually musl is ignored too, and probably for the same reason. The glibc-focused assumption that the guard was within the stack is likely wrong for any other libc. (and now glibc is finally fixing this.)

@alexcrichton
Copy link
Member

@cuviper it seems reasonable to me to print out a message for anything within a range of the predicted guard page, if a fault happens and it wasn't actually a stack overflow, yet we printed, it's probably not the end of the world

@cuviper
Copy link
Member Author

cuviper commented Jan 31, 2018

Great, I have a patch that does just that. 😀

I'll clean it up and send a PR tomorrow.

cuviper added a commit to cuviper/rust that referenced this issue Jan 31, 2018
Previously, the `guard::init()` and `guard::current()` functions were
returning a `usize` address representing the top of the stack guard,
respectively for the main thread and for spawned threads.  The `SIGSEGV`
handler on `unix` targets checked if a fault was within one page below
that address, if so reporting it as a stack overflow.

Now `unix` targets report a `Range<usize>` representing the guard
memory, so it can cover arbitrary guard sizes.  Non-`unix` targets which
always return `None` for guards now do so with `Option<!>`, so they
don't pay any overhead.

For `linux-gnu` in particular, the previous guard upper-bound was
`stackaddr + guardsize`, as the protected memory was *inside* the stack.
This was a glibc bug, and starting from 2.27 they are moving the guard
*past* the end of the stack.  However, there's no simple way for us to
know where the guard page actually lies, so now we declare it as the
whole range of `stackaddr ± guardsize`, and any fault therein will be
called a stack overflow.  This fixes rust-lang#47863.
kennytm added a commit to kennytm/rust that referenced this issue Feb 3, 2018
…ichton

Use a range to identify SIGSEGV in stack guards

Previously, the `guard::init()` and `guard::current()` functions were
returning a `usize` address representing the top of the stack guard,
respectively for the main thread and for spawned threads.  The `SIGSEGV`
handler on `unix` targets checked if a fault was within one page below that
address, if so reporting it as a stack overflow.

Now `unix` targets report a `Range<usize>` representing the guard memory,
so it can cover arbitrary guard sizes.  Non-`unix` targets which always
return `None` for guards now do so with `Option<!>`, so they don't pay any
overhead.

For `linux-gnu` in particular, the previous guard upper-bound was
`stackaddr + guardsize`, as the protected memory was *inside* the stack.
This was a glibc bug, and starting from 2.27 they are moving the guard
*past* the end of the stack.  However, there's no simple way for us to know
where the guard page actually lies, so now we declare it as the whole range
of `stackaddr ± guardsize`, and any fault therein will be called a stack
overflow.  This fixes rust-lang#47863.
kennytm added a commit to kennytm/rust that referenced this issue Feb 4, 2018
…ichton

Use a range to identify SIGSEGV in stack guards

Previously, the `guard::init()` and `guard::current()` functions were
returning a `usize` address representing the top of the stack guard,
respectively for the main thread and for spawned threads.  The `SIGSEGV`
handler on `unix` targets checked if a fault was within one page below that
address, if so reporting it as a stack overflow.

Now `unix` targets report a `Range<usize>` representing the guard memory,
so it can cover arbitrary guard sizes.  Non-`unix` targets which always
return `None` for guards now do so with `Option<!>`, so they don't pay any
overhead.

For `linux-gnu` in particular, the previous guard upper-bound was
`stackaddr + guardsize`, as the protected memory was *inside* the stack.
This was a glibc bug, and starting from 2.27 they are moving the guard
*past* the end of the stack.  However, there's no simple way for us to know
where the guard page actually lies, so now we declare it as the whole range
of `stackaddr ± guardsize`, and any fault therein will be called a stack
overflow.  This fixes rust-lang#47863.
kennytm added a commit to kennytm/rust that referenced this issue Feb 4, 2018
…ichton

Use a range to identify SIGSEGV in stack guards

Previously, the `guard::init()` and `guard::current()` functions were
returning a `usize` address representing the top of the stack guard,
respectively for the main thread and for spawned threads.  The `SIGSEGV`
handler on `unix` targets checked if a fault was within one page below that
address, if so reporting it as a stack overflow.

Now `unix` targets report a `Range<usize>` representing the guard memory,
so it can cover arbitrary guard sizes.  Non-`unix` targets which always
return `None` for guards now do so with `Option<!>`, so they don't pay any
overhead.

For `linux-gnu` in particular, the previous guard upper-bound was
`stackaddr + guardsize`, as the protected memory was *inside* the stack.
This was a glibc bug, and starting from 2.27 they are moving the guard
*past* the end of the stack.  However, there's no simple way for us to know
where the guard page actually lies, so now we declare it as the whole range
of `stackaddr ± guardsize`, and any fault therein will be called a stack
overflow.  This fixes rust-lang#47863.
ojab pushed a commit to ojab/BLFS that referenced this issue May 24, 2018
…s will be fixed in rustc-1.25.0, rust-lang/rust#47863.

git-svn-id: svn://svn.linuxfromscratch.org/BLFS/trunk@19880 af4574ff-66df-0310-9fd7-8a98e5e911e0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows C-bug Category: This is a bug. O-linux Operating system: Linux T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants