Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

valgrind BSD #67153

Closed
irevoire opened this issue Dec 8, 2019 · 24 comments
Closed

valgrind BSD #67153

irevoire opened this issue Dec 8, 2019 · 24 comments
Labels
A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows C-bug Category: This is a bug. O-freebsd Operating system: FreeBSD

Comments

@irevoire
Copy link

irevoire commented Dec 8, 2019

Hello,
I’m trying to profile a simple program on FreeBSD 11.2 with rustc 1.39.0 and valgrind 3.10.1.

I always get the following error:

thread '<unnamed>' panicked at 'failed to allocate a guard page', src/libstd/sys/unix/thread.rs:336:17
stack backtrace:
   0:           0x116af8 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hf24f776311cdf529
   1:           0x12eed0 - core::fmt::write::h01edf6dd68a42c9c
   2:           0x115174 - std::io::stdio::_print::hf62f3360234422f1
   3:           0x11870d - rust_oom
   4:           0x1183e3 - rust_oom
   5:           0x118d57 - std::panicking::rust_panic_with_hook::h9a662f58cf3f8ffe
   6:           0x118b36 - std::panicking::begin_panic_fmt::hb07c3a07faf6b5b3
   7:           0x119436 - std::rt::lang_start_internal::h7d3aa0ed326f9560
   8:           0x110dc3 -
   9:           0x110c6b -
  10:           0x110a25 -
  11:          0x4023000 - <unknown>
fatal runtime error: failed to initiate panic, error 94376704
error: valgrind command failed
@jonas-schievink jonas-schievink added A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows O-freebsd Operating system: FreeBSD C-bug Category: This is a bug. labels Dec 8, 2019
@Mark-Simulacrum
Copy link
Member

This is at least plausibly a valgrind bug? Could you try with a more recent version (I have 3.15 locally, at least)?

In any case I suspect that since it works outside of valgrind (I assume, given you didn't mention otherwise) that this would be a bug inside valgrind.

@irevoire
Copy link
Author

irevoire commented Dec 8, 2019

Yes it does works outside of valgrind, and valgrind is also working for other C program.

I don’t know how to get a more recent version valgrind, in the pre compiled package and in the port of FreeBSD the latest version of valgrind is 3.10.1.

And when I try to compile valgrind from source it does not work because at some point my cpu is not supported (amd64).

If you are also running under FreeBSD could you send me your valgrind binary?

@Mark-Simulacrum
Copy link
Member

I am not running under FreeBSD. I suspect this bug is unlikely to make progress as-is; it sounds like it's probably a valgrind bug (missing emulation or lack of support for guard pages, something along those lines).

@ranma42
Copy link
Contributor

ranma42 commented Dec 8, 2019

Would it be possible to disable guard pages to check if it makes a difference?
(with a compile-time option or an environment variable?)

@Mark-Simulacrum
Copy link
Member

I think no (they're sort of inherently protecting you, i.e., a safety feature). But I'm not sure. You could plausibly comment out the stack guard support in libstd, I guess, and recompile -- https://github.com/rust-lang/rust/blob/master/src/libstd/sys/unix/stack_overflow.rs

@nagisa
Copy link
Member

nagisa commented Dec 8, 2019

Guard pages are a prerequisite for ensuring soundness in case stack overflow occurs. C works, because it doesn’t do anything along the lines for you.

That being said, there seems to be more stuff going wrong here than just inability to map a page:

fatal runtime error: failed to initiate panic, error 94376704

Either way, this is more attributable to valgrind rather than Rust.

@asomers
Copy link
Contributor

asomers commented Feb 2, 2021

I opened a bug with Valgrind. We'll see what they say.
https://bugs.kde.org/show_bug.cgi?id=432440

@paulfloyd
Copy link

paulfloyd commented Feb 3, 2021

FreeBSD Valgrind maintainer here.

I'm not at all familiar with Rust. Is this code creating a stack for the main process or for a secondary thread?

With ktrace I'm seeing a lot of mmap'ing going on - much more than for 'grep'.

From what I understand for C/C++ applications, on startup mmap a MAP_GUARD region above the user stack, similar to what this code is doing but much larger and without needing to use mprotect. Valgrind is leaving space for this mapping, and I don't think that it likes client code mmap'ing into it.

@paulfloyd
Copy link

Can anyone comment on my previous question?

Is this happening on the main thread during program startup?
Why isn't rust using a MAP_GUARD region (designed for this purpose, or so it seems to me) rather than mmap'ing then calling mprotect?

@paulfloyd
Copy link

paulfloyd commented Mar 12, 2021

More questions:

What is Linux doing differently? Is it also mmap'ing a guard page? And if so, how does it determine the address to use?

My impression on FreeBSD is that the rust runtime is getting the user stack via the "kern.usrstack" sysctl. That's what I see with ktrace, but looking at the rust source I also see calls to pthread_attr_getstack.

The Valgrind code that allocates the user stack is here starting on line 1048. The FreeBSD code does the same, though obviously there are differences in the setup_client_stack function.

I've started doing some tests to see if I can intercept the sysctl for kern.usrstack to return a value that is correct for the guest.

@paulfloyd
Copy link

paulfloyd commented Mar 13, 2021

A few tests on Linux. For a C++ executable, when running standalone the user stack is around 0x7ffeffffff. When running under Valgrind 0x7ffeffff00 is where VG's stack is and the guests stack is around 0x1ffeffffff. With strace I see the following 4k mmaps (I'm assuming the stack guard is 4k).

standalone
paulf> strace ./hello 2>&1 | grep mmap | grep 4096
mmap(0x7fed1b45e000, 4096, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x4000) = 0x7fed1b45e000
mmap(0x7fed1b45f000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x4000) = 0x7fed1b45f000

That fits with what I see with the standalone C++ exe.

Under Valgrind

paulf> strace -ff /home/paulf/tools/valgrind/bin/valgrind -q ./hello 2>&1 | grep mmap | grep 4096
mmap(0x4000000, 4096, PROT_READ, MAP_PRIVATE|MAP_FIXED, 4, 0) = 0x4000000
mmap(0x402e000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, 0, 0) = 0x402e000
mmap(0x482f000, 4096, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1000) = 0x482f000
mmap(0x4830000, 4096, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x4830000

mmap(0x48b6000, 4096, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x4000) = 0x48b6000
mmap(0x48b7000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x4000) = 0x48b7000

This time I don't see an mmap near where the user stack is.

@asomers
Copy link
Contributor

asomers commented Mar 27, 2021

@paulfloyd I'm not highly familiar with the internals of the Rust compiler, but I can easily answer one of your questions:

  • Rust does not use MAP_GUARD because that's a FreeBSD-specific feature, and it's younger than Rust's stack overflow detection code. It wasn't added until FreeBSD 11.1. We could change it, but IIRC Rust currently tries to support back to FreeBSD 10.0 or something. I disagree with that, but it's not up to me to change it.

@paulfloyd
Copy link

Fair enough.

However, I still don't have answers to many of my questions.

What is being done differently on Linux?

How do you determine the address for this guard page on FreeBSD?

Without this information it is going to be difficult for me to progress.

@asomers
Copy link
Contributor

asomers commented Mar 29, 2021

@paulfloyd I did some work on the guard page stuff this weekend. It's not my area of expertise, but I'm making some progress in fixing stack overflow detection. Fingers crossed, that might fix Valgrind too.

@paulfloyd
Copy link

paulfloyd commented Mar 29, 2021 via email

@paulfloyd
Copy link

paulfloyd commented Mar 29, 2021

Quick test on Linux - seems OK.

paulf> ./sa
a addr 0x7ffd6fab4aec
stack addr 0x7ffd6f2b7000 stack size 8380416

paulf> valgrind -q ./sa
a addr 0x1ffefffe0c
stack addr 0x1ffe801000 stack size 8384512

The glibc code to get the pthread attributes is a lot more complicated.

There seems to be a general case where it reads thread->stackblock. Possibly this is the case for secondary threads. Anyway, this branch isn't taken for this small example. Indeed, that is what the comment for the else block says. So for the initial thread it reads /proc/self/maps and also uses the internal variable __libc_stack_end.

There's a lot of code in VG to handle /proc/self/maps but I'm not familiiar with how it works.

@paulfloyd
Copy link

paulfloyd commented Mar 29, 2021 via email

@paulfloyd
Copy link

paulfloyd commented Mar 29, 2021

diff --git a/coregrind/m_main.c b/coregrind/m_main.c
index f3a0d1c27..56f9c6cbf 100644
--- a/coregrind/m_main.c
+++ b/coregrind/m_main.c
@@ -3843,6 +3843,13 @@ UWord voucher_mach_msg_set ( UWord arg1 )
 #endif
 
 
+Word VG_(get_usrstack)(void)
+{
+   return VG_PGROUNDDN(the_iicii.clstack_end - the_iifii.clstack_max_size);
+}
+
+
+
 /*--------------------------------------------------------------------*/
 /*--- end                                                          ---*/
 /*--------------------------------------------------------------------*/
diff --git a/coregrind/m_syswrap/syswrap-freebsd.c b/coregrind/m_syswrap/syswrap-freebsd.c
index 318721f62..b9508a4d4 100644
--- a/coregrind/m_syswrap/syswrap-freebsd.c
+++ b/coregrind/m_syswrap/syswrap-freebsd.c
@@ -1983,6 +1983,19 @@ PRE(sys___sysctl)
       }
    }
 
+   if (SARG2 >= 2 && ML_(safe_to_deref)(name, 2*sizeof(int))) {
+      if (name[0] == 1 && name[1] == 33) {
+         // kern.userstack
+         Word tmp = VG_(get_usrstack)();
+         size_t* out = (size_t*)ARG3;
+         size_t* outlen = (size_t*)ARG4;
+         *out = tmp;
+         *outlen = sizeof(size_t);
+         SET_STATUS_Success(0);
+      }
+   }
+
+
    PRE_REG_READ6(int, "__sysctl", int *, name, vki_u_int32_t, namelen, void *, oldp,
                  vki_size_t *, oldlenp, void *, newp, vki_size_t, newlen);
 
diff --git a/coregrind/pub_core_aspacemgr.h b/coregrind/pub_core_aspacemgr.h
index 0f34782d3..cf25699ca 100644
--- a/coregrind/pub_core_aspacemgr.h
+++ b/coregrind/pub_core_aspacemgr.h
@@ -384,6 +384,9 @@ extern Bool VG_(am_search_for_new_segment)(Addr *start, SizeT *size,
                                            UInt *prot);
 #endif
 
+extern Word VG_(get_usrstack)(void);
+
+
 #endif   // __PUB_CORE_ASPACEMGR_H
 
 /*--------------------------------------------------------------------*/

@paulfloyd
Copy link

Can you confirm whether the above patch works OK? It doesn't break any of the Valgrind amd64 regression tests, and 'rg' and a hello world rust exe both seem to work.

@asomers
Copy link
Contributor

asomers commented Mar 31, 2021

@paulfloyd I'll probably have time to test it this weekend.

@asomers
Copy link
Contributor

asomers commented Apr 7, 2021

@paulfloyd Your patch works for me. HOWEVER, even without your patch, valgrind works for me on executables built with the latest Rust compiler (rustc 1.53.0-nightly (d32238532 2021-04-05)). And that's true both for the valgrind in your GH repo and the valgrind-devel in ports. I think the problem is probably fixed by #83771 .

@paulfloyd
Copy link

I've integrated my fix to Valgrind as well.

@asomers
Copy link
Contributor

asomers commented Apr 7, 2021

Now that I see the full version of Paul's fix I understand why #83771 "fixed" the problem. After #83771 , Rustc no longer tries to add a guard page on FreeBSD. However, it still needs to know the address of the stack for the purposes of stack overflow detection. So without Paul's fix in paulfloyd/freebsd_valgrind@5923237 stack overflow detection won't work.

@jonas-schievink we can close this issue now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows C-bug Category: This is a bug. O-freebsd Operating system: FreeBSD
Projects
None yet
Development

No branches or pull requests

7 participants