runtime, x/net/internal/socket: thread sanitizer failing on ppc64le #35547
Comments
Once again, I am not able to reproduce this failure. Maybe I'm not building and running this the way the buildlet does. I checked out the latest golang.org/x/net and rebuilt it with upstream Go. The tests all passed on my Debian buster POWER9 machine. I see the test says it is testing in module mode; I'm not exactly sure what that means. Is there documentation on how to run the tests for all the golang.org/x repos? I did copy the old go.mod file it was using and set up GOMOD to point to it, and it still passed. |
I hate unreproducible bugs :( Try checking out Go and x/net from the two hashes listed in that error log. I don't think this failure would have anything to do with modules. Maybe just try writing a helloworld race detector example and see if it works? |
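A hello-world race detector check along those lines could look like the sketch below. It is only an illustration: the function name `racyCounter` is made up here, and the race is deliberate. Built normally it just prints a count; run with `go run -race`, a working race detector should print a `WARNING: DATA RACE` report for the unsynchronized increment.

```go
package main

import (
	"fmt"
	"sync"
)

// racyCounter (hypothetical name) increments a shared variable from two
// goroutines without any synchronization. This is a deliberate data race,
// so the returned value may be 1 or 2.
func racyCounter() int {
	counter := 0
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			counter++ // unsynchronized read-modify-write on shared state
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println("counter:", racyCounter())
}
```

If the thread sanitizer runtime cannot start at all on the machine, the failure should show up before the program prints anything, which makes this a quick way to separate "tsan is broken here" from "this particular test is broken".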
I ran all the race tests I know about in Go. They all passed on my power8 and power9 machines. In runtime/race:
In golang.org/x/net at the commit from the failure:
I do not have gomote access. |
If I look at the code from LLVM from the error output I see this:
So I believe that error message is saying the test is supposed to be able to set the personality to ADDR_NO_RANDOMIZE but that failed. |
Can someone disable ASLR in the ppc64le builders (i.e. |
Interesting. So that check is failing, so it fails to re-exec itself (which presumably would work?). I don't understand what that check is trying to achieve - is it just that -1 is a special case meaning "failure"? Anyone have the docs to the Instead of fixing the reexec in the race detector, we could set ADDR_NO_RANDOMIZE on the binary that |
Look up the documentation for the Linux personality syscall (you can google it). In this case the value of ADDR_NO_RANDOMIZE is not a setting on the binary but a setting for the personality of the process. If the syscall returns -1 (with errno set, for example to EINVAL), the kernel could not change the personality. To me it looks like the code goes on to ReExec regardless of the syscall result. The comments say the race detector doesn't work with ASLR enabled (due to unexpected address ranges), and this test is supposed to detect a race condition but does not, so it fails. I don't know why it would fail to change the personality in this case. It must work in some cases, because this code is executed whenever the race detector is used on ppc64le, and distros have had ASLR enabled for a while. Having the errno would help. |
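To make the personality discussion concrete, here is a rough Go sketch of the same kind of call. It is not the thread sanitizer's code: the helper names (`currentPersonality`, `disableASLR`, `hasNoRandomize`) are invented for illustration, the two constants are copied from Linux's `<linux/personality.h>` and the personality(2) man page, and it only builds on Linux. It prints the errno when the kernel (or a seccomp filter) refuses the change.

```go
package main

import (
	"fmt"
	"syscall"
)

const (
	// addrNoRandomize mirrors ADDR_NO_RANDOMIZE from <linux/personality.h>.
	addrNoRandomize = 0x0040000
	// queryPersona asks personality(2) to return the current value
	// without changing it.
	queryPersona = 0xffffffff
)

// hasNoRandomize reports whether the ADDR_NO_RANDOMIZE bit is set.
func hasNoRandomize(persona uintptr) bool {
	return persona&addrNoRandomize != 0
}

// currentPersonality returns the calling process's personality flags.
func currentPersonality() (uintptr, error) {
	r, _, errno := syscall.Syscall(syscall.SYS_PERSONALITY, queryPersona, 0, 0)
	if errno != 0 {
		return 0, errno
	}
	return r, nil
}

// disableASLR sets ADDR_NO_RANDOMIZE on the process personality. The new
// address layout only applies after the process re-executes itself.
func disableASLR() error {
	old, err := currentPersonality()
	if err != nil {
		return err
	}
	if hasNoRandomize(old) {
		return nil // already disabled
	}
	_, _, errno := syscall.Syscall(syscall.SYS_PERSONALITY, old|addrNoRandomize, 0, 0)
	if errno != 0 {
		return errno
	}
	return nil
}

func main() {
	if err := disableASLR(); err != nil {
		fmt.Println("personality(ADDR_NO_RANDOMIZE) failed:", err)
		return
	}
	fmt.Println("ADDR_NO_RANDOMIZE set; a re-exec would now run without ASLR")
}
```

Running something like this on the affected builder would report the errno that the error log does not show.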
OK, I think I found the issue. When a Docker container is set up, a default seccomp security profile is applied that restricts which syscalls the container may make. The personality syscall is among those restricted. There is a way to define the container so it allows this syscall. |
Can I get gomote access so I can try to understand this problem a bit more? It is curious that some race tests work fine (I think they must all make the personality syscall) but this one does not. Also, I am not able to reproduce it. |
Sorry for the delay in getting back to you @laboger. I wanted to update you that we're not able to give out gomote access at this time, unfortunately. We're still working on fixing that, but it will take more time. I realize it's hard to make progress on this issue because it doesn't reproduce easily outside the ppc64le builder, and I'll try to look into what else we can do in the meantime to help make investigating this easier. I just wanted to share this update for now. |
Since my last post I was able to reproduce this in a container. I have confirmed the theory that was mentioned in my post on Nov. 18. When running LLVM's thread sanitizer code (used by Go's race detector) on a system with ASLR enabled, the address mappings needed for the thread sanitizer are not always usable on ppc64le, so it calls the personality syscall to disable ASLR. Outside of a container this works fine, but within a container that is running with the default seccomp profile, it fails as in this testcase, because the default seccomp profile restricts the use of the personality syscall. One solution would be to have an alternate security profile which allows the personality syscall and then use that profile for the container where the testing is done. If you don't care about the seccomp profile for the container, then the container can be run using --security-opt seccomp=unconfined. The test works if that option is used. This does not fail on amd64 because it apparently doesn't have the same problem with address ranges. The same call is needed on arm64, but I don't think the race detector works for Go there yet. |
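As a small illustration of the workaround, the sketch below builds the `docker run` argument list with the `--security-opt seccomp=unconfined` option mentioned above. The helper name `raceTestDockerArgs`, the image name, and the `go test -race ./...` command line are placeholders chosen for the example, not what the builders actually run.

```go
package main

import (
	"fmt"
	"strings"
)

// raceTestDockerArgs (hypothetical helper) returns the arguments for a
// `docker run` invocation that disables the default seccomp profile so
// the personality syscall is not blocked inside the container.
func raceTestDockerArgs(image string) []string {
	return []string{
		"run", "--rm",
		"--security-opt", "seccomp=unconfined",
		image,
		"go", "test", "-race", "./...",
	}
}

func main() {
	// "example/ppc64le-go-builder" is a made-up image name.
	args := raceTestDockerArgs("example/ppc64le-go-builder")
	fmt.Println("docker", strings.Join(args, " "))
}
```

Note that `seccomp=unconfined` turns off syscall filtering for the whole container; an alternate profile that only additionally allows personality would be the narrower fix.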
Thanks for figuring that out @laboger ! It sounds to me like there's nothing Go can do to fix or work around this (is that true?). If so, maybe we can add something to the documentation and resolve this bug? |
@aclements, it sounds to me like we need to add |
Yes, I think that would be the easiest for now. There could be something documented to state that the use of the race detector on ppc64le within a container requires this option when starting the container. |
Change https://golang.org/cl/214919 mentions this issue: |
…iners

This fixes race tests; the thread sanitizer needs to check its personality, which seccomp defaults prevent apparently.

Updates golang/go#35547 (needs to be deployed first, then bug can be closed)

Change-Id: I8b87618f63ef2b7a75b72290098c09bf04298d86
Reviewed-on: https://go-review.googlesource.com/c/build/+/214919
Reviewed-by: Alexander Rakoczy <[email protected]>
Run-TryBot: Alexander Rakoczy <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
I've redeployed |
Change https://golang.org/cl/214979 mentions this issue: |
The thread sanitizer failure was fixed on ppc64le with CL 214919. |
The discussion above suggests that this issue is fixed, so closing. Please comment if you disagree. |
Thread sanitizer is failing on linux-ppc64le-buildlet and linux-ppc64le-power9osu
The test is from the x/net repository, but I'm not sure that matters much.
From https://build.golang.org/log/7ba95758a5a8b138b85ba4a9f87a543c3177e884 :
Some internal check in tsan is failing, causing this test to fail.
We claim to support the race detector in this config, so this test should pass.
@laboger