Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove @fence from the language #11650

Closed
kprotty opened this issue May 13, 2022 · 4 comments · Fixed by #21585
Closed

remove @fence from the language #11650

kprotty opened this issue May 13, 2022 · 4 comments · Fixed by #21585
Labels
accepted This proposal is planned. breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@kprotty
Copy link
Member

kprotty commented May 13, 2022

So LLVM's ThreadSanitizer (TSan) is a tool for detecting race conditions. It detects races by rewriting memory operations to call their own functions for tracking purposes.

There's an issue in that TSan doesn't properly support atomic fences. To get around this, projects using fences have to either specialize around TSAN or move their algorithms to normal atomic operations (which can result in perf loss). In zig, we're currently doing the specialization approach but it begs to question whether this is the best way of going about it.

The issue extends past TSan for if (and when) Zig wants to provide its own tooling to detect race conditions. It may struggle a similar fate and that only means more specialization; which doesn't scale. I'm making this issue to discuss how to deal with this as, with stage2 and other backends approaching, it'll have to be dealt with eventually.

Extinguish

One option is to remove fences entirely. Fences are generally used to synchronize with atomic operations on other atomic variables without doing an operation themselves. There's an introspection problem here in that it doesn't communicate to the compiler exactly what atomic variables or operations its synchronizing with. If fence is removed, it means all synchronization relationships (at least through the language) for compiler tools become explicit.

Unfortunately, this probably wouldn't work. This would mean the language prevents you from generating optimal solutions compared to other languages like C, C++, and Rust (as fences are used for optimizaiton). Users would probably resort to using inline asm and basically specializing in reverse (if not TSan then asm fence). It's also mentioned that there could be algorithms in the wild which cannot be rewritten without fences (although I anecdotally haven't come across such).

Extend

Another option is to extend the @fence builtin to take an (optional) memory address to denote which atomic variable is synchronizes with. This solves the general issue where TSan barfs on fences, moves the specialization into the compiler, and keeps the fence builtin.

Downside is that we change (or remove in the case of extinguish) the fence's API which makes it differ from C11's. This difference would need to be documented but it's still something to document for people coming from other languages.

Embrace

Last option I can think of is to keep specializing under TSan for now. After all, it's the most common solution (used in Zig and Rust stdlib + other projects) and currently works with the most popular race detector. This is the base case and I wanted to reverse the order for the meme but it didn't flow right.


This topic came up from implementation concerns in std.atomic.Atomic where it was attempted to be removed. The driving reason was it containing code that should be in the compiler with secondary reasoning being that its multiple ways to do the same thing (stdlib vs builtin). Arguments against it generally fell into the "Atomic(T) highlights misuse bugs vs (T + builtin)" bucket and the "introduced useful operations (e.g. bitRmw) that aren't currently exposed by builtins" bucket.

Feel free to read up on comments in the PR linked above, but any more opinions are appreciated.

@andrewrk andrewrk added this to the 0.11.0 milestone May 13, 2022
@andrewrk andrewrk added breaking Implementing this issue could cause existing code to no longer compile or have different behavior. use case Describes a real use case that is difficult or impossible, but does not propose a solution. labels May 13, 2022
@matu3ba
Copy link
Contributor

matu3ba commented May 15, 2022

I dont think option "Extinguish" is viable, because the C11 memory model has several problems and is revisited see #6396 (comment).
A rewrite could not build on a "clear+futureproof design" and would rather be a TSAN port.
Upstream LLVM TSAN (compiler-rt/lib/tsan), as of 094fb13b88b36ecfa475cb877d2c6e9d90b4d1a5, 20220511 has ~26024 LOC C++ code from which 2329 LOC are tests.
(tokei -e tests compiler-rt/lib/tsan/ and tokei compiler-rt/lib/tsan/ installed with cargo install tokei)

For the same reason I suspect that extending things significantly might not be good as the research into models is very active (the linked paper in the comment describes it as very hard to follow due to activity).
1.Do you happen to know which paper is or are the reference ones that TSAN main developers are using as "intended semantics" or how they handle semantic problems?
2. Could moving things into the compiler prevent use cases like user extensions similar to how Valgrind allows Kernel memory/object/state tracking? Or how do you estimate the risk here?
3. How much runtime API surface for TSAN would be exposed to the user or will this only contain the compilation flag configurations?

@kprotty
Copy link
Member Author

kprotty commented May 16, 2022

It's not immediately clear why the "Extinguish" option isn't viable and how the C11 memory ordering problems affect it. One can still have C11 semantics without fences - they can still exist in formal reasoning without being exposed or used at the implementation level. This does not imply a rewrite of TSan, nor is said rewrite (if it were to happen) need to built on a "clear+futureproof design" I don't think.

The "Extend" option shouldn't be significant: it only makes the synchronizes-with relationships for fences more explicit. It also shouldn't change any existing happens-before reasoning making it unrelated to continuing research on memory models. Anecdotally, fence synchronization hasn't really been "very hard to follow".

  1. No. I also don't think their implementation has to be backed by a paper to correctly implement C11 race detection.. If there are semantic mapping issues with TSan, those would be great to know though.
  2. I'm not sure what you refer to by "user extensions" w.r.t. moving TSan workarounds into the compiler
  3. None, just like at the moment. There could be std.tsan.* for utils like flushing TSan memory but those aren't currently used by the stdlib or compiler.

@matu3ba
Copy link
Contributor

matu3ba commented Jun 4, 2022

I'm not sure what you refer to by "user extensions" w.r.t. moving TSan workarounds into the compiler

https://valgrind.org/docs/manual/drd-manual.html section 8.2.5. Client Requests and https://valgrind.org/docs/manual/hg-manual.html shows a pile of macros, which TSan might choose to support in one or the other form.

@andrewrk andrewrk modified the milestones: 0.11.0, 0.12.0 Jun 19, 2023
@andrewrk andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Aug 14, 2024
@andrewrk
Copy link
Member

Extinguish

@andrewrk andrewrk added accepted This proposal is planned. and removed use case Describes a real use case that is difficult or impossible, but does not propose a solution. labels Sep 24, 2024
@andrewrk andrewrk changed the title Handling atomic fences under ThreadSanitizer remove @fence from the language Sep 24, 2024
Ultra-Code pushed a commit to Ultra-Code/zig that referenced this issue Oct 8, 2024
richerfu pushed a commit to richerfu/zig that referenced this issue Oct 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted This proposal is planned. breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants