Remove unnecessary explicit memory barriers on ARM#14567
Merged
straight-shoota merged 1 commit intocrystal-lang:masterfrom May 7, 2024
Merged
Conversation
Reading the LLVM atomics documentation, the notes for code generation
state that weak architectures should generate memory barriers, which
implies that all memory orders, except for relaxed, should always
generate the necessary memory barriers when asked to.
Compiling and disassembling programs for ARMv6 and ARMv7 (you must
specify a CPU or CPU feature to target these), we can indeed notice that
memory barriers are properly emitted by the LLVM code generation. For
example the assembly contains `dmb ish` for ARMv7 and
`mcr 15, 0, r1, cr7, cr10, {5}` for ARMv6.
Compiling for older ARMv4 or ARMv5 the atomics call the `__sync_*`
symbols from `libgcc` where memory ordering is a noop (not supported).
This patch thus removes the explicit memory barriers as they duplicate
what LLVM backends are already doing anyway.
Collaborator
Author
|
This time it should really be the last time we deal with this topic 😅 The initial error was that we used a lazy set to clear atomics when we should have used explicit memory orders. Only on x86, in very specific scenarios, can we get away with an lazy set. |
straight-shoota
approved these changes
May 6, 2024
1 task
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Reading the LLVM atomics documentation, the notes for code generation state that weak architectures should generate memory barriers, which implies that all memory orders, except for relaxed, should always generate the necessary memory barriers when asked to.
Compiling and disassembling programs for ARMv6 and ARMv7 (you must specify a CPU or CPU feature to target these, otherwise LLVM defaults to ARMv4 or ARMv5), we can indeed notice that memory barriers are properly emitted by the LLVM code generation. For example the assembly contains
dmb ishfor ARMv7 andmcr 15, 0, r1, cr7, cr10, {5}for ARMv6.Compiling for older ARMv4 or ARMv5 the atomics call the
__sync_*symbols fromlibgccwhere memory ordering is a noop (not supported by the micro architecture).This patch thus removes the explicit memory barriers. They duplicate what LLVM backends are already doing (optimized away in some cases, but there may be edge cases where it's not), and this noticeably simplifies our code.