Skip to content
This repository has been archived by the owner on May 7, 2024. It is now read-only.

Consider open-coding subword atomics or move to libgcc #12

Closed
sorear opened this issue Oct 26, 2016 · 21 comments
Closed

Consider open-coding subword atomics or move to libgcc #12

sorear opened this issue Oct 26, 2016 · 21 comments

Comments

@sorear
Copy link

sorear commented Oct 26, 2016

The problem

Code which uses __atomic_xxx builtins or C11 atomics requires to be linked with -latomic on RISC-V, but not on other common architectures (including but not limited to ARM, x86).

Example code:

unsigned char ach;
int main() {
  __atomic_fetch_add(&ach,1,__ATOMIC_RELAXED);
  return 0;
}

Testing with the tool at gcc.godbolt.org:

  • x86, x86_64, ARM, and ARM64 have native support for byte atomics
  • PowerPC, MIPS, MIPS64 open-codes a full word LL/increment one byte/SC loop
  • MSP430 generates a call to an external helper function
  • the AVR compiler is too old to have __atomic_fetch_add

Option 1: Use existing libgcc helper

Modify gcc to generate a call to __sync_fetch_and_add_1 instead of __atomic_fetch_add_1, and similarly for other size-1 and size-2 atomics.

Advantages: code size, avoids polluting the libgcc ABI with weirdness

Option 2: Move helpers from libatomic to libgcc

Make libgcc provide a subset of the libatomic names.

Advantages: Probably the easiest in the short term

Disadvantages: Creates an ABI mess that we'll probably regret later, especially for things like compiler-rt (LLVM's compatible reimplementation of libgcc)

Option 3: Inline code generation

As MIPS/MIPS64/PowerPC are already doing.

Advantages: sidesteps ABI issues entirely

Option 4: Add --as-needed -latomic to the spec file

Arguably gcc should be doing this on all platforms, _Atomic(char[32]) is in C11 and requiring a non-POSIX -l flag to use standard C features is quite dodgy.

Disadvantages: Depletes the RISC-V weirdness budget

Option 5: Do nothing and change the build scripts of affected software

So far this includes QEMU and the Boehm GC. I haven't analyzed all of the Fedora build failures so it might have caused others.

Disadvantages: Depletes the RISC-V weirdness budget much more since it affects non-compiler software

@kito-cheng
Copy link
Collaborator

Option 3 + 1

@aswaterman
Copy link
Contributor

  • Option 1 seems pretty reasonable. It's theoretically less performant, since the __sync variants don't support the relaxed memory models, but in practice it's probably irrelevant for these LR/SC sequences.
  • I dislike Option 2 for the reason you mentioned.
  • Option 3 sounds more like an optimization you'd enable only for -O3, as it will bloat the code--especially for ADD/SWAP/MIN/MAX, which need to be turned into lengthy LR/SC sequences. For AND/OR/XOR, it's not so bad.
  • Option 4 seems pretty reasonable. For few users does it deplete the weirdness budget.
  • I dislike Option 5 for the reason you mentioned.

@aswaterman
Copy link
Contributor

I lean towards Option 4. I don't see why libatomic exists at all if it's considered weird to link against it.

@sorear
Copy link
Author

sorear commented Oct 26, 2016

@aswaterman I'm only calling option 4 "weird" because changing the default linker command line is something that GCC should do across the board, not one architecture at a time.

(libatomic exists (AIUI) primarily to support double-word and wider atomics using the ifunc mechanism. When the T extension is finalized (or a DW-CAS extension, potentially), libatomic can be modified to use it for __atomic_compare_exchange_16. That way you can write an algorithm which uses DW-CAS, have it run on DW-CAS capable hardware, and also run (with lower concurrency due to contention the lock table) on hardware without DW-CAS.

Dynamically linking libatomic is required for ifuncs to work, and also allows entirely new instructions to be used without recompiling programs. libgcc, by contrast, is statically linked by default.

For 1, 2 byte atomics the fallback code is already lock free so being able to replace it at runtime is much less valuable. If you had a chip with byte/word AMOs, the savings from using them would probably be of the same order as the PLT call overhead.)

@aswaterman
Copy link
Contributor

@sorear understood. I don't have much of a problem being an avant garde linker-invoker, but it does seem this should eventually change globally. Until then, would you mind submitting a PR implementing option 4?

@sorear
Copy link
Author

sorear commented Oct 26, 2016

@aswaterman If you don't think 4 will cause upstreaming problems ("Why did you change that? The link order is supposed to be the same everywhere"), then I'm fine with it. Can do a PR shortly. (Option 3 might still make sense at -O3, at least because it clobbers fewer registers than a shared library function call; for another time.)

@aswaterman
Copy link
Contributor

@sorear hard to know. Maybe it will nudge the maintainers towards an appropriate global fix. Regardless, I'm inclined to cross that bridge when we get there. If the maintainers object, we can fall back to option 1. (I concur w.r.t. Option 3.)

@sorear
Copy link
Author

sorear commented Oct 27, 2016

#15 is another bit of related weirdness which does not seem to materially affect software. I think options 1 and 3 would resolve #15, the others wouldn't.

aswaterman added a commit that referenced this issue Feb 1, 2017
@aswaterman
Copy link
Contributor

2c4857d implements option 4.

@rishikhan
Copy link

where is this libatomic? I don't see it in /opt/riscv on my system after installing riscv-gnu-toolchain.

@jim-wilson
Copy link
Collaborator

libatomic requires some OS support. It is built for linux targets. It is not built for bare metal embedded targets.

@rishikhan
Copy link

rishikhan commented Dec 18, 2017 via email

@jim-wilson
Copy link
Collaborator

The posix support uses pthread_mutex_lock and pthread_mutex_unlock. The mingw (Windows) support uses CreateMutex, WaitForSingleObject, and ReleaseMutex. It would be possible to handle locks other ways. No one has written code to do that.

I would expect that the simplest atomic operations get expanded inline, and you only call libatomic for operations that can't easily be open coded.

@rishikhan
Copy link

rishikhan commented Dec 18, 2017 via email

@sorear
Copy link
Author

sorear commented Feb 10, 2018

@rishikhan
Copy link

rishikhan commented Feb 10, 2018 via email

@jyknight
Copy link

jyknight commented Oct 16, 2019

The intent of libatomic has ben that for a given object size/alignment, the compiler should either call libatomic's _atomic* for all the operations, or call it for no operations, rather than mix-and-match per operation.

Concretely, the compiler should either call __atomic_* functions for all the 1-byte atomic operations (load, fetch-add, store, etc), or it should call them for none. Since in RISCV with A extension allows these to be implemented lock-free, it should not be calling __atomic_* for any of them.

The valid choices are to call an external helper function (which can be implemented in the static part of libgcc, since it has no state), or open-code it.

I'd suggest open-coding, but option 1 would be fine too.

@jim-wilson
Copy link
Collaborator

The plan is to open code all of the atomics, but there is no volunteer to do the work, and unclear when it will happen.

gktrk added a commit to gktrk/llvm-project that referenced this issue Oct 25, 2019
The check for 'HAVE_CXX_ATOMICS_WITHOUT_LIB' may create false
positives in RISC-V. This is reproducible when compiling LLVM natively
using GCC on a rv64gc (rv64imafdgc) host. Due to the 'A' (atomic)
extension, g++ replaces calls to libatomic operations on the
std::atomic<int> type with the native hardware instructions. As a
result, the compilation succeeds and the build system thinks it
doesn't need to pass '-latomic'.

To complicate matters, even when the 'A' extension is forcibly
disabled (by passing '-march=rv64imfd'), g++ compiles the test code
snippet just fine.

To the contrary, '__atomic' builtins or C++11 atomics requires linking
against '-latomic' in RISC-V (See
riscvarchive/riscv-gcc#12).

This causes the following build failure for dsymutil:

: && /usr/bin/riscv64-unknown-linux-gnu-g++  -O2 -pipe -fPIC -fvisibility-inlines-hidden -Werror=date-time -std=c++14 -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-class-memaccess -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wno-comment -fdiagnostics-color -ffunction-sections -fdata-sections  -Wl,-O1 -Wl,--as-needed -Wl,-allow-shlib-undefined    -Wl,-rpath-link,/var/tmp/portage/sys-devel/llvm-10.0.0.9999/work/llvm-10.0.0.9999-abi_riscv_lp64d.lp64d/./lib64/lp64d  -Wl,-O3 -Wl,--gc-sections tools/dsymutil/CMakeFiles/dsymutil.dir/dsymutil.cpp.o tools/dsymutil/CMakeFiles/dsymutil.dir/BinaryHolder.cpp.o tools/dsymutil/CMakeFiles/dsymutil.dir/CFBundle.cpp.o tools/dsymutil/CMakeFiles/dsymutil.dir/CompileUnit.cpp.o tools/dsymutil/CMakeFiles/dsymutil.dir/DebugMap.cpp.o tools/dsymutil/CMakeFiles/dsymutil.dir/DeclContext.cpp.o tools/dsymutil/CMakeFiles/dsymutil.dir/DwarfLinker.cpp.o tools/dsymutil/CMakeFiles/dsymutil.dir/DwarfStreamer.cpp.o tools/dsymutil/CMakeFiles/dsymutil.dir/MachODebugMapParser.cpp.o tools/dsymutil/CMakeFiles/dsymutil.dir/MachOUtils.cpp.o tools/dsymutil/CMakeFiles/dsymutil.dir/NonRelocatableStringpool.cpp.o tools/dsymutil/CMakeFiles/dsymutil.dir/SymbolMap.cpp.o  -o bin/dsymutil  -Wl,-rpath,"\$ORIGIN/../lib64/lp64d" lib64/lp64d/libLLVM-10svn.so -lpthread && :
/usr/lib/gcc/riscv64-unknown-linux-gnu/9.1.0/../../../../riscv64-unknown-linux-gnu/bin/ld: tools/dsymutil/CMakeFiles/dsymutil.dir/dsymutil.cpp.o: in function `.LEHB79':
dsymutil.cpp:(.text._ZNSt17_Function_handlerIFvvESt5_BindIFZ4mainEUlSt10shared_ptrIN4llvm14raw_fd_ostreamEENS3_8dsymutil11LinkOptionsEE_S5_S7_EEE9_M_invokeERKSt9_Any_data+0x168): undefined reference to `__atomic_fetch_and_1'
/usr/lib/gcc/riscv64-unknown-linux-gnu/9.1.0/../../../../riscv64-unknown-linux-gnu/bin/ld: tools/dsymutil/CMakeFiles/dsymutil.dir/dsymutil.cpp.o: in function `.L0 ':
dsymutil.cpp:(.text._ZNSt17_Function_handlerIFvvESt5_BindIFZ4mainEUlSt10shared_ptrIN4llvm14raw_fd_ostreamEENS3_8dsymutil11LinkOptionsEE_S5_S7_EEE9_M_invokeERKSt9_Any_data+0x2ee): undefined reference to `__atomic_fetch_and_1'
/usr/lib/gcc/riscv64-unknown-linux-gnu/9.1.0/../../../../riscv64-unknown-linux-gnu/bin/ld: dsymutil.cpp:(.text.startup.main+0xefc): undefined reference to `__atomic_fetch_and_1'
/usr/lib/gcc/riscv64-unknown-linux-gnu/9.1.0/../../../../riscv64-unknown-linux-gnu/bin/ld: tools/dsymutil/CMakeFiles/dsymutil.dir/dsymutil.cpp.o: in function `.L2508':
dsymutil.cpp:(.text.startup.main+0x1b2a): undefined reference to `__atomic_fetch_and_1'
collect2: error: ld returned 1 exit status

This is likely because dsymutil is defining a variable of type
'std::atomic_char' (See:
https://github.com/llvm/llvm-project/blob/release/9.x/llvm/tools/dsymutil/dsymutil.cpp#L542)

Improve the reliability of the 'HAVE_CXX_ATOMICS_WITHOUT_LIB' test in
two steps:

1. Force a pre-increment on x (++x), which should force a call to a
libatomic function per the prototype:

  __int_type
  operator++(int) volatile noexcept
  { return fetch_add(1); }

2. Because step 1 would resolve the increment to 'amoadd.w.aq' under
the 'A' extension, force the same operation on sub-word types, for
which there is no hardware support. This effectively forces g++ to
insert calls to libatomic functions.

Differential Revision: https://reviews.llvm.org/D68964

Signed-off-by: Gokturk Yuksek <[email protected]>
luismarques pushed a commit to llvm/llvm-project that referenced this issue Feb 17, 2020
In some systems, such as RISC-V, atomic support requires explicit linking
against '-latomic' (see riscvarchive/riscv-gcc#12).

Reviewers: davezarzycki, hhb, beanz, jfb, JDevlieghere
Reviewed By: beanz, JDevlieghere
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69003
arichardson pushed a commit to arichardson/llvm-project that referenced this issue Apr 1, 2020
In some systems, such as RISC-V, atomic support requires explicit linking
against '-latomic' (see riscvarchive/riscv-gcc#12).

Reviewers: davezarzycki, hhb, beanz, jfb, JDevlieghere
Reviewed By: beanz, JDevlieghere
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69003
@alexfanqi
Copy link

alexfanqi commented Jan 7, 2022

Since glibc 2.34, pthread is builtin and recent cmake FindThreads module find -pthread is unneccessary therefore doesn't include -pthread flag in linking. I have to condition on CMAKE_USE_PTHREADS_INIT to explicitly pass -pthread as a workaround. In this case, does 2c4857d or https://github.com/gcc-mirror/gcc/blob/041cfa0ce44d4c207903d41e6eabccdab2dfa90b/gcc/config/riscv/linux.h#L42 still valid?

@jim-wilson
Copy link
Collaborator

Development work happens upstream. FSF GCC developers do not follow this repo. They follow the upstream repo. This site is really only useful for helping newbies. Anything development related needs to be discussed upstream. Long term, something needs to happen, but it won't happen here, it will happen upstream.

Yes, the riscv/linux.h line 42 is still valid. GCC supports more than one version of glibc, and gcc docs still say that -pthread is required. Also note that not all UNIX systems use glibc, so we can't change how things work just because of a glibc change, unless perhaps that change is conditional on use of glibc and the glibc version.

It is possible for distros to work around the problem. I think debian puts a AS_NEEDED (libatomic.so) in libc.so for instance. Or a distro could force libatomic to be included in libpthread. There may also be other ways to work around the problem.

fanghuaqi pushed a commit to riscv-mcu/riscv-gcc that referenced this issue Feb 10, 2022
…ntal-v

fixed rvp's ext_version add rvv's ext_flags
@eli-schwartz
Copy link

The Meson build system now has a feature request that we add a new feature just to enable passing -latomic into packages in accordance with "Option 5: Do nothing and change the build scripts of affected software".

This feels like a pretty high weirdness budget even compared to the general case of "non-compiler software". I wonder if a solution can be found...

2c4857d implements option 4.

This was apparently reverted to only be invoked during pthread as part of a giant squashed upstream merge beyond which nothing is clearly visible? Anyway, it seems this ticket is no longer actually completed.

mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
In some systems, such as RISC-V, atomic support requires explicit linking
against '-latomic' (see riscvarchive/riscv-gcc#12).

Reviewers: davezarzycki, hhb, beanz, jfb, JDevlieghere
Reviewed By: beanz, JDevlieghere
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69003
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants