Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

<atomic>: ARM64 should use LDAR and STLR for weaker than full SC #83

Closed
BillyONeal opened this issue Sep 9, 2019 · 12 comments · Fixed by #3399
Closed

<atomic>: ARM64 should use LDAR and STLR for weaker than full SC #83

BillyONeal opened this issue Sep 9, 2019 · 12 comments · Fixed by #3399
Labels
ARM64 Related to the ARM64 architecture fixed Something works now, yay! performance Must go faster

Comments

@BillyONeal
Copy link
Member

According to ARM experts, we can't change our existing load + barrier without an ABI break for full SC support, but we can do so for the plain acquire_release orderings.

@BillyONeal BillyONeal self-assigned this Sep 9, 2019
@StephanTLavavej StephanTLavavej added the enhancement Something can be improved label Sep 9, 2019
@StephanTLavavej StephanTLavavej added the performance Must go faster label Sep 19, 2019
@BillyONeal BillyONeal removed their assignment Oct 28, 2019
@StephanTLavavej StephanTLavavej removed the enhancement Something can be improved label Feb 6, 2020
@AlexGuteniev
Copy link
Contributor

@BillyONeal
Copy link
Member Author

There are no ARM devices we target with ldar or stlr.

@AlexGuteniev
Copy link
Contributor

Can't ARM64 device run 32-bit ARM code directly? Or it doesn't make sense?

@BillyONeal
Copy link
Member Author

Yes, but I don't think ARM is doing the Intel-like thing where they bring all the features to 32 bit. (Unlike ia32 -> amd64, there are significant perf reasons to move to aarch64 in basically all cases, such as the doubled register count.)

@jpark37
Copy link

jpark37 commented Apr 15, 2021

Are there plans to incorporate LDAR/STLR for full SC at some point? Otherwise it seems like developers targeting ARM64 with MSVC should avoid std::atomic and roll their own.

@BillyONeal
Copy link
Member Author

Are there plans to incorporate LDAR/STLR for full SC at some point?

We can't under the current ABI, that's what I meant by:

we can't change our existing load + barrier without an ABI break for full SC support

LDAR/STLR only provide full SC when used together, they don't provide it when used with any other form of barriers.

@jpark37
Copy link

jpark37 commented Apr 15, 2021

I get that you don't want (or can't have) an ABI break, but doesn't having a slow seq_cst implementation devalue the class significantly? We were sold a future where seq_cst would be fast enough that we could ignore relaxed atomics and their pitfalls.

@BillyONeal
Copy link
Member Author

I get that you don't want (or can't have)

All the developers on this project want an ABI break very much :D.

doesn't having a slow seq_cst implementation devalue the class significantly?

Yep! But nothing can be done about it now since intrinsics to ask the compiler for LDAR and STLR didn't exist when arm64 first shipped (locking our ABI)

@obfuscated
Copy link

Can you clarify what "ABI break" means in this case?
Would we have problems if we mix synchronisation primitives which use LDAR and STLR (our own implementation of atomic) and std::atomic from MS STL? By mix I mean "we have them in the application".

@BillyONeal
Copy link
Member Author

BillyONeal commented May 11, 2021

Can you clarify what "ABI break" means in this case?

If you have an .obj with a function like:

void do_things(std::atomic<int>& a) {
a.fetch_add(42);
}

compiled with VS 2017, the function do_things is going to have barriers compiled into it, not LDAR or STLR. If you have another function

void do_different_things(std::atomic<int>& a) {
a.fetch_add(1729);
}

compiled with VS2022 and we changed std::atomic to use LDAR and/or STLR, the guarantees atomic provides won't hold anymore, because LDAR and STLR do not synchronize with the barriers used in the compiled-with-2015 function. (According to the Qualcomm folks we asked)

Would we have problems if we mix synchronisation primitives which use LDAR and STLR (our own implementation of atomic) and std::atomic from MS STL? By mix I mean "we have them in the application".

Just being in the same application isn't an issue; you could have my_foobar_atomic that is effectively layout-incompatible with std::atomic which used LDAR and/or STLR if you want. (But if you tried to reinterpret_cast between them or similar then doom)

@obfuscated
Copy link

So the problem happens only if we mix barrier and LDAR/STLR code on the same variables/memory locations? If we have variable a using barriers and variable b using LDAR/STLR all would be fine?

@BillyONeal
Copy link
Member Author

BillyONeal commented May 11, 2021

Yes, but std::atomic has no means of distinguishing between A and B. atomic<int> in the .obj from 2017 must remain the same when used from an .obj in 2021. (That’s kind of the whole meaning of ABI stability)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ARM64 Related to the ARM64 architecture fixed Something works now, yay! performance Must go faster
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants