std::swap of arrays, why is there no specialization for trivial types #2683
Comments
I think this is a compiler issue, as libstdc++'s implementation of Perhaps we can use the same mechanism as that of |
@monamimani , note that we can't simply use @frederick-vs-ja , I believe that the vectorized Lines 461 to 463 in 314f65f
Lines 6129 to 6143 in 314f65f
|
@StephanTLavavej, of course it would need temporary storage, like swapping ints. Sorry that I wasn't clear enough. :) Because I feel this is more of an issue with the vectorizer, I also opened an issue on the Developer Community |
I can implement this - it just needs a bit of surgery in |
Thank you! But I found that |
Great thanks! |
I need to revert my PR as this broke users of |
As I figured, at its core this might be a vectorizer issue, so I also opened an issue over at the Developer Community, here They have acknowledged the issue and are working on it; it may take time. People who have access might be able to find out more. I at least wanted to tell you that they are aware of this. I don't exactly know what the mechanism of Why not do something stupid like this?
I am pretty sure that something I don't know about will make this undesirable, but this should get vectorized. |
I might add that, from their own page: The 13xx reason codes apply to the vectorizer.
It says it should emit memcpy, but it doesn't. :) |
Thanks - the autovectorizer may not be able to deal with the swap algorithm, but we do plan to properly fix this in the STL in the future. (Our vectorized algorithm is faster than |
There's a possibility to implement that in headers using See how copying by 32-bit portions is vectorized using SSE2 and AVX2: https://godbolt.org/z/YcEPz848W Note how, despite an intermediate buffer being used in the C++ code, it does not appear in the assembly. This can be done in two or more loops with descending sizes to handle the variety of sizes. The exact portion size and the way of handling the tail ( |
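As a rough illustration of the idea in the comment above (my own sketch, not the code behind the godbolt link): swap fixed-size portions through a small local buffer using `memcpy`, in several loops with descending portion sizes so the tail is handled too. The names `swap_chunks` and `swap_bytes` and the portion sizes of 32, 8 and 1 bytes are assumptions picked for the example, and the two ranges must not overlap. Whether the buffer actually disappears from the generated code depends on the compiler and flags.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: swaps as many Chunk-byte portions as fit into the
// remaining n bytes, then advances both pointers past what was swapped.
template <std::size_t Chunk>
inline void swap_chunks(unsigned char*& a, unsigned char*& b, std::size_t& n) noexcept {
    while (n >= Chunk) {
        unsigned char tmp[Chunk];      // small fixed-size buffer
        std::memcpy(tmp, a, Chunk);
        std::memcpy(a, b, Chunk);
        std::memcpy(b, tmp, Chunk);
        a += Chunk;
        b += Chunk;
        n -= Chunk;
    }
}

// Hypothetical helper: swaps n bytes between two non-overlapping ranges,
// using loops with descending portion sizes (sizes chosen arbitrarily).
inline void swap_bytes(void* pa, void* pb, std::size_t n) noexcept {
    auto* a = static_cast<unsigned char*>(pa);
    auto* b = static_cast<unsigned char*>(pb);
    swap_chunks<32>(a, b, n);  // bulk of the data
    swap_chunks<8>(a, b, n);   // medium-sized remainder
    swap_chunks<1>(a, b, n);   // byte-wise tail
}
```

Because every `memcpy` has a compile-time-constant size, each one is a candidate for being lowered to plain register loads and stores.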
I was looking at what swap does for arrays, and in the MSVC STL it loops through the array and swaps each element (libstdc++ and libc++ do the same thing).
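For reference, the element-by-element behavior I mean looks roughly like this. It is only an illustration, not the actual MSVC STL, libstdc++ or libc++ source, and `swap_array_elementwise` is a made-up name.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical stand-in for the array overload of std::swap:
// walks both arrays and swaps one element at a time.
template <class T, std::size_t N>
void swap_array_elementwise(T (&a)[N], T (&b)[N]) {
    for (std::size_t i = 0; i != N; ++i) {
        using std::swap;   // allow ADL to find a user-provided swap for T
        swap(a[i], b[i]);
    }
}
```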
The thing is, it was an array of std::byte, and I was surprised that none of the standard library implementations have specializations for trivially copyable/movable types that would call memcpy. Now I figure this is maybe how the spec is. So my first question is: is there something preventing those specializations from existing?
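To make the first question concrete, here is a minimal sketch of what such a specialization might look like, assuming the element type is trivially copyable and has no custom `swap` of its own. `swap_trivial_array` is a hypothetical name, not something any standard library provides. It goes through a temporary buffer as large as the whole array, so as written it is only reasonable for small arrays.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper: swaps two arrays of a trivially copyable type
// with three memcpy calls instead of an element-by-element loop.
template <class T, std::size_t N>
void swap_trivial_array(T (&a)[N], T (&b)[N]) noexcept {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy-based swap requires a trivially copyable element type");
    if (&a == &b) {
        return;  // self-swap: avoid memcpy between identical ranges
    }
    unsigned char tmp[sizeof(a)];    // temporary storage for one whole array
    std::memcpy(tmp, a, sizeof(a));
    std::memcpy(a, b, sizeof(a));
    std::memcpy(b, tmp, sizeof(a));
}
```

As far as I understand, one thing that could get in the way is that swapping arrays is described in terms of swapping each element, so an element type with its own `swap` found via ADL could observe the difference.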
My second question, which is maybe off-topic here: even though clang and gcc also do a loop, I believe their auto-vectorizers see the pattern and emit SSE instructions, but MSVC doesn't; it still loops.
Here is a compiler explorer link
https://godbolt.org/z/W8GxPv1ov
Now this is maybe something more for the compiler team, I don't know.
In any case I saw the differences and I thought I would point it out.
Thank you