While working on #3641, I noticed that the compiler currently optimizes `fill` loops on its own most of the time for 1- and 2-byte elements.
The exception I found is a case involving a zero value and a global variable; see DevCom-10334032.
Otherwise the optimization is there: the compiler emits a vectorized implementation, `rep stos`, or a `memset` call.
However, the optimization is not engaged for larger types, specifically 4- and 8-byte elements, except when the value is a compile-time zero.
For a compile-time zero, the compiler substitutes `memset` on its own.
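To make the cases above concrete, here is a minimal sketch of the kind of plain loops I mean. The names (`fill_loop`, `fill_u8`, and so on) are made up for illustration and are not taken from the STL or from #3641; the actual codegen depends on the compiler version and flags, so it is worth checking the output rather than trusting the comments.

```cpp
#include <cstddef>
#include <cstdint>

// Plain fill loop with no manual memset substitution. Whether it becomes a
// vectorized loop, rep stos, or a memset call is left to the optimizer.
template <class T>
void fill_loop(T* first, T* last, const T value) {
    for (; first != last; ++first) {
        *first = value;
    }
}

// 1- and 2-byte elements: the cases reported above as usually optimized.
void fill_u8(std::uint8_t* p, std::size_t n, std::uint8_t v) {
    fill_loop(p, p + n, v);
}
void fill_u16(std::uint16_t* p, std::size_t n, std::uint16_t v) {
    fill_loop(p, p + n, v);
}

// 4- and 8-byte elements with a runtime value: the cases reported above as
// not optimized.
void fill_u32(std::uint32_t* p, std::size_t n, std::uint32_t v) {
    fill_loop(p, p + n, v);
}
void fill_u64(std::uint64_t* p, std::size_t n, std::uint64_t v) {
    fill_loop(p, p + n, v);
}

// 4-byte elements with a compile-time zero: the case where the compiler is
// reported to substitute memset on its own.
void zero_u32(std::uint32_t* p, std::size_t n) {
    fill_loop(p, p + n, std::uint32_t{0});
}
```

Compiling each function separately with optimizations enabled and comparing the generated code is one way to see which of the cases above are handled.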
As mentioned in #3641 (comment), it could be possible to remove the 1-byte memset optimization, but we'd need to be very careful to avoid regressing behavior, and we're not super eager to do this any time soon (removing the optimization in the STL's sources would simplify our code, but the cost in code churn to get there mostly defeats the purpose).
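For readers following along, a manual 1-byte `memset` substitution has roughly the shape below. This is only a sketch under my own assumptions: `fill_manual` is a hypothetical name, and the STL's real implementation uses different internals and conditions.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical sketch of a library-side fast path, not the STL's actual code.
template <class T>
void fill_manual(T* first, T* last, const T value) {
    if constexpr (std::is_integral_v<T> && sizeof(T) == 1) {
        // 1-byte integral elements: every byte of the range gets the same
        // value, so the loop can be replaced by a single memset call.
        if (first != last) {
            std::memset(first, static_cast<unsigned char>(value),
                        static_cast<std::size_t>(last - first));
        }
    } else {
        // Everything else falls back to the plain loop and relies on the
        // optimizer.
        for (; first != last; ++first) {
            *first = value;
        }
    }
}
```

Removing the optimization would mean deleting the first branch and relying on the compiler to recognize the plain loop.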
A proof that it works with at least raw pointers: https://godbolt.org/z/sq83rM49M
Regarding unwrapping: if the optimizer fails to handle unwrapping on its own, we may consider keeping only the unwrapping, but not the manual substitution.
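A rough sketch of what "keeping only unwrapping" could look like, assuming a hypothetical wrapper type `checked_iter` with a hypothetical `unwrap()` accessor (neither is the STL's real machinery): the library strips the wrapper down to raw pointers and then runs a plain loop, leaving any `memset` substitution to the compiler.

```cpp
// Hypothetical checked iterator, standing in for a debug-wrapped library
// iterator. Not an actual STL type.
template <class T>
struct checked_iter {
    T* ptr;

    T& operator*() const { return *ptr; }
    checked_iter& operator++() {
        ++ptr;
        return *this;
    }
    friend bool operator!=(checked_iter a, checked_iter b) {
        return a.ptr != b.ptr;
    }

    // Hypothetical accessor that exposes the underlying raw pointer.
    T* unwrap() const { return ptr; }
};

// Unwrap once, then use a plain loop over raw pointers with no manual
// memset call.
template <class T>
void fill_unwrapped(checked_iter<T> first, checked_iter<T> last, const T value) {
    T* p = first.unwrap();
    T* const end = last.unwrap();
    for (; p != end; ++p) {
        *p = value;
    }
}
```

With this shape the optimizer sees the same raw-pointer loop as in the godbolt link above.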
I'm not sure how to proceed:
- `wmemset` for `fill` optimization #3641 without merging?
- `std::fill` optimization?
- `memset` optimization?