<xutility>: Drop memset optimization? #3642

Open
AlexGuteniev opened this issue Apr 8, 2023 · 2 comments
Labels
performance Must go faster

Comments

@AlexGuteniev
Contributor

AlexGuteniev commented Apr 8, 2023

While working on #3641, I noticed that the compiler already performs this optimization on its own most of the time for 1- and 2-byte elements.

The one exception I found is the case of a zero value and a global variable; see DevCom-10334032.
Otherwise the optimization is there: a vectorized implementation, rep stos, or a memset call.

However, the optimization is not applied to larger types, specifically 4- and 8-byte elements, except when the value is a compile-time zero.
For a compile-time zero, the compiler substitutes memset on its own.
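
To make the comparison concrete, here is a minimal sketch of the two forms being discussed: a plain fill loop, which the optimizer is expected to recognize, and a manual memset substitution. The helper names (fill_loop, fill_bytes_memset, zero_u32_memset, fills_agree) are hypothetical and are not the STL's actual internals; the sketch only shows that the two forms produce the same result, not what code any particular compiler generates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Plain loop: roughly what a fill reduces to without a library-level special case.
template <class T>
void fill_loop(T* first, T* last, const T& value) {
    for (; first != last; ++first) {
        *first = value;
    }
}

// Manual substitution for 1-byte elements (hypothetical helper, not the STL's code).
inline void fill_bytes_memset(unsigned char* first, unsigned char* last, unsigned char value) {
    assert(first <= last);
    if (first == last) {
        return; // avoid calling memset on an empty (possibly null) range
    }
    std::memset(first, value, static_cast<std::size_t>(last - first));
}

// 4-byte zero fill: all-zero bytes represent 0 for integer types, so memset is valid here.
// A nonzero 4-byte value cannot be expressed this way, since memset repeats a single byte.
inline void zero_u32_memset(std::uint32_t* first, std::uint32_t* last) {
    assert(first <= last);
    if (first == last) {
        return;
    }
    std::memset(first, 0, static_cast<std::size_t>(last - first) * sizeof(std::uint32_t));
}

// Checks that the loop and the memset substitution leave identical buffers.
inline bool fills_agree(std::size_t n, unsigned char value) {
    std::vector<unsigned char> a(n, 0xAA);
    std::vector<unsigned char> b(n, 0xAA);
    fill_loop(a.data(), a.data() + n, value);
    fill_bytes_memset(b.data(), b.data() + n, value);
    return a == b;
}
```

Whether fill_loop actually compiles down to a vectorized loop, rep stos, or a memset call depends on the compiler, the flags, and the element size, so the generated assembly is worth checking for each case.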

I'm not sure how to proceed:

@AlexGuteniev AlexGuteniev changed the title <xutility>: Drop memcpy optimization <xutility>: Drop memcpy optimization? Apr 8, 2023
@AlexGuteniev AlexGuteniev changed the title <xutility>: Drop memcpy optimization? <xutility>: Drop memset optimization? Apr 8, 2023
@StephanTLavavej StephanTLavavej added performance Must go faster decision needed We need to choose something before working on this labels Apr 8, 2023
@StephanTLavavej
Member

As mentioned in #3641 (comment), it could be possible to remove the 1-byte memset optimization, but we'd need to be very careful to avoid regressing behavior, and we're not super eager to do this any time soon (removing the optimization in the STL's sources would simplify our code, but the cost in code churn to get there mostly defeats the purpose).

@StephanTLavavej StephanTLavavej removed the decision needed We need to choose something before working on this label Apr 12, 2023
@AlexGuteniev
Contributor Author

Proof that it works with at least raw pointers: https://godbolt.org/z/sq83rM49M
Regarding unwrapping: if the optimizer fails when unwrapping is involved, we could consider keeping only the unwrapping and dropping the manual substitution.
