Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libc++] Vectorize all the algorithms #84663

Open
4 of 26 tasks
philnik777 opened this issue Mar 10, 2024 · 3 comments
Open
4 of 26 tasks

[libc++] Vectorize all the algorithms #84663

philnik777 opened this issue Mar 10, 2024 · 3 comments
Assignees
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. metabug Issue to collect references to a group of similar or related issues. performance

Comments

@philnik777
Copy link
Contributor

philnik777 commented Mar 10, 2024

There are quite a few algorithms which can be vectorized, or one should at least look into doing so. This issue lists all the algorithms which I think can be vectorized (and a few that shouldn't be with explanations).

To be vectorized:

  • find (already done partially by forwarding to {w,}memchr)
  • find_end
  • find_first_of (possibly not worth it, since clang does it automatically in some cases)
  • adjacent_find
  • mismatch - [libc++] Vectorize mismatch #73255
  • search (not checked whether it actually makes sense)
  • search_n
  • remove (requires efficient compressstore)
  • remove_copy (requires efficient compressstore)
  • unique (might require efficient compressstore)
  • unique_copy (might require efficient compresstore)
  • is_sorted_until
  • includes
  • min_element
  • max_element
  • minmax (range overload): Should first be checked whether it can be rewritten to allow auto-vectorization - [libc++] Optimize ranges::minmax #87335
  • minmax_element - Vectorize minmax_element. #112397
  • lexicographical_compare: Should use mismatch for the actual vectorization, but only forward to it if it's known to vectorize
  • lexicographical_compare_three_way: same as lexicographical_compare
  • replace_if autovectorized since [LV] Support generating masks for switch terminators. #99808

Should be checked whether they can be vectorized:

  • set_union
  • set_intersection
  • set_difference
  • set_symmetric_difference
  • merge
  • rotate is using memmove when possible, but may make sense to optimize for small ranges

Shouldn't be vectorized:

  • contains: forwards to find, which should be vectorized instead
  • count: is auto-vectorized
  • equal: Already forwards to memcmp for trivially equality comparable types and doesn't seem worth it for floats
  • fold (and friends): is auto-vectorized
  • copy (and friends): can be forwarded to memmove
  • swap_ranges: is auto-vectorized
  • replace (and friends): is auto-vectorized
  • fill (and friends): is auto-vectorized
  • reverse_copy: is auto-vectorized
  • rotate_copy: is forwarded to two memmove calls if possible
  • shift_left/shift_right: is forwarded to memmove if possible
  • is_sorted: forwards to is_sorted_until, which should be vectorized instead
  • min, max (range overload): is auto-vectorized
  • reverse: auto-vectorized since [SCEV] Support addrec in right hand side in howManyLessThans #92560
@philnik777 philnik777 added libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. performance labels Mar 10, 2024
@philnik777 philnik777 self-assigned this Mar 10, 2024
@EugeneZelenko EugeneZelenko added the metabug Issue to collect references to a group of similar or related issues. label Mar 10, 2024
@mrdaybird
Copy link
Contributor

@philnik777 std::reverse auto-vectorized by #92560

@philnik777
Copy link
Contributor Author

@mrdaybird thanks, that's great! I've updated the list.

@sjoerdmeijer
Copy link
Collaborator

FYI: vectorization of find_first_of for AArch64 is implemented in #101976

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. metabug Issue to collect references to a group of similar or related issues. performance
Projects
None yet
Development

No branches or pull requests

4 participants