Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix copy_str performance #2475

Closed
wants to merge 1 commit into from
Closed

Conversation

Roman-Koshelev
Copy link
Contributor

Contrary to expectations, optimized overloading is not called. std::copy is perfectly optimized in std::memmove (like std::uninitialized_copy_n, and other analogs that are used in the project) and in many more cases than this implementation suggests

@vitaut
Copy link
Contributor

vitaut commented Aug 31, 2021

core.h doesn't depend on <algorithm> so std::copy can't be used there. When optimized overload is not called?

@vitaut
Copy link
Contributor

vitaut commented Aug 31, 2021

Also copy_str does character conversion if necessary.

@vitaut vitaut closed this Aug 31, 2021
@foonathan
Copy link
Contributor

Also std::copy is only constexpr in C++20.

@Roman-Koshelev
Copy link
Contributor Author

@vitaut Why close PR without solving the problem and waiting for an answer? I wrote a great working implementation in which optimizations not only work but are also applicable in large cases. (See code)

@Roman-Koshelev
Copy link
Contributor Author

@foonathan also like copy_str (which depends on std::is_constant_evaluated ())

@vitaut
Copy link
Contributor

vitaut commented Aug 31, 2021

Please explain in more detail what problem you are trying to solve.

@Roman-Koshelev
Copy link
Contributor Author

Roman-Koshelev commented Aug 31, 2021

@vitaut Example:

#include <cstdio>

template <typename InputIt, typename OutputIt>
void fun(InputIt begin, InputIt end, OutputIt out) {
  printf("N1\n");
}

void fun(const char* begin, const char* end, char* out) {
  printf("N2\n");
}

int main() {
  char buf[10];
  char buf2[10];
  fun(buf, buf + 10, buf2);
}

What overload will be caused by the result? Obviously the first, but we want the second

@Roman-Koshelev
Copy link
Contributor Author

The best way I can think of to fix this behavior is

#include <cstdio>
#include <type_traits>

template <typename InputIt, typename OutputIt>
void fun(InputIt begin, InputIt end, OutputIt out) {
  printf("N1\n");
}

template <typename InputIt, typename OutputIt>
std::enable_if_t
<
  std::is_same_v<std::remove_const_t<InputIt>, OutputIt> &&
  std::is_trivially_copy_assignable_v<OutputIt>,
  void
>
fun(InputIt* begin, InputIt* end, OutputIt* out) {
  printf("N2\n");
}

int main() {
  char buf[10];
  char buf2[10];
  fun(buf, buf + 10, buf2);
}

This works great for pointer iterators (we need separate body movements for wrappers, but that's enough for us)

@alexezeder
Copy link
Contributor

Obviously the first, but we want the second

Why it's obvious? https://godbolt.org/z/PPx3rve9h

@Roman-Koshelev
Copy link
Contributor Author

Roman-Koshelev commented Aug 31, 2021

Just call you in different ways. copy_str<char>() and copy_str().

@alexezeder
Copy link
Contributor

Ok, so here is the updated usage with copy_str<char>() in both cases: https://godbolt.org/z/GcbnEMnYe. I just took

  char buf[10];
  char buf2[10];
  fun(buf, buf + 10, buf2);

from examples without properly adapting it.
Both of these usages invoke copy_str implementation with while-loop inside, and since all usages of {fmt} look like copy_str<T>(), then I guess there are some problems with copy_str.

@alexezeder
Copy link
Contributor

But can we reopen this PR, since the second one is wrong because of incorrect memcpy usage anyway?

@vitaut
Copy link
Contributor

vitaut commented Aug 31, 2021

I can reopen this PR but this would require closing the other one to make github happy.

@vitaut
Copy link
Contributor

vitaut commented Aug 31, 2021

Since this PR cannot be reopen because of a force-push, please move the discussion to #2477.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants