Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

faster iox::vector on POD data #2082

Closed
lucabart97 opened this issue Nov 8, 2023 · 18 comments · Fixed by #2104 or #2195
Closed

faster iox::vector on POD data #2082

lucabart97 opened this issue Nov 8, 2023 · 18 comments · Fixed by #2104 or #2195
Labels
enhancement New feature

Comments

@lucabart97
Copy link
Contributor

Brief feature description

The current implementation of iox::vector calls constructor and destructor also on POD data, this may slow the application if some big different resizes are done.

Detailed information

Small example, a pointcloud with 1M points

struct Point
{
    float x;
    float y;
    float z;
};
constexpr uint64_t MAX_SIZE = 1234567;

int main()
{
    iox::vector<Point, MAX_SIZE> cloud;

    auto start = high_resolution_clock::now();
    cloud.resize(MAX_SIZE);
    std::cout << "resize time 0: " << duration_cast<milliseconds>(high_resolution_clock::now() - start).count() << " (ms)" << std::endl;

    cloud.clear();

    start = high_resolution_clock::now();
    cloud.resize(MAX_SIZE);
    std::cout << "resize time 1: " << duration_cast<milliseconds>(high_resolution_clock::now() - start).count() << " (ms)" << std::endl;

    return 0;
}

result:

resize time 0: 8 (ms)
resize time 1: 5 (ms)

The results are calculated on my PC with Intel processor, when you use ARM (Eg. jetson) trust me, the time is very high.

Improve the code (my suggestion)

Check the template data with std::is_pod<> to avoid constructor and deconstructor calls on POD data.

@elBoberido
Copy link
Member

@lucabart97 go for it :)

Just to be sure, is_pod would be false for this type, right?

struct Point
{
    float x{};
    float y{};
    float z{};
};

There is another optimization opportunity for copy and move for trivially copyable and trivially movable types. Here a simple memcpy would also speed up the operations.

@elBoberido elBoberido added the enhancement New feature label Nov 8, 2023
@elBoberido
Copy link
Member

I just discovered that std::is_pod is deprecated with C++20. I think std::is_trivial can be used as 1 to 1 replacement. Can you confirm?

@elfenpiff
Copy link
Contributor

@elBoberido @lucabart97

So the idea would be to specialize the vector for types that satisfy std::is_trivial and that we use memcpy for copy operations and do not call the destructor when we usually would.

I like it.

@elBoberido
Copy link
Member

@elfenpiff exactly. The same would apply to inserting elements. Just a memcpy and no move or copy opreations

@lucabart97
Copy link
Contributor Author

lucabart97 commented Nov 9, 2023

@elBoberido

Just to be sure, is_pod would be false for this type, right?

struct Point
{
   float x{};
   float y{};
   float z{};
};

This type is a POD, so it will be true.
A leave there a small test with std::is_trivial given that std::is_pod is departed in C++20

#include <type_traits>
#include <iostream>

struct Pt0{
    float x;
    float y;
};

struct Pt1{
    float x;
    float y;
    Pt1() = default;
};

struct Pt2{
    float x = 0;
    float y;
};

struct Pt3{
    float x;
    float y;
    Pt3(){ x = 0;}
};

int main(){
    std::cout<<(std::is_trivial<Pt0>::value)<<"\n"; // 1
    std::cout<<(std::is_trivial<Pt1>::value)<<"\n"; // 1
    std::cout<<(std::is_trivial<Pt2>::value)<<"\n"; // 0
    std::cout<<(std::is_trivial<Pt3>::value)<<"\n"; // 0
    return 0;
}

There is another optimization opportunity for copy and move for trivially copyable and trivially movable types. Here a simple memcpy would also speed up the operations.

Yes

@lucabart97
Copy link
Contributor Author

lucabart97 commented Nov 9, 2023

@elfenpiff

So the idea would be to specialize the vector for types that satisfy std::is_trivial and that we use memcpy for copy operations and do not call the destructor when we usually would.

I like it.

For code legibility and length, I prefer if constexpr(std::is_trivial<T>::value) instead of different specializations.

@elfenpiff
Copy link
Contributor

elfenpiff commented Nov 9, 2023

@elfenpiff

So the idea would be to specialize the vector for types that satisfy std::is_trivial and that we use memcpy for copy operations and do not call the destructor when we usually would.

I like it.

For code legibility and length, I prefer if constexpr(std::is_trivial<T>::value) instead of different specializations.

@lucabart97 You are right, this is better. But please don't use ifdefs to see if its c++17 or not, just use if constexpr.

@elBoberido I think we should start to use C++17 now officially. I think we already have an agreement across all committers that we want to have this. Any objections?

@mossmaurice
Copy link
Contributor

@elfenpiff

I think we should start to use C++17 now officially. I think we already have an agreement across all committers that we want to have this. Any objections?

AFAIK the agreement was to switch to C++17 after the v3.0 C-release, but not before. So for now no if constexpr :(

@elBoberido
Copy link
Member

@mossmaurice @elfenpiff in the September developer meetup I argued for a switch after the v3.0 release and the fellow developers agreed. But I come more and more to the conclusion that the switch to C++17 might actually be one of the main reasons to do a v3.0 release. As you know, there a no new major features in iceoryx itself but "only" some bigger refactorings. Up until now everybody was happy to hear about a switch to C++17 and nobody objected, which was one of the reasons I proposed to do it after the v3.0 release. Therefore I would like to reconsider the decision made in September and move to C++17 ASAP. If there are objections we could discuss this in the next developer meetup.

@elBoberido
Copy link
Member

@lucabart97 thanks for checking all the variants with is_trivial. AFAIK the example with float x {}; should be equivalent to float x = 0;. In this case there is no reason to not continue with the proposal since the structs with the uninitialized data members would still be uninitialized after calling the ctor.

@lucabart97
Copy link
Contributor Author

@elBoberido
yes, and we did not consider the vector with basic type, e.g. float, int, that will be faster during resizes!

I'm waiting for the decision for the C++17, we could use an ifdef or wait to implement this later.

@elBoberido
Copy link
Member

@lucabart97 sounds good. We will have a decision at the end of the week. I'll ping you once I know more

@elBoberido
Copy link
Member

@lucabart97 we decided to move to C++17 with the v3.0 release and I will open an PR later today. Once that is merged you can use the whole beauty of the C++17 features

@lucabart97
Copy link
Contributor Author

I do not understand this part of the code, why there is an operator new and not just an operator=?

template <typename T, uint64_t Capacity>
template <typename... Targs>
inline bool vector<T, Capacity>::emplace_back(Targs&&... args) noexcept
{
if (m_size < Capacity)
{
// AXIVION Next Line AutosarC++19_03-A5.0.1, FaultDetection-IndirectAssignmentOverflow: Size guaranteed by T. Evaluation order is inconsequential.
new (&at(m_size++)) T(std::forward<Targs>(args)...);
return true;
}
return false;
}

@elBoberido
Copy link
Member

@lucabart97 because the memory at that position is not initialized and using copy assignment would be undefined behavior for non trivial types. Therefore it needs to be constructed, either by placement new with forwarding of the ctor arguments or by placement new with the copy constructor.

lucabart97 added a commit to lucabart97/iceoryx that referenced this issue Nov 18, 2023
Optimize the vector class by specializing it for plain old data (POD)
types, thereby eliminating the need for explicit constructors,
destructors, and copy operators in the vector operations.

Signed-off-by: Luca Bartoli <[email protected]>
lucabart97 added a commit to lucabart97/iceoryx that referenced this issue Nov 28, 2023
Optimize the vector class by specializing it for plain old data (POD)
types, thereby eliminating the need for explicit constructors,
destructors, and copy operators in the vector operations.

Signed-off-by: Luca Bartoli <[email protected]>
lucabart97 added a commit to lucabart97/iceoryx that referenced this issue Nov 28, 2023
Optimize the vector class by specializing it for plain old data (POD)
types, thereby eliminating the need for explicit constructors,
destructors, and copy operators in the vector operations.

Signed-off-by: Luca Bartoli <[email protected]>
lucabart97 added a commit to lucabart97/iceoryx that referenced this issue Dec 2, 2023
Optimize the vector class by specializing it for plain old data (POD)
types, thereby eliminating the need for explicit constructors,
destructors, and copy operators in the vector operations.

Signed-off-by: Luca Bartoli <[email protected]>
lucabart97 added a commit to lucabart97/iceoryx that referenced this issue Dec 5, 2023
Optimize the vector class by specializing it for plain old data (POD)
types, thereby eliminating the need for explicit constructors,
destructors, and copy operators in the vector operations.

Signed-off-by: Luca Bartoli <[email protected]>
elBoberido added a commit that referenced this issue Dec 8, 2023
@elBoberido
Copy link
Member

@lucabart97 I think I screwed up a bit and we need to revert something from the last PR. The vector(uint64_t count) ctor needs to call the ctor of the data type every time, even for is_trivial and is_class.

I totally overlooked that new (&at_unchecked(i)) T(); does value initialization also for trivial classes and zeros fundamental types. Something like

struct Point
{
    float x;
    float y;
    float z;
};

would end up with x, y and z being zero.

I think this optimization should only be done if the user explicitly opts in, e.g. via a vector(Uninitialized_t, uint64_t count) ctor and vector(uint64_t count) would always initialize the data as long as it is possible. There is one scenario where the member would remain uninitialized but that is nothing we can detect. The scenario would be when the default ctor of the type is explicit, e.g. Point() {}.

Do you have time to revert that piece of code? If not, I can take care of it. Sorry for the inconvenience.

@elBoberido elBoberido reopened this Dec 9, 2023
@lucabart97
Copy link
Contributor Author

Depends on how quickly it needs to be done!
I could take care of it, but I do not know when I'll have time.

@elBoberido
Copy link
Member

No problem. I can sneak it into my next PR :)

elBoberido added a commit that referenced this issue Feb 18, 2024
…ation-for-iox-vector-elements

iox-#2082 Always initialize members in 'iox::vector(count)'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature
Projects
None yet
4 participants