Adds ankerl::unordered_dense{segmented_map, segmented_set} #58

Merged Apr 8, 2023 (1 commit)

Conversation

martinus (Owner) commented Jan 7, 2023

This new underlying container has a much smoother memory allocation curve
than the default underlying std::vector.

  • Much smoother memory usage, memory usage increases continuously.
  • No high peak memory usage.
  • Faster insertion because elements never need to be moved to newly allocated blocks.
  • Slightly slower indexing compared to std::vector because an additional
    indirection is needed.
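
To illustrate the trade-off in the list above, here is a minimal sketch of a segmented vector. It is not the code from this PR: the name `tiny_segmented_vector`, the block size, and the use of `std::unique_ptr<T[]>` are assumptions made for the example, and it requires `T` to be default-constructible and copy-assignable.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical, simplified segmented vector: elements live in fixed-size
// blocks, and a small table of block pointers provides the indexing.
template <typename T, std::size_t NumBits = 4>
class tiny_segmented_vector {
    static constexpr std::size_t block_size = std::size_t{1} << NumBits;
    static constexpr std::size_t mask = block_size - 1;

    // Only this pointer table is ever reallocated, never the elements.
    std::vector<std::unique_ptr<T[]>> m_blocks;
    std::size_t m_size = 0;

public:
    T& push_back(T const& value) {
        if (m_size == m_blocks.size() * block_size) {
            // Grow by exactly one block; existing elements stay where they are.
            m_blocks.push_back(std::make_unique<T[]>(block_size));
        }
        T& slot = (*this)[m_size];
        slot = value;
        ++m_size;
        return slot;
    }

    // The additional indirection: pick the block first, then the slot inside it.
    T& operator[](std::size_t i) {
        return m_blocks[i >> NumBits][i & mask];
    }

    std::size_t size() const { return m_size; }
};
```

Memory grows one block at a time, and a reference returned by `push_back` stays valid across later insertions because no element is ever copied into a new allocation.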

Abseil is fastest for this simple inserting test, taking a bit over 0.8 seconds.
Its peak memory usage is about 430 MB. Note how the memory usage goes down after
the last peak; when it goes down to ~290MB it has finished rehashing and could free
the previously used memory block.

ankerl::unordered_dense::segmented_map doesn't have these peaks, and instead has
a smooth increase of memory usage. Note there are still sudden drops & increases in
memory because the indexing data structure still needs to increase by a fixed
factor. But due to holding the data in a separate container we are able to first free
the old data structure, and then allocate a new, bigger indexing structure; thus we
do not have peaks.
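
The free-then-allocate order described above can be sketched as follows. This is an illustration only, not the library's bucket layout: `tiny_index`, `empty_bucket`, the linear probing, and the use of `std::vector<int>` as the separate data container are all assumptions for the example. `rebuild` must be called with more buckets than there are values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <vector>

constexpr std::uint32_t empty_bucket = 0xFFFFFFFFu;

// Hypothetical sketch: the values live in `values`, while `buckets` only
// stores indices into it, so the index can be thrown away and rebuilt.
struct tiny_index {
    std::vector<int> values;                   // stand-in for the separate data container
    std::unique_ptr<std::uint32_t[]> buckets;  // the indexing structure
    std::size_t num_buckets = 0;

    // Precondition (assumed): new_num_buckets > values.size().
    void rebuild(std::size_t new_num_buckets) {
        buckets.reset();  // 1. free the old index first; the values are untouched
        buckets = std::make_unique<std::uint32_t[]>(new_num_buckets);  // 2. then allocate
        num_buckets = new_num_buckets;
        std::fill_n(buckets.get(), num_buckets, empty_bucket);
        for (std::uint32_t i = 0; i < values.size(); ++i) {
            std::size_t b = std::hash<int>{}(values[i]) % num_buckets;
            while (buckets[b] != empty_bucket) {
                b = (b + 1) % num_buckets;  // linear probing
            }
            buckets[b] = i;
        }
    }

    bool contains(int key) const {
        if (num_buckets == 0) {
            return false;
        }
        std::size_t b = std::hash<int>{}(key) % num_buckets;
        while (buckets[b] != empty_bucket) {
            if (values[buckets[b]] == key) {
                return true;
            }
            b = (b + 1) % num_buckets;
        }
        return false;
    }
};
```

Because the old and the new index never exist at the same time, resizing does not require holding both allocations at once. A table that stores the elements inside its buckets cannot do this, since it needs the old buckets as the source while filling the new ones.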

bump to 4.0.0

@martinus martinus force-pushed the 2022-10-stable-references-experiment branch 3 times, most recently from 1c939f5 to fb2c3ec Compare January 8, 2023 09:15
@martinus martinus force-pushed the 2022-10-stable-references-experiment branch 3 times, most recently from fa2c0d7 to b844618 Compare March 1, 2023 05:36

@bigerl bigerl left a comment

For your consideration: Supporting Fancy Pointers

I've had a look at the implementation with regard to supporting allocators that use fancy pointers. This is especially relevant for boost::interprocess::offset_ptr, which is used to work in shared memory and especially memory-mapped files.

I think the changes required to support allocators that use fancy pointers are minimal. I changed the relevant typedefs and function calls where I found them. The changes are untested and probably incomplete.

A simple way to test that it works with offset pointers is either to write a thin wrapper around std::allocator that uses Boost's offset pointers or to use Metall.

Comment on lines 583 to 587
using difference_type = std::ptrdiff_t;
using reference = T&;
using const_reference = T const&;
using pointer = T*;
using const_pointer = T const*;

Suggested change
- using difference_type = std::ptrdiff_t;
- using reference = T&;
- using const_reference = T const&;
- using pointer = T*;
- using const_pointer = T const*;
+ using difference_type = std::allocator_traits<allocator_type>::difference_type;
+ using reference = T&;
+ using const_reference = T const&;
+ using pointer = std::allocator_traits<allocator_type>::pointer;
+ using const_pointer = std::allocator_traits<allocator_type>::const_pointer;
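
For reference, a compilable sketch of typedefs derived from the allocator might look like the following. The struct name `typedef_demo` is hypothetical and only the member typedefs matter. Before C++20 the dependent names need a leading `typename`. With `std::allocator` the results are the same raw pointer types as before, which the `static_assert`s check.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>

// Hypothetical container skeleton; only the member typedefs are of interest.
template <typename T, typename Allocator = std::allocator<T>>
struct typedef_demo {
    using allocator_type = Allocator;
    using alloc_traits = std::allocator_traits<allocator_type>;

    // `typename` is required here because these are dependent names.
    using difference_type = typename alloc_traits::difference_type;
    using pointer = typename alloc_traits::pointer;
    using const_pointer = typename alloc_traits::const_pointer;
    using reference = T&;
    using const_reference = T const&;
};

// With the default allocator nothing changes compared to the hard-coded types.
static_assert(std::is_same<typedef_demo<int>::pointer, int*>::value, "raw pointer expected");
static_assert(std::is_same<typedef_demo<int>::const_pointer, int const*>::value, "raw const pointer expected");
static_assert(std::is_same<typedef_demo<int>::difference_type, std::ptrdiff_t>::value, "ptrdiff_t expected");
```

An allocator that defines its own `pointer` member (for example one based on an offset pointer) would change `pointer` and `const_pointer` accordingly, without any change to the container.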

increase_capacity();
}
auto* ptr = static_cast<void*>(&operator[](m_size));
auto& ref = *new (ptr) T(std::forward<Args>(args)...);

Suggested change
- auto& ref = *new (ptr) T(std::forward<Args>(args)...);
+ auto& ref = std::allocator_traits<allocator_type>::construct(get_allocator(), std::forward<Args>(args)...);
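
Note that `std::allocator_traits<A>::construct` takes the target address as its second argument and returns `void`, so the reference has to be obtained from the pointer afterwards. A minimal sketch under those rules follows. The helper names `construct_at_slot` and `construct_demo` are hypothetical, and converting a fancy pointer with `std::addressof(*slot)` is one common pre-C++20 idiom rather than the only option.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical helper: constructs an element in already-allocated storage
// through the allocator and returns a reference to it.
template <typename Alloc, typename... Args>
auto construct_at_slot(Alloc& alloc,
                       typename std::allocator_traits<Alloc>::pointer slot,
                       Args&&... args)
    -> typename std::allocator_traits<Alloc>::value_type& {
    // construct() expects a raw pointer, so a possibly fancy pointer is
    // converted first (C++20 also offers std::to_address for this).
    auto* raw = std::addressof(*slot);
    std::allocator_traits<Alloc>::construct(alloc, raw, std::forward<Args>(args)...);
    return *raw;  // construct() itself returns void
}

// Usage with the default allocator: allocate, construct, copy out, clean up.
inline std::string construct_demo() {
    std::allocator<std::string> alloc;
    using traits = std::allocator_traits<std::allocator<std::string>>;
    auto p = traits::allocate(alloc, 1);
    std::string& ref = construct_at_slot(alloc, p, "abc");
    std::string copy = ref;
    traits::destroy(alloc, p);
    traits::deallocate(alloc, p, 1);
    return copy;
}
```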

static constexpr auto num_elements_in_block = 1U << num_bits;
static constexpr auto mask = num_elements_in_block - 1U;

std::vector<T*, vec_alloc> m_blocks{};

Suggested change
- std::vector<T*, vec_alloc> m_blocks{};
+ vec_type<std::allocator_traits<T>::pointer, vec_alloc> m_blocks{};

std::vector does not work with allocators that require fancy pointers.

Suggestion: make the type of m_blocks configurable via a template parameter

*/
template <bool IsConst>
class iter_t {
using ptr_t = typename std::conditional_t<IsConst, T const* const*, T**>;

@bigerl bigerl Mar 6, 2023

Suggested change
- using ptr_t = typename std::conditional_t<IsConst, T const* const*, T**>;
+ using ptr_t = typename std::conditional_t<IsConst, std::pointer_traits<std::allocator_traits<T>::pointer>::template rebind<std::allocator_traits<T>::const_pointer> const, std::pointer_traits<std::allocator_traits<T>::pointer>::template rebind<std::allocator_traits<T>::pointer>>;
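
One possible way to spell the idea behind this suggestion so that it compiles is sketched below. It is an assumption about the intent, not the code that was merged: `block_ptr_types` is a hypothetical helper, and it takes the allocator (not `T`) as the argument to `std::allocator_traits`. `std::pointer_traits<P>::rebind<U>` yields the same pointer family pointing at `U`, so for raw pointers the result is exactly the original `T**` and `T const* const*`.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical helper that derives the "pointer to block pointer" type
// used by the iterator from the allocator's pointer type.
template <typename Allocator, bool IsConst>
struct block_ptr_types {
    using traits = std::allocator_traits<Allocator>;
    using elem_ptr = typename traits::pointer;              // T* or a fancy equivalent
    using const_elem_ptr = typename traits::const_pointer;  // T const* or a fancy equivalent

    // Rebind the pointer family so that it points at entries of the block table.
    using mutable_ptr_t =
        typename std::pointer_traits<elem_ptr>::template rebind<elem_ptr>;
    using const_ptr_t =
        typename std::pointer_traits<elem_ptr>::template rebind<const_elem_ptr const>;

    using ptr_t = typename std::conditional<IsConst, const_ptr_t, mutable_ptr_t>::type;
};

// With std::allocator the derived types match the hard-coded ones.
static_assert(std::is_same<block_ptr_types<std::allocator<int>, false>::ptr_t, int**>::value,
              "mutable iterator pointer");
static_assert(std::is_same<block_ptr_types<std::allocator<int>, true>::ptr_t, int const* const*>::value,
              "const iterator pointer");
```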

Comment on lines 487 to 505
using reference = typename std::conditional<IsConst, value_type const&, value_type&>::type;
using pointer = typename std::conditional<IsConst, value_type const*, value_type*>::type;
using iterator_category = std::forward_iterator_tag;

See line 587

@martinus
Owner Author

martinus commented Mar 7, 2023

Hi @bigerl! Is the segmented vector of interest to you? It has a much smoother allocation pattern than std::vector.

I'll add more tests to make sure it works with fancy pointers 👍

@bigerl

bigerl commented Mar 7, 2023

Hi @martinus. The map with segmented vector would be very interesting. A reference-stable hash map without the overhead of a node-based design sounds very promising. It makes it easy to store things there and still keep stable (offset-)pointers to them. That's something I have been looking for in some of my projects for quite some time.

In combination with good iteration speed (I would expect it to be similar to tsl::sparse_map with the segmented vector), fast lookups, a moderate memory footprint and fast insertions, it should have a very exposed spot along the Pareto front. ;)

@martinus
Owner Author

martinus commented Mar 7, 2023

Note that this implementation still won't have stable references, at least not on erase(). Insertion will be stable though.
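
To see why erase breaks references in a densely stored container, here is a small sketch. It assumes the common dense-container technique of filling the erased slot with the last element. `dense_erase` is a hypothetical free function written for this example, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical dense erase: overwrite the erased slot with the last element,
// then drop the last slot, so the storage stays contiguous without gaps.
template <typename Container>
void dense_erase(Container& c, std::size_t idx) {
    if (idx != c.size() - 1) {
        c[idx] = std::move(c.back());
    }
    c.pop_back();
}
```

Erasing index 1 from `{10, 20, 30, 40}` leaves `{10, 40, 30}`: a reference to the old slot of `40` now refers to a removed element, and a reference to the slot of `20` now refers to `40`. Appending, in contrast, never touches existing slots in a segmented layout.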

@bigerl

bigerl commented Mar 7, 2023

Ah, right. Just had a look into the code again. Tombstones would probably be the only option to maintain reference validity on erase. It comes with serious issues though: e.g., inserting entries 1...n and erasing entries 1...(n-1) would leave the map with (n-1) tombstones. Some cases could probably be handled via intelligent bookkeeping involving merging of segments with values in distinct positions and removal of empty segments.
It would make the code of the segmented vector complex and probably slower, and it would no longer be a drop-in replacement for std::vector. I guess it's out of the question then.

Having said that, even without stable references I would use it in shared memory. The fact that the memory footprint always stays close to linear in the number of entries is especially nice.

@martinus
Owner Author

martinus commented Mar 8, 2023

I could do stable references without tombstones, with an in-place freelist in the segmented_vector. This would give stable references without any lookup/insert performance penalty; erase would actually become much faster, and only iterating would become slower. That's also the problem: iterating would need a complete rewrite, and that's not really compatible with how it's currently done.
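
A rough sketch of the freelist idea follows, under several assumptions: `freelist_slots` is a hypothetical class, `std::deque` stands in for the segmented vector (both keep references valid when appending), and the freelist link is kept in a separate field for readability, whereas a real in-place freelist would overlay it on the unused payload.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

constexpr std::size_t no_free_slot = static_cast<std::size_t>(-1);

// Hypothetical slot store: erased slots are chained into a freelist and
// reused by later insertions, so live elements never move.
class freelist_slots {
    struct slot {
        int value = 0;              // payload while the slot is in use
        std::size_t next_free = 0;  // freelist link while the slot is unused
        bool used = false;
    };

    std::deque<slot> m_slots;  // stand-in for the segmented vector
    std::size_t m_free_head = no_free_slot;

public:
    std::size_t insert(int v) {
        std::size_t idx;
        if (m_free_head != no_free_slot) {
            idx = m_free_head;  // reuse the most recently erased slot
            m_free_head = m_slots[idx].next_free;
        } else {
            idx = m_slots.size();
            m_slots.emplace_back();
        }
        m_slots[idx].value = v;
        m_slots[idx].used = true;
        return idx;
    }

    // O(1): no element is moved, the slot is only linked into the freelist.
    void erase(std::size_t idx) {
        m_slots[idx].used = false;
        m_slots[idx].next_free = m_free_head;
        m_free_head = idx;
    }

    int& at(std::size_t idx) { return m_slots[idx].value; }

    // Iteration has to skip unused slots, which is the cost mentioned above.
    template <typename F>
    void for_each(F f) const {
        for (auto const& s : m_slots) {
            if (s.used) {
                f(s.value);
            }
        }
    }
};
```

The sketch shows both sides of the trade-off: `erase` no longer moves anything, but iteration can no longer walk a gap-free range and must test every slot.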

@martinus martinus force-pushed the 2022-10-stable-references-experiment branch from a9c4a71 to f336a9b Compare April 8, 2023 08:31
@martinus martinus changed the title from "Implements segmented_vector" to "Adds ankerl::unordered_dense{segmented_map, segmented_set}" Apr 8, 2023
@martinus martinus self-assigned this Apr 8, 2023
@martinus martinus marked this pull request as ready for review April 8, 2023 08:33
@martinus martinus force-pushed the 2022-10-stable-references-experiment branch from f336a9b to 3b51559 Compare April 8, 2023 08:34
@martinus martinus merged commit ec970e9 into main Apr 8, 2023
@martinus martinus deleted the 2022-10-stable-references-experiment branch April 8, 2023 08:43