-
Notifications
You must be signed in to change notification settings - Fork 112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Minimum alignment #241
Comments
FWIW, it isn't just the overhead of wasted space that is a potential downside here. Bumpalo also supports use cases where, if care is taken on the types of things allocated within the arena, you can read the allocated chunks out from the arena and use them directly for things like zero-copy serialization/deserialization. Inserting undefined padding bytes can break that use case. I would expect that, for scenarios where we are allocating a type of a particular, fixed layout (eg using That said, I suppose shaving off ~2 instructions from a sequence of 10-15 instructions could very well lead to a 12% speed up. I'm not necessarily opposed to something like this, but at the same time I don't want to proliferate cargo features endlessly. They have maintenance cost, especially with regards to testing and correctness (as you noted). One thing I'd like to see worked out before going too much deeper into this would be that story. |
Those are all good considerations. Though as I am sure you already know using min alignment does not preclude reading allocated chunks directly, it just means that if you do so all allocations need to meet the minimum alignment. For maintainability, I was thinking we could define a constant the specifies the minimum alignment. #[cfg(feature = "min_align")]
pub const MIN_ALIGN: usize = core::mem::align_of::<usize>();
#[cfg(not(feature = "min_align"))]
pub const MIN_ALIGN: usize = core::mem::align_of::<u8>(); the only spot in non-test code that we would change is in let ptr = ptr.wrapping_sub(layout.size());
let aligned_ptr = round_mut_ptr_down_to(ptr, layout.align()); to this: #[cfg(not(feature = "min_align"))]
let aligned_ptr = {
let ptr = ptr.wrapping_sub(layout.size());
round_mut_ptr_down_to(ptr, layout.align())
};
#[cfg(feature = "min_align")]
let aligned_ptr = {
let size = (layout.size() + MIN_ALIGN - 1) & !(MIN_ALIGN - 1);
let ptr = ptr.sub(size);
if layout.align() > MIN_ALIGN {
round_mut_ptr_down_to(ptr, layout.align())
} else {
ptr
}
}; Since the old behavior is equivalent to However this feature does rely on the compiler constant propagating layout. If I use fn alloc_layout<T: Default>(n: usize) {
let arena = bumpalo::Bump::with_capacity(n * std::mem::size_of::<T>());
for _ in 0..n {
let arena = black_box(&arena);
let layout = std::alloc::Layout::new::<T>();
let ptr = arena.alloc_layout(black_box(layout));
unsafe {ptr.as_ptr().write(Default::default())};
black_box(ptr);
}
} This is because the compiler is not able to inline |
I think this could benefit from being factored out into two different versions of a helper function that is marked
This sounds good to me. |
I was reading your blog post while looking through the code and noticed that it did not utilize the "minimum alignment trick" mentioned at the end. I wanted to see how much difference it made. I implemented it and run the benchmarks but didn't see any difference in the "small" and "big" ones. The small one obviously won't benefit from alignment and the big benchmark is dominated by the overhead of copying. However I created a medium benchmark with word sized values.
In this benchmark I saw 12% throughput improvement when using minimum alignment. There is a similar improvement for allocations up to 8 words in size (much larger than that and the copying overhead starts to dominate the benchmark).
This seems like a sizable improvement. But I also understand that maybe not everyone wants to waste the extra memory if they are doing small or odd-sized allocations. Would you be open to adding this as an optional feature? Updating the code is fairly easy, but the harder part would be updating the tests, since many of them (like quickcheck) hardcode the assumption that there is no min alignment. Let me know what you think. I would submit the PR if this was something you were interested in.
The text was updated successfully, but these errors were encountered: