Custom derive for deep_size_of #22
Comments
Thanks! It sounds like this would be almost exactly like the heapsize / heapsize_derive crates except that their Do you know of any other differences that would be required compared to heapsize? |
There are a couple of design options indeed! It's not clear what is the best behavior for It might be fun to add a |
Regarding the reference counted pointers, let's make that decision up front: for your use case would you prefer to have Arc and Rc accounted accurately at the expense of whatever performance it costs to maintain the context? |
I think accurate counts are much more valuable. |
How about three methods for:
Unfortunately, although it's a brilliant idea at first glance, a fake serializer wouldn't be wholly sufficient. Flattened types, skipped values, different enum tag representations, etc., can all give incorrect results. However, it may be close enough to still be worth investigating. Also, for reference-counted pointers, we could divide their size by the total number of strong references for a "portion shared" size? |
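To make the "portion shared" idea concrete, here is a minimal sketch for `Rc`. The helper name `portion_shared_size` is hypothetical, and the sketch assumes the payload owns no further heap memory and ignores the reference-count header and any allocator overhead.

```rust
use std::mem::size_of;
use std::rc::Rc;

/// Hypothetical helper: bytes of the shared payload attributed to one `Rc`
/// handle when the cost is split evenly across all strong references.
/// Assumption: `T` owns no heap memory of its own, and the refcount header
/// and allocator overhead are ignored.
fn portion_shared_size<T>(rc: &Rc<T>) -> usize {
    // Integer division, so the shares may not add up exactly to size_of::<T>().
    size_of::<T>() / Rc::strong_count(rc)
}

fn main() {
    let first = Rc::new([0u8; 1024]);
    println!("one owner:  {} bytes", portion_shared_size(&first));

    let second = Rc::clone(&first);
    println!("two owners: {} bytes each", portion_shared_size(&second));
}
```

One consequence of this design is that the reported size of a value changes whenever another handle is cloned or dropped elsewhere, so two measurements of the same value are not necessarily comparable.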
Hm, not sure: 3 shouldn't be significantly slower than 2 for any case, and it may be exponentially faster if there is a lot of sharing. 1 seems redundant with
Interesting idea! Will it always give an accurate count? |
I think 3 being slower is fine, like a "last resort" full scan operation to be as accurate as possible. That seems required if we can't hook into the allocator itself. 2 and 3 would be equivalent if there is no nesting, but 3 is required for nesting. As for the accuracy of the reference counts, it's as accurate as an atomic integer load for Perhaps it could be a crate feature as to whether it should count an |
The question about double counting

    struct Example<'a>(&'a u32, &'a u32);

    fn main() {
        let number = &42;
        let example = Example(number, number);
        let size = example.deep_size_of();
        // What would size be?
    }

Assuming 64-bit, should the size be 20 or 24? 20 makes the most sense (two 8-byte references plus one 4-byte u32 counted once), but that means that you would have to track every reference, as well as the decision for tracking I've been working on a basic implementation of the trait and default implementations, but I haven't implemented anything for the references. |
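The candidate answers to this question can be computed directly. The sketch below is only an illustration: `candidate_sizes` is a hypothetical helper, and it reports the struct alone, the struct plus each distinct target counted once, and the struct plus one target per reference.

```rust
use std::mem::{size_of, size_of_val};

struct Example<'a>(&'a u32, &'a u32);

/// Hypothetical helper returning three candidate totals, in bytes:
/// (references only, distinct targets counted once, one target per reference).
fn candidate_sizes(example: &Example<'_>) -> (usize, usize, usize) {
    // The struct itself: two references, each one pointer wide.
    let shallow = size_of_val(example);
    // Count the pointed-to u32 once if both fields alias the same address.
    let distinct_targets = if std::ptr::eq(example.0, example.1) { 1 } else { 2 };
    let deduplicated = shallow + distinct_targets * size_of::<u32>();
    // Naive walk: add the pointed-to u32 for every reference.
    let double_counted = shallow + 2 * size_of::<u32>();
    (shallow, deduplicated, double_counted)
}

fn main() {
    let number = &42;
    let example = Example(number, number);
    let (shallow, deduplicated, double_counted) = candidate_sizes(&example);
    println!("references only:     {} bytes", shallow);
    println!("target counted once: {} bytes", deduplicated);
    println!("target per field:    {} bytes", double_counted);
}
```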
I think we can only ever hope to count owned or partially owned data, so in that case only the size of the reference/pointer itself should be counted. References or pointers can use the stack, heap, or even other esoteric things, which are all presumably handled elsewhere. A good example of this is with |
It might be good to have two options (maybe a compile time feature), one for only counting owned data ( On the implementation side, one way of handling the simpler "Reference" cases would be to store a list of raw pointers to the inner data of each object, and to compare any new "reference" object with that list to see if it has already been counted. There may be some safety issues with getting the raw pointers, but that would take care of double counting in most cases (excluding self referential structs). It would count any partially owned object exactly once, which (I think) would make intuitive sense in most cases. |
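The pointer-list approach described here could be sketched roughly as follows for `Arc`. The `Context` type and the `arc_heap_size` function are hypothetical names, a `HashSet` of addresses stands in for the list of raw pointers, and the payload is assumed to own no further heap memory. Since `Arc::as_ptr` is a safe call and the pointer is only compared, never dereferenced, this version needs no `unsafe`.

```rust
use std::collections::HashSet;
use std::mem::size_of;
use std::sync::Arc;

/// Hypothetical context remembering which shared allocations were counted.
#[derive(Default)]
struct Context {
    seen: HashSet<usize>,
}

impl Context {
    /// Returns true only the first time a given address is visited.
    fn first_visit<T>(&mut self, ptr: *const T) -> bool {
        self.seen.insert(ptr as usize)
    }
}

/// Heap bytes behind an `Arc<T>`, counted once per allocation.
/// Assumption: `T` owns no heap memory itself; the refcount header is ignored.
fn arc_heap_size<T>(arc: &Arc<T>, ctx: &mut Context) -> usize {
    if ctx.first_visit(Arc::as_ptr(arc)) {
        size_of::<T>()
    } else {
        0
    }
}

fn main() {
    let shared = Arc::new([0u64; 16]);
    let handles = vec![Arc::clone(&shared), Arc::clone(&shared), shared];

    let mut ctx = Context::default();
    let total: usize = handles.iter().map(|h| arc_heap_size(h, &mut ctx)).sum();
    println!("three handles, one allocation: {} heap bytes", total);
}
```

The cost of this accuracy is that the context has to be threaded through every call and grows with the number of shared allocations visited.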
My current work is here, but I'm not sure if it's currently good enough to publish to crates.io / add to this list. This is my first "real" crate, so if anyone has code style suggestions or other comments, please tell me. |
Honestly you could probably make jemalloc an optional dep of heapsize and just use that. Or fork it. An interesting addition would be something that helps construct a nested tree representation for reporting. |
jemalloc isn't an option on some targets, most notably MSVC. I may work on this myself soon. The idea of a fake serializer wouldn't work, but it inspired an idea of arbitrary "Counter" implementations to handle the different considerations described here, such as references. Perhaps even one that does use jemalloc on targets that support it. Additionally, this absolutely needs a serde-like attribute system for deep-counting foreign types. |
Can someone read through my implementation to see if anything is obviously wrong? I've finished up the basic implementation, and as far as I can tell it works in most cases. I know that there are still some things to add, but it should work as a basic framework. https://github.com/Aeledfyr/deepsize Currently it has special cases for each type of allocated type, which makes it so that it doesn't have to use jemalloc, but that opens up opportunities for incorrect guesses. If anyone knows a way of accurately verifying the byte counts, that would be quite useful for the testing. Also, does the structure of the trait work? It would be good to reduce the number of methods, but you would at least need |
Does anyone have preferences on which standard library types this is implemented for? Also, how would I add integration for some of the dynamic storage crates (such as slotmap), without making it a dependency (only using the impl if the crate is already being used)? I'm not sure if that can actually be done, or how it is normally handled. |
I filed some issues on your issue tracker and @matklad will do the same if they have feedback. I think we are ready to close out this issue. Thanks and nicely done! |
It might be useful to understand how many bytes in memory a particular data structure occupies.
If it doesn't have any heap-allocated storage, figuring out the answer is easy: it's just `std::mem::size_of`. However, for data structures which include vectors, maps, boxes, etc., `size_of` will account only for the (comparatively insignificant) stack part of the storage.
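A short illustration of that limitation: the shallow size of a `Vec` is the same whether it holds nothing or a megabyte. The `shallow_size` wrapper is a hypothetical name used only for this sketch.

```rust
use std::mem::size_of_val;

/// Hypothetical wrapper: the size of the value itself, excluding anything
/// it points to on the heap.
fn shallow_size<T>(value: &T) -> usize {
    size_of_val(value)
}

fn main() {
    let empty: Vec<u8> = Vec::new();
    let large: Vec<u8> = vec![0; 1_000_000];

    // Both report only the Vec header; the heap buffer is invisible here.
    println!("empty: {} bytes", shallow_size(&empty));
    println!("large: {} bytes", shallow_size(&large));
}
```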
One approach to the problem is servo/heapsize, which communicates with the allocator to learn about which blocks of memory are allocated. This is a relatively precise approach, because it accounts for the allocator's own overheads, but it requires the use of jemalloc.
A potentially more robust but less precise approach is to walk the data structure recursively (that is, iterate over all the vectors, hash maps, etc.) and sum `size_of`. It looks like this can be done with a proc macro which would custom-derive the following trait:
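The recursive walk could be captured by a trait along these lines. This is a minimal sketch under stated assumptions, not the trait as originally proposed: the names `DeepSizeOf` and `deep_size_of_children` are hypothetical, allocator overhead is ignored, and the impl for `Record` is hand-written to show roughly what a derive macro might generate.

```rust
use std::mem::{size_of, size_of_val};

/// Hypothetical trait: a type reports the heap bytes it owns, and the
/// default method adds the size of the value itself.
trait DeepSizeOf {
    /// Heap bytes owned by this value, not counting the value itself.
    fn deep_size_of_children(&self) -> usize;

    /// Size of the value itself plus everything it owns on the heap.
    fn deep_size_of(&self) -> usize {
        size_of_val(self) + self.deep_size_of_children()
    }
}

impl DeepSizeOf for u32 {
    fn deep_size_of_children(&self) -> usize {
        0 // plain data, nothing on the heap
    }
}

impl DeepSizeOf for String {
    fn deep_size_of_children(&self) -> usize {
        self.capacity() // one byte per reserved slot
    }
}

impl<T: DeepSizeOf> DeepSizeOf for Vec<T> {
    fn deep_size_of_children(&self) -> usize {
        // The buffer holds `capacity` slots; each element may own more heap.
        let buffer = self.capacity() * size_of::<T>();
        let nested: usize = self.iter().map(|item| item.deep_size_of_children()).sum();
        buffer + nested
    }
}

impl<T: DeepSizeOf> DeepSizeOf for Box<T> {
    fn deep_size_of_children(&self) -> usize {
        size_of::<T>() + (**self).deep_size_of_children()
    }
}

struct Record {
    id: u32,
    tags: Vec<String>,
    payload: Box<u32>,
}

// Roughly what a derive could expand to: sum over all fields.
impl DeepSizeOf for Record {
    fn deep_size_of_children(&self) -> usize {
        self.id.deep_size_of_children()
            + self.tags.deep_size_of_children()
            + self.payload.deep_size_of_children()
    }
}

fn main() {
    let record = Record {
        id: 7,
        tags: vec!["a".to_string(), "bc".to_string()],
        payload: Box::new(1),
    };
    println!("shallow: {} bytes", size_of_val(&record));
    println!("deep:    {} bytes", record.deep_size_of());
}
```

Splitting the trait into a "children" method and a defaulted total keeps each impl short, since every type only has to describe the heap memory it owns.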