process: Deduplicate paths in parseMappings()#403
process: Deduplicate paths in parseMappings()#403rockdaboot merged 1 commit intoopen-telemetry:mainfrom
Conversation
athre0z
left a comment
There was a problem hiding this comment.
Ah nice, exploiting the ordering is a good idea. LGTM! :)
| if path == lastPath { | ||
| // Take advantage of the fact that mappings are sorted by path | ||
| // and avoid allocating the same string multiple times. | ||
| path = lastPath | ||
| } else { |
There was a problem hiding this comment.
Can we skip the assignment of path = lastPath if path == lastPath?
| if path == lastPath { | |
| // Take advantage of the fact that mappings are sorted by path | |
| // and avoid allocating the same string multiple times. | |
| path = lastPath | |
| } else { | |
| if path != lastPath { |
There was a problem hiding this comment.
Can we skip the assignment of path = lastPath if path == lastPath?
No, this is exactly where the de-duplication happens.
path == lastPath compares the string content, and here both contents are at different memory locations. So if the contents are the same, the path = lastPath changes the memory location of path to the already existing lastPath.
|
Taking the idea further. Would it make sense to intern the strings globally? E.g. libc is mapped to practically all processes. So have a global |
Since Go 1.23 we have the unique package, but it comes with downsides (e.g. how to clean up?). An LRU Set makes more sense to me, as it cleans up automatically and the GC keeps track of strings that have been evicted but are still in use. |
We can tackle this in a follow-up PR. Created an issue for a global de-deplication of strings to not forget about it: #404 |
Reduce number of allocations significantly in
parseMappings(). It also has some positive effects on CPU usage.The mappings are from a locally running Firefox. The benchmark code is:
Just for reproducibility:
Firefox mappings