Performance comparisons (mimalloc, malloc, jemalloc) + cpu/memory Profiling #29

Closed
6 of 11 tasks
m4drat opened this issue Jun 12, 2020 · 1 comment

m4drat commented Jun 12, 2020

Specifically, use valgrind to measure cache misses.

Implement performance comparison with these memory allocators:

  • gcpp (only speed gain after compacting)
  • mimalloc (only allocation/deallocation speed)
  • rpmalloc (only allocation/deallocation speed)
  • jemalloc (only allocation/deallocation speed)
  • hoard (only allocation/deallocation speed)
  • supermalloc (only allocation/deallocation speed)
  • ptmalloc3 (only allocation/deallocation speed)
  • ptmalloc2 (only allocation/deallocation speed) - use latest libc
  • mem++ (speed gain after compacting + allocation/deallocation speed)
  • Mesh (only allocation/deallocation speed)
  • tcmalloc (only allocation/deallocation speed)
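A minimal sketch of how the "allocation/deallocation speed" part could be timed. The helper name `time_malloc_free` and the `AllocTimings` struct are hypothetical and not part of any of the allocators listed above; the sketch simply times whatever `malloc`/`free` the process is linked against, using `std::chrono::steady_clock`.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Average cost of one operation, in nanoseconds (hypothetical result type).
struct AllocTimings {
    double alloc_ns_per_op;
    double free_ns_per_op;
};

// Hypothetical helper: performs `count` allocations of `size` bytes,
// then frees them all, timing the two phases separately.
AllocTimings time_malloc_free(std::size_t count, std::size_t size) {
    assert(count > 0 && size > 0);
    using clock = std::chrono::steady_clock;

    // Pre-size the pointer table so its own allocation is not timed.
    std::vector<void*> blocks(count, nullptr);

    const auto t0 = clock::now();
    for (std::size_t i = 0; i < count; ++i) {
        blocks[i] = std::malloc(size);
        // Touch the first byte so the allocation cannot be optimized away.
        if (blocks[i] != nullptr) {
            static_cast<volatile char*>(blocks[i])[0] = 1;
        }
    }
    const auto t1 = clock::now();
    for (std::size_t i = 0; i < count; ++i) {
        std::free(blocks[i]);  // free(nullptr) is a no-op
    }
    const auto t2 = clock::now();

    const std::chrono::duration<double, std::nano> alloc_ns = t1 - t0;
    const std::chrono::duration<double, std::nano> free_ns = t2 - t1;
    const double n = static_cast<double>(count);
    return {alloc_ns.count() / n, free_ns.count() / n};
}
```

Calling `time_malloc_free(1000000, 64)` from `main` and printing both fields gives one data point. For allocators that act as drop-in `malloc` replacements, the same binary can usually be rerun with the allocator preloaded, so the benchmark code itself stays unchanged. A single fixed-size loop like this is only a starting point; mixed sizes and interleaved frees would be needed for a fair comparison.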

Useful links:

  1. rpmalloc-benchmark
  2. how-can-i-profile-c-code-running-on-linux
  3. easy_profiler
  4. prof
  5. VISUAL BENCHMARKING in C++ (how to measure performance visually)
  6. Google benchmark
  7. "Performance Matters" by Emery Berger
  8. Coz: Finding Code that Counts with Causal Profiling
  9. P2329-move_at_scale
  10. CppCon 2015: Chandler Carruth "Tuning C++: Benchmarks, and CPUs, and Compilers! Oh My!"
  11. Intel VTune
  12. github-gprof2dot
  13. How to benchmark correctly: llvm

Don't forget to use clang stabilizer!

Add PROFILING definition and option to build
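One possible shape for the PROFILING definition, as a sketch only. It assumes the build passes `-DPROFILING` when the option is enabled; `ScopedTimer`, `PROFILE_SCOPE` and `sum_to` are hypothetical names used for illustration, not existing project code.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>

#ifdef PROFILING
// Hypothetical scoped timer: reports how long the enclosing scope took.
class ScopedTimer {
public:
    explicit ScopedTimer(const char* name)
        : name_(name), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto elapsed = std::chrono::steady_clock::now() - start_;
        const auto us =
            std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
        std::fprintf(stderr, "[profile] %s: %lld us\n", name_,
                     static_cast<long long>(us));
    }

private:
    const char* name_;
    std::chrono::steady_clock::time_point start_;
};
#define PROFILE_SCOPE(name) ScopedTimer profile_scope_timer_(name)
#else
// Without PROFILING the macro expands to nothing measurable.
#define PROFILE_SCOPE(name) ((void)0)
#endif

// Example function instrumented with the macro (hypothetical workload).
int sum_to(int n) {
    assert(n >= 0);
    PROFILE_SCOPE("sum_to");
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}
```

With this layout the instrumentation costs nothing in regular builds, and the profiling build only differs by one compiler definition.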

#!/bin/bash

# build the program (no special flags are needed)
g++ -std=c++11 cpuload.cpp -o cpuload

# run the program with callgrind; by default the output goes to callgrind.out.<pid>,
# so name the output file explicitly to match the kcachegrind step below
valgrind --tool=callgrind --callgrind-out-file=profile.callgrind ./cpuload

# open profile.callgrind with kcachegrind
kcachegrind profile.callgrind

#!/bin/bash

# build the program; for our demo program, we specify -DWITHGPERFTOOLS to enable the gperftools-specific #ifdefs
# (-lprofiler goes after the source file so the linker does not drop the library)
g++ -std=c++11 -DWITHGPERFTOOLS -g ../cpuload.cpp -o cpuload -lprofiler

# run the program; generates the profiling data file (profile.log in our example)
./cpuload

# convert profile.log to callgrind compatible format
pprof --callgrind ./cpuload profile.log > profile.callgrind

# open profile.callgrind with kcachegrind
kcachegrind profile.callgrind

m4drat added the Medium label Jun 12, 2020
m4drat self-assigned this Jul 31, 2020

m4drat commented May 28, 2022

What we should test:

  1. A project that makes heavy use of data structures and allows plugging in a custom allocator interface
  2. Allocation speed, deallocation speed, and GC speed (if present)
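For point 1, a rough sketch of what "custom allocator interface" could mean in practice: a minimal C++11 allocator plugged into a standard container. `CountingAllocator`, `AllocStats`, `alloc_stats` and `run_list_workload` are hypothetical names; the allocator forwards to `::operator new` as a placeholder, and the allocator under test would replace those two calls.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>

// Hypothetical counters shared by all CountingAllocator instances.
struct AllocStats {
    std::size_t allocations = 0;
    std::size_t deallocations = 0;
    std::size_t bytes = 0;
};

inline AllocStats& alloc_stats() {
    static AllocStats stats;
    return stats;
}

// Minimal C++11 allocator: counts calls, then forwards to operator new/delete.
template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() noexcept = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        alloc_stats().allocations += 1;
        alloc_stats().bytes += n * sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }

    void deallocate(T* p, std::size_t) noexcept {
        alloc_stats().deallocations += 1;
        ::operator delete(p);
    }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return true;
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return false;
}

// Node-based workload: every push_back goes through the allocator.
std::size_t run_list_workload(std::size_t n) {
    std::list<int, CountingAllocator<int>> values;
    for (std::size_t i = 0; i < n; ++i) {
        values.push_back(static_cast<int>(i));
    }
    return values.size();
}
```

A node-based container such as `std::list` is used here because it allocates once per element, which stresses the allocator far more than a `std::vector` would. Wrapping `run_list_workload` in a timer then gives allocation plus deallocation cost for one data-structure-heavy pattern.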
