- mem++
C++ memory allocator with smart features, such as garbage collection, and heap compacting.
Current library version: 0.3.5
Warning: the API is not stable!
- Garbage Collection
- Fast memory allocations (using bump allocators techniques)
- Fast free algorithms
- Advanced compacting algorithms, which are used to reduce memory fragmentation and improve cache locality: heuristic-layouting
- The library is still in the development. It's definitely not recommended to use in production!
- All Unix-like systems (where it is possible to use
mmap
) - g++ or clang++ compilers
- Currently supports only single-threaded applications
- References/Pointers to GcPtr's are always invalidated after
GarbageCollect()
- Install latest build systems:
apt install cmake g++ clang
- Clone just the library:
git clone https://github.com/m4drat/memplusplus/
- Clone the library (with tests support):
git clone --recurse-submodules=./libraries/googletest https://github.com/m4drat/memplusplus/
- If you want to run benchmarks, take a look at this: link
-
run cmake from the root directory (memplusplus):
cmake \ -S . \ -B build \ -DCMAKE_BUILD_TYPE=Release \ -DMPP_BUILD_SHARED_LIBS=ON \
-
compile and install:
sudo cmake \ --build build \ --target install \ -j 8
-
After the library is installed, in your CMakeLists.txt use find_package, to actually use the library:
cmake_minimum_required(VERSION 3.13) project(<YOUR_PROJECT_NAME>) find_package(mpp 0.3.5 REQUIRED) add_executable(<YOUR_PROJECT_NAME> <YOUR_SOURCES>) target_link_libraries(<YOUR_PROJECT_NAME> PRIVATE mpp::mpp)
-
After that you will be able to include library headers in your sources like that:
#include <mpplib/mpp.hpp>
-
just copy the whole directory with libmemplusplus to your project root directory.
-
In your project's cmake file add this:
add_subdirectory(libmemplusplus) target_link_libraries(<YOUR_PROJECT_NAME> PRIVATE mpp::mpp)
-
After that you will be able to include library headers in your sources like that:
#include "mpplib/mpp.hpp"
Global options:
MPP_ENABLE_COVERAGE
- build with code coverage supportMPP_BUILD_FUZZER
- build fuzzer project (will build the library with sanitizers)MPP_BUILD_EXAMPLE
- build example projectMPP_BUILD_TESTS
- build testsMPP_BUILD_DOCS
- build documentation
Library options:
MPP_BUILD_SHARED_LIBS
- build shared or static librariesMPP_FULL_DEBUG
- build in full debug mode (adds extended security checks in debug build)MPP_SECURE
- build in secure mode with additional security featuresMPP_PROFILE
- enable profiling instrumentationMPP_SANITIZERS
- add sanitizers to the buildMPP_VALGRIND
- adds support for valgrindMPP_COLOUR_DEBUG_OUTPUT
- Add colors to debug outputMPP_STATS
- Add statistics instrumentation.MPP_ENABLE_LOGGING
- Enable logging (even in release mode)
-
MPP_DUMP_OBJECTS_GRAPH=1
/MPP_DUMP_OBJECTS_GRAPH=2
- dump objects graph to fileobjects.dot
, while performingCollectGarbage()
(only possible in debug mode) -
MPP_SHOW_STATISTICS=1
- display statistics after program termination (should be built withMPP_STATS
set to ON) -
MPP_LOG_LEVEL
- set log level (default:FATAL
). Supported values:TRACE
,DEBUG
,INFO
,WARNING
,ERROR
,FATAL
,DISABLED
-
Automatic memory management
#include <mpplib/mpp.hpp> ... // create smart pointer, that will automatically deallocate object, when needed mpp::MakeShared<Object> object = mpp::MakeShared<Object>(<constructor params>); // create array of Objects, that will be automatically deallocated, when goes out of scope mpp::MakeShared<Object[]> objects = mpp::MakeSharedN<Object>(<array size>, <constructor params>); ... // collect all garbage + compact memory (should be called manually) mpp::CollectGarbage();
-
Manual memory management- deprecated#include <mpplib/mpp.hpp> ... // will call constructor automatically, like new Object* obj = mpp::Allocate<Object>(<constructor params>); // create raw pointer (behaves like malloc) void* ptr = mpp::Allocate(128); ... // will call destructor automatically, like delete mpp::Deallocate(obj); // only deallocates raw pointer (behaves like free) mpp::Deallocate(ptr);
Memplusplus provides different debug-like features, such as data visualizers, profilers, and statistics collectors.
-
Backtrace functionality for MPP_ASSERT:
Add these flags to your project's CMakeLists.txt, if you want to see a backtrace when
MPP_ASSERT
fails:# For GCC target_compile_options(${PROJECT_NAME} PRIVATE -g -O0) target_link_libraries(${PROJECT_NAME} PRIVATE mpp::mpp -export-dynamic) # For Clang target_compile_options(${PROJECT_NAME} PRIVATE -g -O0) target_link_libraries(${PROJECT_NAME} PRIVATE mpp::mpp -Wl,--export-dynamic)
-
Address Sanitizer support (ASAN):
Enable
MPP_SANITIZERS
before building the library. Then compile your project with-fsanitize=address
flag. As a result, you will get asan-compatible build which will help you to debug any memory management issues.Example buggy code:
void* p1 = mpp::Allocate(128); mpp::Deallocate(p1); *(uint32_t*)p1 = 0x13371337; // Use-after-free write
Example ASAN report for this code:
================================================================= ==26738==ERROR: AddressSanitizer: use-after-poison on address 0x19b297400010 at pc 0x557f3c1aaea3 bp 0x7ffcb838c440 sp 0x7ffcb838c438 WRITE of size 4 at 0x19b297400010 thread T0 #0 0x557f3c1aaea2 in main /home/madrat/memplusplus/build/../example_project/src/main.cpp:137:20 #1 0x7f16c401a0b2 in __libc_start_main /build/glibc-eX1tMB/glibc-2.31/csu/../csu/libc-start.c:308:16 #2 0x557f3c0e7d2d in _start (/home/madrat/memplusplus/build/example_project/example_project-d+0x45d2d) (BuildId: cca1c1f77a22aeae021502831561465b63cc9f19) Address 0x19b297400010 is a wild pointer inside of access range of size 0x000000000004. SUMMARY: AddressSanitizer: use-after-poison /home/madrat/memplusplus/build/../example_project/src/main.cpp:137:20 in main ......
-
Valgrind support:
Enable
MPP_VALGRIND
before building the library. Then compile your project with-g
flag. As a result, you will get valgrind-compatible build which will help you to track-down any memory leaks. You also might want to disableMPP_SANITIZERS/MPP_BUILD_FUZZER
and build usingg++
.Example buggy code:
void leak() { void* p1 = mpp::Allocate(128); // mpp::Deallocate(p1); }
Example valgrind report for this code:
==21304== HEAP SUMMARY: ==21304== in use at exit: 160 bytes in 1 blocks ==21304== total heap usage: 24 allocs, 23 frees, 82,582 bytes allocated ==21304== ==21304== Searching for pointers to 1 not-freed blocks ==21304== Checked 150,048 bytes ==21304== ==21304== 160 bytes in 1 blocks are definitely lost in loss record 1 of 1 ==21304== at 0x151812: mpp::MemoryManager::Allocate(unsigned long) (memory_manager.cpp:181) ==21304== by 0x152A33: mpp::Allocate(unsigned long) (memory_manager.cpp:352) ==21304== by 0x150238: leak() (main.cpp:131) ==21304== by 0x15024C: main (main.cpp:144) ==21304== ==21304== LEAK SUMMARY: ==21304== definitely lost: 160 bytes in 1 blocks ==21304== indirectly lost: 0 bytes in 0 blocks ==21304== possibly lost: 0 bytes in 0 blocks ==21304== still reachable: 0 bytes in 0 blocks ==21304== suppressed: 0 bytes in 0 blocks ==21304== ==21304== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
-
Data visualizers:
In Debug builds, you can dump .dot representation of the ChunkTreap and GcGraph (only using specific environment variable). Later you can render these .dot files using dot from graphviz.
ChunkTreap visualizer
How to dump ChunkTreap in code (don't forget to redirect program output to file treap.dot):
// 1. Get the arena you want to dump the tree for std::unique_ptr<Arena>& tmpArena = mpp::GetArenaList()[0]; // 2. Extract ChunkTreap from this arena, and dump it // (in this example everything was dumped to std::cout). tmpArena->GetFreedChunks().GenerateGraphvizLayout(std::cout);
How to generate an .svg file with dot:
dot -Tsvg treap.dot -o treap.svg
After that you will get .svg file with your freed chunks treap:
GcGraph visualizer
To visualization the GcGraph you have to:
-
Build the library in debug mode, and set
MPP_DUMP_OBJECTS_GRAPH=1
(basic visualization) orMPP_DUMP_OBJECTS_GRAPH=2
(advanced visualization) before running the target app. On each GC cycle, it will dump the objects graph to file "objects_cycle<current cycle number>.dot". -
Then generate an .svg file using dot as follows:
dot -Tsvg objects_cycle<N>.dot -o objects_cycle<N>.svg
For example, for this code (it creates a linked list and tree):
// Linked List node struct Node { uint32_t data; SharedGcPtr<Node> prev; SharedGcPtr<Node> next; Node(uint32_t t_data, SharedGcPtr<Node> t_p, SharedGcPtr<Node> t_n) : data{t_data}, prev{t_p}, next{t_n} { } }; // Create Linked List SharedGcPtr<Node> n1 = MakeShared<Node>(1, nullptr, nullptr); SharedGcPtr<Node> n2 = MakeShared<Node>(2, nullptr, nullptr); SharedGcPtr<Node> n3 = MakeShared<Node>(3, nullptr, nullptr); SharedGcPtr<Node> n4 = MakeShared<Node>(4, nullptr, nullptr); n1->prev = nullptr; n1->next = n2; n2->prev = n1; n2->next = n3; n3->prev = n2; n3->next = n4; n4->prev = n3; n4->next = nullptr; // Tree node struct TreeNode { uint32_t data; SharedGcPtr<TreeNode> left; SharedGcPtr<TreeNode> right; SharedGcPtr<TreeNode> up; TreeNode(uint32_t t_data, SharedGcPtr<TreeNode> t_left, SharedGcPtr<TreeNode> t_right, SharedGcPtr<TreeNode> t_up) : data{t_data}, left{t_left}, right{t_right}, up{t_up} { } }; // Create a random tree SharedGcPtr<TreeNode> root = MakeShared<TreeNode>(0, nullptr, nullptr, nullptr); SharedGcPtr<TreeNode> treeNode1 = MakeShared<TreeNode>(1, nullptr, nullptr, nullptr); SharedGcPtr<TreeNode> treeNode2 = MakeShared<TreeNode>(2, nullptr, nullptr, nullptr); SharedGcPtr<TreeNode> treeNode3 = MakeShared<TreeNode>(3, nullptr, nullptr, nullptr); SharedGcPtr<TreeNode> treeNode4 = MakeShared<TreeNode>(4, nullptr, nullptr, nullptr); SharedGcPtr<TreeNode> treeNode5 = MakeShared<TreeNode>(5, nullptr, nullptr, nullptr); root->up = nullptr; root->left = treeNode1; root->right = treeNode2; treeNode1->up = root; treeNode1->left = nullptr; treeNode1->right = nullptr; treeNode2->up = root; treeNode2->left = treeNode3; treeNode2->right = treeNode4; treeNode3->up = treeNode2; treeNode3->left = nullptr; treeNode3->right = nullptr; treeNode4->up = treeNode2; treeNode4->left = treeNode5; treeNode4->right = nullptr; treeNode5->up = treeNode4; treeNode5->left = nullptr; treeNode5->right = nullptr; CollectGarbage();
Executing this code with
MPP_DUMP_OBJECTS_GRAPH=1
will generate this graph:Executing this code with
MPP_DUMP_OBJECTS_GRAPH=2
will generate this graph: -
-
On-demand objects graph visualizer
To create object graph at the arbitrary point in your program, add this code:
mpp::g_memoryManager->GetGC().DumpCurrentObjectsGraph();
You might want to specify dump type (by default it's
ADVANCED
) and the filename to save.dot
graph to (default: "objects-graph.dot"). -
Profiler
Using the compilation flag
MPP_PROFILE
, you can build the library with the profiler enabled. To do so:- Set the
MPP_PROFILE
before building the library - Run the application until it exits
- In your current directory, find the file called mpplib-profiling.json
- Open this file using the chrome tracing tool. To do so, open chrome and navigate to the
chrome://tracing
page.
Example output:
- Set the
-
Statistics collector
This feature allows you to dump memory allocator statistics. To build the library with this feature, add
MPP_STATS
while building the project. You can use this feature in 2 ways:In runtime using c++ API:
Example code to dump statistics:
utils::Statistics::GetInstance().DumpStats(std::cout, true, false, false) << std::endl;
Example output:
+============= STATISTICS DUMP START =============+ MPP: MIN_CHUNK_SIZE : 32 bytes MPP: CHUNK_HEADER_SIZE : 16 bytes MPP: DEFAULT_ARENA_SIZE : 32.000 MB MPP: PAGE_SIZE : 4.000 KB ~~~~~~~~~~~~~~~~ All active arenas ~~~~~~~~~~~~~~~~ -------------- Arena: 0x1ffaf50 -------------- MPP - Total arena size : 32.000 MB MPP - Currently Allocated Space : 960.000 Bytes MPP - Amount of free memory : 31.999 MB (99.9971% of total arena size) MPP - Amount of allocated memory : 960.000 Bytes (0.0029% of total arena size) MPP - Allocated memory == 0 : false MPP - Memory used for metadata : 160.000 Bytes (16.6667% of currently allocated space) MPP - TopChunk : [0x7f329dd803c0](96, 33553472|InUse:1|PrevInUse:1) MPP - Arena begin pointer : 0x7f329dd80000 MPP - Arena end pointer : 0x7f329fd80000 MPP - Freed chunks nodes (0) MPP - Chunks in use (10) ~~~~~~~~~~~~~~~~~~ General stats ~~~~~~~~~~~~~~~~~~ ----------------- Arena: 1 ----------------- MPP - Total amount of allocated memory inside arena : 960.000 Bytes MPP - Total amount of freed memory inside arena : 0 bytes MPP - total freed == total allocated : false MPP - Biggest allocation size : 96.000 Bytes MPP - Smallest allocation size : 96.000 Bytes MPP - Full size of arena : 32.000 MB MPP - Arena was allocated for big chunk : false MPP - Arena was allocated by GC : false ----------------- Arena: 2 ----------------- MPP - Total amount of allocated memory inside arena : 960.000 Bytes MPP - Total amount of freed memory inside arena : 0 bytes MPP - total freed == total allocated : false MPP - Biggest allocation size : "No allocations have been made yet" MPP - Smallest allocation size : "No allocations have been made yet" MPP - Full size of arena : 32.000 MB MPP - Arena was allocated for big chunk : false MPP - Arena was allocated by GC : true ~~~~~~~~~~~~~ Garbage Collector stats ~~~~~~~~~~~~~ ----------------- Cycle: 1 ----------------- GC - Time wasted inside GC::Collect() : 2.0000 ms GC - Memory cleaned after GC::Collect() : 0 bytes GC - Total size of active objects : 960.000 Bytes +============== STATISTICS DUMP END ==============+
After program termination, setting environment variable
MPP_SHOW_STATISTICS=1
. Important note. Runtime stats cannot be dumped using this approach.Example output:
+============= STATISTICS DUMP START =============+ MPP: MIN_CHUNK_SIZE : 32 bytes MPP: CHUNK_HEADER_SIZE : 16 bytes MPP: DEFAULT_ARENA_SIZE : 32.000 MB MPP: PAGE_SIZE : 4.000 KB ~~~~~~~~~~~~~~~~~~ General stats ~~~~~~~~~~~~~~~~~~ ----------------- Arena: 1 ----------------- MPP - Total amount of allocated memory inside arena : 960.000 Bytes MPP - Total amount of freed memory inside arena : 0 bytes MPP - total freed == total allocated : false MPP - Biggest allocation size : 96.000 Bytes MPP - Smallest allocation size : 96.000 Bytes MPP - Full size of arena : 32.000 MB MPP - Arena was allocated for big chunk : false MPP - Arena was allocated by GC : false ----------------- Arena: 2 ----------------- MPP - Total amount of allocated memory inside arena : 960.000 Bytes MPP - Total amount of freed memory inside arena : 0 bytes MPP - total freed == total allocated : false MPP - Biggest allocation size : "No allocations have been made yet" MPP - Smallest allocation size : "No allocations have been made yet" MPP - Full size of arena : 32.000 MB MPP - Arena was allocated for big chunk : false MPP - Arena was allocated by GC : true ~~~~~~~~~~~~~ Garbage Collector stats ~~~~~~~~~~~~~ ----------------- Cycle: 1 ----------------- GC - Time wasted inside GC::Collect() : 2 ms GC - Memory cleaned after GC::Collect() : 0 bytes GC - Total size of active objects : 960.000 Bytes +============== STATISTICS DUMP END ==============+
-
Allocate/Deallocate hooks
This feature allows you to install your hooks on Allocate and Deallocate. Example:
// Lambda as an Allocate hook. Important, all hooks should be static! static std::function<void*(std::size_t)> allocHook = [&](std::size_t t_AllocSize) { mpp::SetAllocateHook(nullptr); void* ptr = mpp::Allocate(t_AllocSize); mpp::SetAllocateHook(allocHook); std::cout << "[mpp] Allocate(" << t_AllocSize << ") -> "; mpp::Chunk::DumpChunk(std::cout, mpp::Chunk::GetHeaderPtr(ptr)) << std::endl; return ptr; }; // Set actual Allocate hook. mpp::SetAllocateHook(allocHook); // Lambda as an Deallocate hook. Important, all hooks should be static! static std::function<bool(void*)> deallocHook = [&](void* t_Addr) { mpp::SetDeallocateHook(nullptr); bool res = mpp::Deallocate(t_Addr); mpp::SetDeallocateHook(deallocHook); std::cout << "[mpp] Deallocate(" << t_Addr << ") -> " << res << std::endl; return res; }; // Set actual Deallocate hook. mpp::SetDeallocateHook(deallocHook);
Right now algorithm consists of 2 parts:
Divide the graph of all objects into connected components. Each component is a set of objects that are reachable from each other. By dividing the graph into components we can solve the layout problem for each component separately, and what is more important, we can do it in parallel. Also, this step can be called optimal layout in some sense, because it is guaranteed that all objects in the component will be located in the same arena (contiguous memory block).
The next step is to layout objects from components in a way that allows you to access them in a cache-friendly way. This is achieved by using a heuristic layout algorithm. Currently, it's a proof of concept that works only for objects that have a single pointer field (to be precise, it should work for any data structure, that can be represented as a singly linked list). The algorithm is based on the following assumptions
- If we have a singly linked list, then most likely the next element should be located right after the current one. If it is not (take a look at the image below), then it's a good idea to move it there. By doing so we can reduce cache misses and improve performance.
Benchmark results for the algorithm are presented here: Benchmark results.
We can see that the algorithm is able to improve memory access time by 4-8 times (depending on the size of the list). Of course, it should be noted that the benchmark shows the corner case when the list is fully randomized in memory. In real life, the list might be layouted in a way that is already cache-friendly, so the algorithm will not be able to improve the performance.
Closing words. The algorithm is definitely not perfect, but it's a good start. It can be improved in many ways, for example, by adding more heuristics. Also, it can be used as a base for a more complex layouting algorithm, that will be able to layout any data structure
- Standalone benchmarks: memplusplus-benchmarks.
- Continuous benchmarking results: continuous-benchmarking
-
Install doxygen, sphinx, breathe
-
Configure:
cmake \ -S . \ -B build \ -DCMAKE_BUILD_TYPE=Release \ -DMPP_BUILD_FUZZER=OFF \ -DMPP_BUILD_EXAMPLE=OFF \ -DMPP_BUILD_TESTS=OFF \ -DBUILD_SHARED_LIBS=OFF \ -DMPP_BUILD_DOCS=ON
-
Build docs:
cmake \ --build build \ -j 8
-
Now docs can be found in "./build/docs/doxygen/html" or "./build/docs/sphinx"
-
Configure tests:
cmake \ -S . \ -B build \ -DCMAKE_BUILD_TYPE=Release \ -DBUILD_SHARED_LIBS=OFF \ -DMPP_BUILD_TESTS=ON
-
Build tests:
cmake \ --build build \ --target unit_tests \ -j 8
-
Run tests:
cd build && ctest
Some tests are disabled because they require you to turn off ASLR (for example
GcGraphTest.GenerateGraphvizLayout
). This is considered unreliable, but if you still want to run them, you can do it like this:cd build && setarch `uname -m` -R ./build/tests/unit_tests --gtest_also_run_disabled_tests
-
install clang-tidy and clang-format
-
Configure:
cmake \ -S . \ -B build \ -DCMAKE_BUILD_TYPE=Release
-
Use clang-format:
cmake \ --build build \ --target clang-format
-
Use clang-tidy:
cmake \ --build build \ --target clang-tidy