A persistent-memory-friendly hashing index.
More details are described in our VLDB paper and SIGMOD Highlight below. If you use our work, please cite:
Baotong Lu, Xiangpeng Hao, Tianzheng Wang, Eric Lo:
Dash: Scalable Hashing on Persistent Memory.
PVLDB 13(8): 1147-1161 (2020)
Baotong Lu, Xiangpeng Hao, Tianzheng Wang, Eric Lo:
Scaling Dynamic Hash Tables on Real Persistent Memory.
SIGMOD Record 2021, Volume 50, Issue 1.
- Dash EH - Proposed Dash extendible hashing
- Dash LH - Proposed Dash linear hashing
- CCEH - PMDK-patched CCEH variant used in our benchmark
- Level Hashing - PMDK-patched level hashing variant used in our benchmark
- Mini benchmark framework
- Example program - how to integrate Dash into your application
Fully open-sourced under MIT license.
We tested our build with Linux kernel 5.5.3 and GCC 9.2. Your Linux kernel must be version 4.17 or later and your glibc version 2.29 or later, because our customized PMDK uses MAP_FIXED_NOREPLACE.
The external dependencies are our customized PMDK and epoch manager, which are also open-sourced.
To compile in a build directory:
git clone https://github.com/baotonglu/dash.git
cd dash
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release -DUSE_PMEM=ON ..
make -j
As stated in our paper, we run the tests in a single NUMA node with 24 physical CPU cores. We pin threads to physical cores compactly assuming thread ID == core ID (e.g., for a dual-socket system, we assume cores 0-23 are located in socket 0, and cores 24-47 in socket 1). To run benchmarks, use the test_pmem executable in the build directory. It supports the following arguments:
./build/test_pmem --helpshort
Usage:
./build/test_pmem [OPTION...]
-index the index to evaluate:dash-ex/dash-lh/cceh/level (default: "dash-ex")
-op the type of operation to execute:insert/pos/neg/delete/mixed (default: "full")
-n the number of warm-up workload (default: 0)
-p the number of operations(insert/search/delete) to execute (default: 20000000)
-t the number of concurrent threads (default: 1)
-r search ratio for mixed workload: 0.0~1.0 (default: 1.0)
-s insert ratio for mixed workload: 0.0~1.0 (default: 0.0)
-d delete ratio for mixed workload: 0.0~1.0 (default: 0.0)
-e whether to register epoch in application level: 0/1 (default: 0)
-k the type of stored keys: fixed/variable (default: "fixed")
-vl the length of the variable length key (default: 16)
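The -r, -s, and -d ratios can be read as the probabilities of each operation in a mixed workload. The following is a minimal sketch of that idea, not the benchmark's actual code: `Op` and `PickOp` are hypothetical names, and it assumes the three ratios sum to 1.0 (as the defaults 1.0/0.0/0.0 do).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical operation type for illustration only.
enum class Op { kSearch, kInsert, kDelete };

// Map a uniform random draw u in [0, 1) to an operation, given the
// search (-r), insert (-s), and delete (-d) ratios.
// Assumption: the ratios sum to 1.0.
Op PickOp(double u, double search_ratio, double insert_ratio,
          double delete_ratio) {
  assert(std::fabs(search_ratio + insert_ratio + delete_ratio - 1.0) < 1e-9);
  if (u < search_ratio) return Op::kSearch;
  if (u < search_ratio + insert_ratio) return Op::kInsert;
  return Op::kDelete;
}
```

Under this reading, running with -r 0.8 -s 0.1 -d 0.1 would issue roughly 80% searches, 10% inserts, and 10% deletes.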
Also check out the run.sh script for example benchmarks and easy testing of the hash tables.
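The compact pinning scheme described above (thread ID == core ID) could look roughly like the sketch below. It is an illustration under stated assumptions, not the benchmark's own code: `PinSelfToCore` and `RunPinnedWorkers` are hypothetical names, and it relies on the Linux-specific `sched_setaffinity` call.

```cpp
#include <sched.h>  // sched_setaffinity, cpu_set_t (Linux-specific)
#include <cassert>
#include <thread>
#include <vector>

// Pin the calling thread to a single core. Returns true on success.
bool PinSelfToCore(int core_id) {
  assert(core_id >= 0 && core_id < CPU_SETSIZE);
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(core_id, &set);
  // A pid of 0 means "the calling thread" for sched_setaffinity.
  return sched_setaffinity(0, sizeof(set), &set) == 0;
}

// Start num_threads workers; worker i pins itself to core i
// (the "thread ID == core ID" assumption).
// Returns how many workers were pinned successfully.
int RunPinnedWorkers(int num_threads) {
  std::vector<int> pinned(num_threads, 0);
  std::vector<std::thread> workers;
  for (int tid = 0; tid < num_threads; ++tid) {
    workers.emplace_back([tid, &pinned] {
      pinned[tid] = PinSelfToCore(tid) ? 1 : 0;
      // ... the benchmark work for this thread would run here ...
    });
  }
  for (auto& w : workers) w.join();
  int ok = 0;
  for (int p : pinned) ok += p;
  return ok;
}
```

On a machine whose core numbering differs (for example, cores interleaved across sockets), the identity mapping from thread ID to core ID would need to be replaced with a lookup that keeps all threads on one NUMA node.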
To learn how to integrate Dash into your application, see example.cpp under src. The resulting executable is example in your build directory. Also see CMakeLists.txt for how to link against the dependencies (customized PMDK and epoch manager) for a correct build.
We noticed a possible mmap bug in our testing environment: MAP_SHARED_VALIDATE is incompatible with MAP_FIXED_NOREPLACE (since Linux 4.17). To ensure safe memory mapping, we modified the original PMDK to use MAP_SHARED rather than MAP_SHARED_VALIDATE. The two flags provide the same functionality, except that MAP_SHARED_VALIDATE performs extra flag validation.
For a more detailed explanation and minimal reproducible code, please check out our blog post about this issue.
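As a rough illustration of the flag combination the workaround ends up with, the sketch below maps a file at a fixed address with MAP_SHARED | MAP_FIXED_NOREPLACE and checks that a second mapping of the same range is refused. It is not the customized PMDK's code: `MapSharedAt` and `DemoNoReplace` are hypothetical names, the scratch file path is arbitrary, and the fallback value for MAP_FIXED_NOREPLACE is the x86-64 Linux value, supplied only for headers that predate the flag.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdlib>

#ifndef MAP_FIXED_NOREPLACE
#define MAP_FIXED_NOREPLACE 0x100000  // x86-64 Linux value, for older headers
#endif

// Map len bytes of fd at exactly addr, failing rather than replacing an
// existing mapping. Returns nullptr on failure.
void* MapSharedAt(void* addr, size_t len, int fd) {
  assert(len > 0);
  void* p = mmap(addr, len, PROT_READ | PROT_WRITE,
                 MAP_SHARED | MAP_FIXED_NOREPLACE, fd, 0);
  if (p == MAP_FAILED) return nullptr;
  if (p != addr) {
    // Kernels older than 4.17 ignore the flag and treat addr as a hint.
    munmap(p, len);
    errno = EEXIST;
    return nullptr;
  }
  return p;
}

// Returns true if the first mapping lands at the requested address and a
// second mapping of the same range is refused.
bool DemoNoReplace() {
  const size_t len = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  char path[] = "/tmp/noreplace_demo_XXXXXX";
  int fd = mkstemp(path);
  if (fd < 0) return false;
  unlink(path);  // the file lives on until fd is closed
  if (ftruncate(fd, static_cast<off_t>(len)) != 0) {
    close(fd);
    return false;
  }
  // Find a currently free address by mapping and unmapping a probe page.
  void* probe = mmap(nullptr, len, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (probe == MAP_FAILED) {
    close(fd);
    return false;
  }
  munmap(probe, len);
  void* first = MapSharedAt(probe, len, fd);
  void* second = MapSharedAt(probe, len, fd);  // range is now occupied
  bool ok = (first == probe) && (second == nullptr);
  if (first != nullptr) munmap(first, len);
  close(fd);
  return ok;
}
```

The probe-then-unmap step is only a convenient way to pick a free address for a single-threaded demo; another thread could claim that range in between.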
For any questions, please contact us at [email protected] and [email protected].