Explored dimensions:
- text type
- instance size (just adjust the test_case.config file for this)
- compile options
- index implementations
Pattern selection:
We use the benchmark code including the random pattern selection and test cases from the Pizza&Chili website.
- bin: Contains the executables of the project.
  - build_idx_* generates indexes.
  - query_idx_* executes the count experiments.
  - info_* outputs the space breakdown of an index.
  - genpattern: pattern generation from the Pizza&Chili website.
- indexes: Contains the generated indexes.
- results: Contains the results of the experiments.
- src: Contains the source code of the benchmark.
- visualize: Contains an R script which generates a report.
Files included in this archive from the Pizza&Chili website:
- src/genpatterns.c
- src/run_quries_sdsl.cpp is a customized version of the Pizza&Chili file run_queries.c.
- For the visualization you need the following software:
- The construction of the 200MB indexes requires about 1GB of RAM.
- Running make timing compiles the programs, downloads the 200MB Pizza&Chili test cases, builds the indexes, runs the performance tests, and generates a report located at visualize/count.pdf. The raw numbers of the timings can be found in results/all.txt. Indexes and temporary files are stored in the directories indexes and tmp. For the 5 x 200 MB of Pizza&Chili data the project will produce about 7.2 GB of additional data. On my machine (MacBookPro Retina 2.6GHz Intel Core i7, 16GB 1600 MHz DDR3, SSD) the benchmark, invoked by make timing, took about 11 minutes (excluding the time to download the test instances). Have a look at the generated report.
- All created indexes and test results can be deleted by calling make cleanall.
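If you change the instance size or the number of test cases, a rough disk-space estimate can be derived from the figures quoted above (5 x 200 MB of input, about 7.2 GB of additional data). The sketch below scales that reference point linearly with the total input size. Linear scaling is an assumption, not something measured here, the function name is hypothetical, and the real footprint also depends on which indexes and compile options are configured.

```python
# Reference point taken from the text: 5 test cases of 200 MB each
# produce about 7.2 GB of additional data.
REFERENCE_INPUT_MB = 5 * 200
REFERENCE_OUTPUT_GB = 7.2


def estimate_extra_gb(num_test_cases: int, size_mb: float) -> float:
    """Estimate the additional disk usage in GB.

    ASSUMPTION: output grows linearly with total input size and the set
    of indexes and compile options stays the same as in the reference run.
    """
    total_input_mb = num_test_cases * size_mb
    return REFERENCE_OUTPUT_GB * total_input_mb / REFERENCE_INPUT_MB


if __name__ == "__main__":
    # Halving the instance size to 100 MB per test case.
    print(f"{estimate_extra_gb(5, 100):.1f} GB")
```

Treat the result as an order-of-magnitude guide when deciding whether a run fits on the available disk.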
The project contains several configuration files:
- index.config: Specify data structures' ID, sdsl-class, and LaTeX-name for the report.
- test_case.config: Specify test cases' ID, path, LaTeX-name for the report, and download URL.
- compile_options.config: Specify compile options' ID and string.
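To make the field lists above concrete, here is a minimal sketch of reading an index.config-style file. The on-disk layout is an assumption: one entry per line, fields separated by semicolons in the order listed above, and lines starting with # treated as comments. Check the shipped config files for the actual syntax. The function name, the class name, and the sample entry are hypothetical.

```python
from typing import List, NamedTuple


class IndexEntry(NamedTuple):
    idx_id: str      # ID of the data structure
    sdsl_class: str  # sdsl class (a C++ type expression)
    latex_name: str  # LaTeX name shown in the report


def parse_index_config(text: str, sep: str = ";") -> List[IndexEntry]:
    """Parse config text under the ASSUMED layout ID<sep>class<sep>LaTeX-name."""
    entries = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and (assumed) comment lines
        fields = [field.strip() for field in line.split(sep)]
        if len(fields) != 3:
            raise ValueError(f"line {line_no}: expected 3 fields, got {len(fields)}")
        entries.append(IndexEntry(*fields))
    return entries


# Hypothetical sample entry, for illustration only.
SAMPLE = """\
# ID;SDSL-CLASS;LATEX-NAME
FM_HUFF;csa_wt<wt_huff<>>;FM-HUFF
"""

if __name__ == "__main__":
    for entry in parse_index_config(SAMPLE):
        print(entry.idx_id, entry.sdsl_class, entry.latex_name)
```

The same three-step pattern (skip comments, split, validate the field count) would carry over to the other two files with their own field lists.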
Note that the benchmark will execute every combination of your choices.
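Because every combination is executed, the number of runs is the product of the number of entries in the three files, so it grows quickly. The sketch below illustrates this with placeholder IDs. All IDs and the function name are hypothetical, and the benchmark itself drives the runs through its Makefile rather than through code like this.

```python
from itertools import product

# Hypothetical IDs standing in for the entries of the three config files.
index_ids = ["IDX_A", "IDX_B"]
test_case_ids = ["TEXT_1", "TEXT_2", "TEXT_3"]
compile_ids = ["OPT_X", "OPT_Y"]


def all_runs(indexes, test_cases, compile_options):
    """Return every (index, test case, compile option) combination."""
    return list(product(indexes, test_cases, compile_options))


runs = all_runs(index_ids, test_case_ids, compile_ids)

if __name__ == "__main__":
    # 2 indexes * 3 test cases * 2 compile options = 12 runs
    print(len(runs))
    for idx, case, opt in runs[:3]:
        print(idx, case, opt)
```

Adding a single entry to any one file multiplies the work by the sizes of the other two, which is worth keeping in mind before extending the configuration.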
Finally, the visualization can also be configured:
- visualize/index-filter.config: Specify which indexes should be listed in the report.