GitHub - pkusys/Auncel: Vector search with bounded performance.

1. Introduction

This repository contains one version of the source code for our NSDI'23 paper "Fast, Approximate Vector Queries on Very Large Unstructured Datasets" [Paper].

2. Content

Auncel/
- The source code of the Auncel design and implementation (forked from Faiss 1.15.2)
LAET/
- The source code of the SIGMOD'20 paper "Improving Approximate Nearest Neighbor Search through Learned Adaptive Early Termination" (forked from LAET, with new datasets added)
faiss/
- The source code of the vector search engine Faiss (forked from Faiss 1.15.2, with its ELP (Autotune.cpp) changed from the average case to the bounded case)

3. Environment requirement

Hardware
- AWS c5.4xlarge & c5.metal
Software
- Intel MKL & clang & OpenMP
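As a quick sanity check before compiling, the sketch below tests whether some of the software listed above is visible in the current shell. It is only an illustration: the helper name have_tool is hypothetical, and checking the MKLROOT variable assumes that MKL was set up through Intel's environment scripts, which may not match your installation. OpenMP is not checked here because it depends on how the compiler was installed.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named command is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool clang; then
    echo "clang: found"
else
    echo "clang: not found on PATH"
fi

# Assumption: Intel's environment scripts export MKLROOT when MKL is set up.
if [ -n "${MKLROOT:-}" ]; then
    echo "MKL: MKLROOT=$MKLROOT"
else
    echo "MKL: MKLROOT is not set (MKL may still be installed elsewhere)"
fi
```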
Datasets
- Each 10M dataset is a random 10M slice of the corresponding 1B dataset (SIFT, DEEP, TEXT, GIST). You can download the preprocessed datasets (e.g., normalized for TEXT) from data-link-1 or data-link-2 (7w3r). We recommend using the provided datasets if you want to use our configuration.

4. How to run

Compile
- To compile Auncel, run: cd ./Auncel && ./configure --without-cuda && ./build.sh && cd ../
- To compile LAET, run: cd ./LAET && ./configure --without-cuda && ./build.sh && cd ../
- To compile Faiss, run: cd ./faiss && ./configure --without-cuda && ./build.sh && cd ../
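The three compile steps above use the same commands in different directories, so they can be wrapped in a small loop. The following is a minimal sketch, not part of the repository: the function name build_component is hypothetical, and the script assumes it is run from the repository root.

```shell
#!/bin/sh
# Hypothetical helper: configure and build one component directory.
# $1 = path to the component (e.g. ./Auncel)
build_component() {
    if [ ! -d "$1" ]; then
        echo "skip: $1 not found" >&2
        return 1
    fi
    # The subshell keeps the caller's working directory unchanged,
    # which replaces the trailing "cd ../" of the manual commands.
    (cd "$1" && ./configure --without-cuda && ./build.sh)
}

failed=0
for component in Auncel LAET faiss; do
    if build_component "./$component"; then
        echo "built: $component"
    else
        echo "failed: $component" >&2
        failed=$((failed + 1))
    fi
done
echo "components that failed to build: $failed"
```

Running each build in a subshell means a failure in one component does not leave the shell in an unexpected directory, and the final count shows at a glance whether all three builds succeeded.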
Run
- Overall : Before running the Python programs that generate the figures, run the corresponding programs to produce the result log files. Run cd ./Auncel/eval/ && ./run.sh && cd - to get the log files of Auncel. Run cd ./LAET/benchs/learned_termination/ && ./run.sh && cd - to get the log files of LAET. Run cd ./faiss/eval/ && ./run.sh && cd - to get the log files of Faiss. Then run cd ./figures/overall/ && ./overall.sh && cd - to get the three figures.
- Effectiveness : Before running the Python programs that generate the figures, run the corresponding program to produce the result log files. Run cd ./Auncel/eval/ && ./effect.sh && cd - to get the log files of Auncel. Then run cd ./figures/effect/ && ./effect.sh && cd - to get the two figures.
- Validation : The log files are generated automatically when you run cd ./Auncel/eval/ && ./run.sh && cd -. (Please set bs to 1 in struct Trace in <repo>/Auncel/IVF_pro.h to capture every point in the $\varphi - U$ map.) To draw the figures, run cd ./figures/validation && ./validation.sh && cd -.
- Overhead : Run cd ./Auncel/eval/ && ./overhead.sh && cd - and the corresponding experimental data will be printed to the terminal.
- Dist : Please refer to <repo>/Auncel/dist/README.md for the details of the distributed experiment. The figure script is <repo>/figures/dist/figure16.py
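The Overall experiment above is a fixed sequence: produce the three sets of log files, then draw the figures. The sketch below chains those steps and stops at the first failure. It is an illustration under stated assumptions rather than a script shipped with the repository: run_step and run_overall are hypothetical names, the script assumes it is run from the repository root after compiling, and it assumes the Faiss run.sh lives in ./faiss/eval/.

```shell
#!/bin/sh
# Hypothetical helper: run one script from inside its own directory.
# $1 = directory, $2 = script name
run_step() {
    if [ ! -x "$1/$2" ]; then
        echo "missing or not executable: $1/$2" >&2
        return 1
    fi
    # The subshell plays the role of the trailing "cd -" in the manual commands.
    (cd "$1" && "./$2")
}

# Generate the log files for all three systems, then the figures.
# The && chain stops at the first step that fails.
run_overall() {
    run_step ./Auncel/eval run.sh &&
    run_step ./LAET/benchs/learned_termination run.sh &&
    run_step ./faiss/eval run.sh &&
    run_step ./figures/overall overall.sh
}

if run_overall; then
    echo "overall: log files and figures generated"
else
    echo "overall: a step failed, figures were not generated" >&2
fi
```

The same run_step helper could be reused for the Effectiveness, Validation, and Overhead steps by passing the directory and script name given in each bullet.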

5. Contact

For any questions, please contact zzlcs at pku dot edu dot cn.
