
Support for remote servers #293

Open · wahajali opened this issue May 20, 2024 · 1 comment

Comments

@wahajali

As I understand it, big-ann-benchmarks is used for benchmarking algorithms. Looking at ann-benchmarks, it also has support for databases, such as Postgres (pgvector) and Redis.
I have a few questions:

  1. Can something similar be done with big-ann-benchmarks? For example, could we add a Docker container for pgvector and have the benchmark test against it?
  2. As a follow-up to [1]: would it be possible to test a server running remotely with big-ann-benchmarks? For example, a database server running on a different machine that I want to benchmark. Per my understanding, all tests are currently run on the same hardware as the benchmark itself.
  3. For each test/track there is a baseline algorithm defined. I'm trying to understand what is intended by the baseline: is this the index that candidates are supposed to build on top of?
    For example, for the NeurIPS 23 OOD track the baseline is DiskANN, so why does the Dockerfile for pinecone-ood also include the DiskANN Python library to build the index?
@maumueller (Collaborator)

Hi @wahajali! I'm involved in both ann-benchmarks and this project, and I based the initial code base largely on ann-benchmarks. It has diverged a bit over the years, but the core architecture is still shared.

  1. Yes, the ann-benchmarks wrappers should be easy to translate into wrappers for this project.
  2. You would have to set up a bit of the infrastructure yourself and provide a wrapper module that translates the calls, e.g., for building the index or carrying out a search (see the sketch after this list). An approach like pgvector would work here in the same way.
  3. The framework is built around competitions that we organized at NeurIPS 2021 and NeurIPS 2023. For each challenge, we provided baselines to serve as examples and to get concrete performance/quality measurements to compare other solutions against. Some participants in these challenges built upon these baselines; others provided their own solutions.
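
For points 1 and 2, a wrapper for a remote pgvector server could look roughly like the sketch below. This is a minimal, hypothetical sketch, not existing code in this repo: the class, the `fit`/`query` method names (modelled on the ann-benchmarks wrapper convention), and all connection details are assumptions.

```python
# Hypothetical example: class layout and fit/query method names follow the
# ann-benchmarks wrapper convention; host, credentials, and table name are
# placeholders. Nothing below exists in this repo.
import numpy as np
import psycopg2  # client only; the Postgres/pgvector server runs remotely


def _to_vec(v):
    # pgvector accepts vectors as string literals like '[1.0,2.0,3.0]'.
    return "[" + ",".join(map(str, v)) + "]"


class RemotePgvector:
    """Runs index build and search against a remote pgvector server."""

    def __init__(self, dim, host="db.example.com", port=5432):
        self._conn = psycopg2.connect(
            host=host, port=port, dbname="ann", user="ann", password="secret"
        )
        self._conn.autocommit = True
        self._dim = dim

    def fit(self, X):
        # "Building the index" means shipping the vectors over the wire and
        # letting the remote server build an HNSW index via SQL. For large
        # datasets you would use COPY instead of row-by-row INSERTs.
        cur = self._conn.cursor()
        cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
        cur.execute(f"CREATE TABLE items (id int, embedding vector({self._dim}))")
        for i, v in enumerate(X):
            cur.execute("INSERT INTO items VALUES (%s, %s)", (i, _to_vec(v)))
        # vector_l2_ops = Euclidean; use vector_cosine_ops for cosine distance.
        cur.execute("CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)")

    def query(self, q, k):
        # Each search is a network round trip, so latency to the remote server
        # becomes part of the measured query time.
        cur = self._conn.cursor()
        cur.execute(
            "SELECT id FROM items ORDER BY embedding <-> %s LIMIT %s",
            (_to_vec(q), k),
        )
        return np.array([row[0] for row in cur.fetchall()])
```

One caveat with such a setup: since every query travels over the network, latency and throughput numbers are not directly comparable to runs where the algorithm lives in the same container as the benchmark.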

Hope that helps!
