A curated list of awesome vector search frameworks/engines, libraries, cloud services, and research papers on vector similarity search.
- Apache Cassandra 5.0 - Vector search (CEP-30), Strict Serialisable ACID (CEP-15), horizontally scaling database
- Qdrant - Vector Similarity Search Engine with extended filtering support
- Vald - A Highly Scalable Distributed Vector Search Engine
- Milvus - A cloud-native vector database with high performance and high scalability
- Weaviate - A cloud-native, real-time vector search engine
- OpenDistro Elasticsearch KNN - A machine learning plugin which supports an approximate k-NN search algorithm for Open Distro for Elasticsearch
- Elastiknn - Elasticsearch plugin for nearest neighbor search
- Epsilla - A High Performance Vector Database Management System, Hippocampus For AI
- Vearch - A scalable distributed system for efficient similarity search of deep learning vectors
- pgANN - Fast Approximate Nearest Neighbor (ANN) searches with a PostgreSQL database
- Jina - Jina allows you to build deep learning-powered search-as-a-service.
- Infinity - The AI-native database built for LLM applications, providing incredibly fast vector and full-text search
- Aquila DB - Distribution focused k-NN search algorithm
- Redis HNSW - A Redis module for similarity search based on HNSW
- Solr - Apache Solr has a Dense Vector Search feature as of Solr 9.0
- Marqo - A semantic search engine which supports tensor search (sequence of vectors)
- txtai - Build semantic search applications and workflows
- Semantra - A multipurpose tool for semantically searching documents.
- SuperDuperDB - Bring AI to your favorite database
- TensorDB - High Performance Vector Database Supporting Heterogeneous Computing
- JVector - A pure Java, zero dependency, embedded vector search engine, used by DataStax Astra DB and Apache Cassandra
- VQLite - Simple and Lightweight Vector Search Engine
- Vexvault - 100% browser based, open source, scalable, simple, zero-cost vector search
- Vespa.ai - Text search engine and ... fast approximate vector search (ANN)
- Vespa's large-scale ANN search using HNSW-IF indexes is described here
- LangStream - LangStream is an open-source project that combines the best of event-based architectures with the latest Gen AI technologies.
- CassIO - CassIO is the ultimate solution for seamlessly integrating Apache Cassandra® with generative artificial intelligence and other machine learning workloads
- JVector - A pure Java, zero dependency, embedded vector search engine used by some of the advanced distributed databases such as DataStax Astra DB & Apache Cassandra™
- Faiss - A library for efficient similarity search and clustering of dense vectors
- Distributed Faiss - Work with FAISS indexes which don't fit into a single server memory
- Autofaiss - Automatically create Faiss knn indices
- ScaNN - A library for efficient vector similarity search at scale
- NMSLIB - Non-Metric Space Library, an efficient similarity search library for generic non-metric spaces
- Annoy - C++ library with Python bindings to search for points in space that are close to a given query point
- FLANN - Library written in C++ with bindings for C, MATLAB, Python, and Ruby
- LLM App - Open-source Python library for real-time KNN (K-Nearest Neighbors) data indexing
- MRPT - Fast nearest neighbor search with random projection
- RPForest - Python library for approximate nearest neighbours search
- pgvector - Open-source vector similarity search extension for Postgres
- PASE - Ultra-High-Dimensional approximate nearest neighbor search extension for Postgres
- Pyserini - Toolkit for reproducible information retrieval research with sparse and dense representations
- NGT - Provides commands and a library for performing high-speed approximate nearest neighbor searches
- NearPy - Approximate search using different locality-sensitive hashing methods
- TOROS N2 - Lightweight approximate nearest neighbor library
- PUFFINN - Parameterless and Universal Fast FInding of Nearest Neighbors
- SPTAG - A distributed approximate nearest neighborhood search (ANN) library
- PyNNDescent - A Python implementation of nearest neighbor descent for approximate k-nearest neighbors
- TarsosLSH - A Java library implementing a practical nearest neighbour search algorithm for multidimensional vectors
- TorchPQ - Efficient implementations of Product Quantization and its variants using PyTorch and CUDA
- Granne - Graph-based retrieval of approximate nearest neighbors written in Rust
- Embeddinghub - A database built for machine learning embeddings
- Hora - Efficient approximate nearest neighbor search algorithm collections library written in Rust
- Voy - A WASM vector similarity search engine written in Rust
- Chroma - The open-source embedding database for building LLM apps in Python or JavaScript with memory
- USearch - Smaller & Faster Vector Search Engine for C++, Python, JavaScript, Rust, Java, GoLang, Wolfram
- Golang vector stores collection - Chroma, PGVector interfaces
- Scalable Vector Search (SVS) - A performance library for vector similarity search
- Epsilla Cloud - A fully managed serverless vector database, described by the vendor as 10X faster, cheaper and better
- DataStax Astra Vector - Multi-cloud, serverless vector DBaaS
- Relevance AI - Vector Platform From Experimentation To Deployment
- Pinecone - Managed vector search with filtering, live index updates, horizontal scaling, and a lot more
- MyScale - A managed vector database based on ClickHouse
- Redis Cloud - Managed vector database in Redis
- Zilliz Cloud - Cloud-native service for Milvus
A list of methods for implementing approximate vector search algorithms more efficiently.
- SPANN: Highly-efficient Billion-scale Approximate Nearest Neighborhood Search - NeurIPS 2021
- Revisiting the Inverted Indices for Billion-Scale Approximate Nearest Neighbors - ECCV 2018
- Accelerating Large-Scale Inference with Anisotropic Vector Quantization
- Billion-scale similarity search with GPUs
- Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs
- Optimization of Indexing Based on k-Nearest Neighbor Graph for Proximity Search in High-dimensional Data
- On Approximately Searching for Similar Word Embeddings - ACL 2016