
TuskVector - API Platform 🐘

This API framework first transforms your data into 1536-dimensional vectors (as RAG pipelines do), then employs HNSW indexing via pgvector for efficient retrieval (again, as RAG pipelines do). In short - it adds vector search to your database before plugging the results into further LLM queries. Check it out at https://tuskvector.com

Tech Stack

TuskVector is built with:

  • Python for the backend (no surprises there)
  • pgvector for PostgreSQL vector functionality (elephants and vectors, get it?)
  • HNSW for fast approximate nearest neighbor search
  • OpenAI's text-embedding-ada-002 for text embeddings
  • GPT-4o for LLM queries
  • FastAPI for building APIs
  • HTMX as frontend to dodge JavaScript (because apparently, that's a thing now)

Setup

It's packaged and uploaded to PyPI! - check out https://pypi.org/project/tuskvector/ or install it with

pip install tuskvector

API Endpoints - also documented at https://tuskvector.com/docs

  1. Vector Embedding (POST /api/embed_text)

    • Utilizes OpenAI's text-embedding-ada-002 model
    • Generates 1536-dimensional embeddings
    • Automatically stores embeddings in pgvector-enabled PostgreSQL database
  2. Similarity Search (POST /api/similarity_search)

    • Implements cosine similarity metric
    • Utilizes HNSW (Hierarchical Navigable Small World) index for approximate nearest neighbor search
    • Configurable search parameters:
      • ef_search: Controls the trade-off between search speed and accuracy
      • Distance threshold: Filters results based on maximum allowed cosine distance
  3. Context-Aware LLM Queries (POST /api/query)

    • Integrates with OpenAI's GPT models
    • Enhances LLM responses with relevant context from the vector database
    • Implements a two-stage retrieval process:
      1. Vector similarity search to find relevant facts
      2. LLM query augmented with retrieved context
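The two-stage retrieval above can be sketched in plain Python. This is a minimal illustration, not TuskVector's actual internals: the cosine-distance math and threshold filter mirror what pgvector computes server-side, the embeddings are toy 3-D vectors instead of ada-002's 1536 dimensions, and the exhaustive scan stands in for the approximate HNSW lookup.

```python
import math

MAX_DISTANCE = 0.1  # cosine distance threshold, as in the config below

def cosine_distance(a, b):
    # pgvector's <=> operator: 1 - cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

def similarity_search(query_vec, stored):
    # Stage 1: keep only facts within the distance threshold, nearest
    # first (the real index answers this approximately via HNSW)
    hits = [(cosine_distance(query_vec, vec), fact) for fact, vec in stored.items()]
    return [fact for dist, fact in sorted(hits) if dist <= MAX_DISTANCE]

def build_prompt(question, facts):
    # Stage 2: augment the LLM query with the retrieved context
    context = "\n".join(facts)
    return f"Context:\n{context}\n\nQuestion: {question}"

# Toy 3-D "embeddings" (the real model emits 1536 dimensions)
stored = {
    "Elephants are the largest land animals.": [0.99, 0.1, 0.0],
    "Tusks are elongated incisor teeth.": [0.0, 0.1, 0.99],
}
query_vec = [1.0, 0.1, 0.0]
facts = similarity_search(query_vec, stored)
prompt = build_prompt("How big are elephants?", facts)
```

Only the first fact survives the 0.1 distance cutoff, so the LLM prompt is augmented with relevant context and nothing else.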

Configuration Options

  • HNSW_M: Maximum number of connections per layer in HNSW index (we went with 16)
  • HNSW_EF_CONSTRUCTION: Size of the dynamic candidate list for constructing the HNSW graph (we went with 64)
  • MAX_DISTANCE: Cosine distance threshold for similarity search (we went with 0.1)
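With pgvector, the first two settings end up in the index DDL, while ef_search is tuned per session. A hedged sketch of the SQL involved - the table and column names here are made up for illustration, not TuskVector's actual schema:

```python
HNSW_M = 16
HNSW_EF_CONSTRUCTION = 64

# Hypothetical table ("facts") and column ("embedding") names
create_index_sql = (
    "CREATE INDEX ON facts USING hnsw (embedding vector_cosine_ops) "
    f"WITH (m = {HNSW_M}, ef_construction = {HNSW_EF_CONSTRUCTION});"
)

# ef_search is set at query time, trading speed for recall
tune_sql = "SET hnsw.ef_search = 40;"
```

`vector_cosine_ops` matches the cosine similarity metric used by the similarity search endpoint; a larger `hnsw.ef_search` widens the candidate list for better recall at the cost of speed.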