araobp/compact-rag

Compact RAG

Background

I am developing a compact RAG (Retrieval-Augmented Generation) system that runs on a Raspberry Pi. It uses SQLite as its database, with sqlite-vec providing the vector DB.

Goal of this project

  • Develop a compact RAG that runs on a Raspberry Pi and supports Hybrid RAG: a SQL DB combined with a vector DB.
  • The RAG also serves as an API server for my other project, virtual-showroom.

Requirements

  • OpenAI API key
  • LLM model: gpt-4o-mini
  • Embeddings model: text-embedding-3-small
  • Raspberry Pi
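
As a rough illustration of how the embeddings model listed above might be called, here is a minimal sketch. It assumes the official openai Python package (v1 or later); the helper name embed_texts and the constants are hypothetical and are not taken from app.py. The client is passed in as an argument so the function can be exercised without network access.

```python
from typing import List, Sequence

# Model name taken from the requirements list above.
EMBEDDING_MODEL = "text-embedding-3-small"
# Default output size of text-embedding-3-small.
EMBEDDING_DIM = 1536


def embed_texts(client, texts: Sequence[str]) -> List[List[float]]:
    """Return one embedding vector per input text.

    `client` is expected to be an openai.OpenAI instance (hypothetical
    wiring; the project's own code may differ).
    """
    response = client.embeddings.create(model=EMBEDDING_MODEL, input=list(texts))
    # The API returns one item per input, in the same order.
    return [item.embedding for item in response.data]
```

With the real service, the client would be created with `from openai import OpenAI; client = OpenAI()`, which reads the key from the OPENAI_API_KEY environment variable.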

Architecture

                                   Brain
                           [OpenAI API service]
Unity app                            |
[VirtualShowroom]-----+              |
                      |              |
Web apps              |        Compact RAG (app.py)
[Web Browser]---------+------- [Raspberry Pi]---+---USB---[Camera with mic]
                      |              |          |
GenAI                 |          SQLite DB      +---USB---[Speaker]
[Node-RED]------------+

Compiling sqlite-vec on Raspberry Pi

$ git clone https://github.com/asg017/sqlite-vec
$ cd sqlite-vec
$ sudo apt-get install libsqlite3-dev
$ make loadable 

The build produces "vec0.so" in the ./dist directory.
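
Once vec0.so exists, it can be loaded from Python's built-in sqlite3 module. The following is a minimal sketch, assuming the Python build on the Raspberry Pi has extension loading enabled; the function names and the directory argument are hypothetical and are not taken from app.py.

```python
import sqlite3
from pathlib import PurePosixPath


def vec0_extension_path(sqlite_vec_dir: str) -> str:
    """Return the path to hand to load_extension().

    The ".so" suffix is left off on purpose: SQLite retries with the
    platform suffix appended when the bare name is not found.
    """
    return str(PurePosixPath(sqlite_vec_dir) / "dist" / "vec0")


def connect_with_vec0(db_path: str, sqlite_vec_dir: str) -> sqlite3.Connection:
    """Open a database and load the compiled sqlite-vec extension into it."""
    conn = sqlite3.connect(db_path)
    conn.enable_load_extension(True)
    conn.load_extension(vec0_extension_path(sqlite_vec_dir))
    # Switch extension loading off again once vec0 is available.
    conn.enable_load_extension(False)
    return conn
```

If the load succeeds, `conn.execute("SELECT vec_version()").fetchone()` should return the sqlite-vec version string.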

Reference documents, chunking and embeddings for RAG

Document sources

Chunking and Embeddings

Implementations

Partition keys and auxiliary columns supported by sqlite-vec

In this project I use partition keys and auxiliary columns to filter records in the database. In the table below, context is the partition key and chunk (prefixed with +) is an auxiliary column:

CREATE VIRTUAL TABLE virtual_showroom
USING vec0(
context text partition key,
embedding float[1536],
+chunk text
)

Reference: https://alexgarcia.xyz/sqlite-vec/features/vec0.html
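
To show how the table above might be used, here is a minimal sketch of inserting a chunk and running a KNN search restricted to one partition. It assumes a connection that already has vec0 loaded and the virtual_showroom table created; the helper names are hypothetical and are not taken from app.py.

```python
import sqlite3
import struct
from typing import List, Sequence, Tuple


def serialize_f32(vector: Sequence[float]) -> bytes:
    """Pack floats into the raw float32 BLOB form that sqlite-vec accepts."""
    return struct.pack(f"{len(vector)}f", *vector)


# Auxiliary columns are written without the leading "+".
INSERT_SQL = "INSERT INTO virtual_showroom(context, embedding, chunk) VALUES (?, ?, ?)"

# The partition key (context) narrows the search before the KNN step;
# the auxiliary column (chunk) is returned alongside the distance.
KNN_SQL = """
SELECT chunk, distance
FROM virtual_showroom
WHERE embedding MATCH ?
  AND k = ?
  AND context = ?
ORDER BY distance
"""


def add_chunk(conn: sqlite3.Connection, context: str,
              embedding: Sequence[float], chunk: str) -> None:
    """Store one text chunk and its embedding under the given context."""
    conn.execute(INSERT_SQL, (context, serialize_f32(embedding), chunk))
    conn.commit()


def search(conn: sqlite3.Connection, context: str,
           query_embedding: Sequence[float], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k nearest chunks within one context, closest first."""
    params = (serialize_f32(query_embedding), k, context)
    return conn.execute(KNN_SQL, params).fetchall()
```

Because context is a partition key, a query such as `search(conn, "some context", query_vec)` only scans vectors stored under that context value.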

Unit tests

Running the API server

$ cd app
$ python app.py

The API server also serves some simple web apps. Open "http://<IP address of the API server>:5050" in a web browser to use them.

virtual-showroom uses this API server to access the OpenAI API service.

Starting the API server automatically

Refer to this article to start the server automatically.

A sample service file looks like this:

[Unit]
Description=Python Generative AI API server
After=network.target

[Service]
ExecStart=/usr/bin/python3 -m app --directory <Path to "app" folder>
WorkingDirectory=<Path to "app" folder>
Restart=always
RestartSec=10
User=<Your user name>
Group=users
Environment=PYTHONPATH=<Path to this repo on Raspberry Pi> OPENAI_API_KEY=<OpenAI API key>

[Install]
WantedBy=multi-user.target

After creating the service file (named gen_ai.service in the commands below), reload systemd and start the service:

$ sudo systemctl daemon-reload
$ sudo systemctl start gen_ai.service

Confirm that the daemon process is running:

$ sudo systemctl status gen_ai.service

If something went wrong, check the syslog:

$ tail /var/log/syslog

Extra: Some experiments with gpt-4o-mini
