VLM-RAG app

The goal of this project is to provide a quick way to deploy RAG pipelines over textual and visual data.

Features

  • Index files for chatting
  • Vector database for persistence
  • Answer user questions with an LLM
  • REST API
  • Reference original documents in the answers
  • Guardrails to prevent inappropriate and irrelevant answers
  • Webpage
  • Docker Compose to run all services

Usage (dev version)

Installation

  1. Install poetry and point it to the system env (in case you are using conda or another environment manager):
poetry env use system
  2. Install torch and torchvision with conda (or another package manager):
conda install pytorch torchvision -c pytorch
  3. Install the project dependencies:
poetry install
  4. Install ollama to run LLMs locally by downloading the installer from their website. Pull a model and test it (a quick connectivity check is sketched right after this list):
ollama run llama3:latest
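
To verify that the local ollama server is reachable before moving on, a quick check along these lines works. This is just an optional sketch; it assumes ollama's default port 11434 and that the requests package is available:

# Optional sanity check that the local ollama server is up (11434 is ollama's default port).
import requests

models = requests.get("http://localhost:11434/api/tags", timeout=5).json()
# Lists locally pulled models, e.g. "llama3:latest"
print([m["name"] for m in models.get("models", [])])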

System overview

  • Haystack is used as the main backend for building the indexing and RAG pipelines.
  • Qdrant is used as the vector database for storing chunk embeddings.
  • Fire is the main entrypoint for running commands from the CLI.
  • User inputs are validated with pydantic.

The main pipelines are defined in the __init__ file as follows:

from .pipelines import RAG as rag
from .pipelines import IndexingPipeline as index

They can be invoked from the CLI as follows:

python main.py <pipeline_name> --pipeline-args <command> --command-args

To list all available arguments, append fire's --help flag to any command.
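
For reference, a Fire entrypoint is typically a thin wrapper around the pipeline classes. The snippet below is only a sketch of what main.py could look like, not the project's actual file; the import path vlm_rag_chat is an assumption:

# main.py -- minimal sketch of a Fire entrypoint (an assumption, not the project's actual file).
import fire

# `rag` and `index` are the pipeline classes re-exported in the package __init__ shown above;
# the import path is an assumption.
from vlm_rag_chat import rag, index

if __name__ == "__main__":
    # Fire maps each entry to a CLI command, so `python main.py rag --help` lists the
    # constructor arguments and `python main.py rag ... run --help` lists the run() arguments.
    fire.Fire({"rag": rag, "index": index})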

Running commands

Follow the steps below to index documents and chat with them.

Vector database

First, run the qdrant vector store with Docker; it will persist the indexed document chunks and their embeddings:

docker compose up -d qdrant

Qdrant's API is now available on port 6333.
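
To confirm the container is up, you can ping it with the official client. This is an optional sketch that assumes the qdrant-client package is installed:

# Optional check that Qdrant is reachable on its default port.
from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", port=6333)
print(client.get_collections())  # an empty list of collections on a fresh instance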

Indexing documents

Once qdrant is running, we can index documents from the local filesystem. The currently supported formats are:

  • plain text
  • markdown
  • pdf

In order to index a folder with documents, run the following command:

❯ python main.py index \
        --store_params '{location:localhost,index:recipe_files}' \
        run --path=tests/data/recipe_files

Outputs:
5 documents have been written to 'localhost'

As a result, the documents are now indexed in the qdrant vector DB. Later on, we will be able to interact with them using an LLM.
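
To double-check what landed in the store, you can query qdrant directly. This is a minimal sketch, assuming the qdrant-client package and the collection name used in --store_params above:

# Count the points stored in the collection created by the indexing command above.
from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", port=6333)
print(client.count(collection_name="recipe_files"))  # e.g. count=5 after indexing the sample folder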

To add new documents without overwriting the existing ones, skip recreating the collection by passing an extra parameter:

❯ python main.py index \
        --recreate-index False \
        --store_params '{location:localhost,index:recipe_files}' \
        run --path=tests/data/extra_files

Ask questions (RAG)

Once documents have been indexed into a qdrant collection, it is time to start chatting with them. This can be accomplished by invoking the RAG pipeline through the CLI:

NOTE: this step assumes ollama is running on localhost.

❯ python main.py rag \
      --store_params '{location:localhost,index:recipe_files}' \
      run --query "how do you make a vegan lasagna?"

The command should produce output similar to the following:

Based on the given context, to make a vegan lasagna, follow these steps:

1. Slice eggplants into 1/4 inch thick slices and rub both sides with salt.
2. Let the eggplant slices sit in a colander for half an hour to draw out excess water.
3. Roast the eggplant slices at 400°F (200°C) for about 20 minutes, or until they're soft and lightly browned.
4. Meanwhile, make the pesto by blending together basil leaves, almond meal, nutritional yeast, garlic powder, lemon juice, and salt to taste.
5. Make the macadamia nut cheese by blending cooked spinach, steamed tofu, drained water from the tofu, macadamia nuts until smooth, and adjusting seasonings with garlic, lemon juice, and salt to taste.
6. Assemble the lasagna by layering roasted eggplant, pesto, and vegan macadamia nut cheese in a casserole dish. Top with additional cheese if desired (optional).
7. Bake at 350°F (180°C) for about 25 minutes, or until the cheese is melted and bubbly.
8. Serve and enjoy!

Continue by asking different questions.
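
The same pipeline can also be driven from Python instead of the CLI. The snippet below is only a sketch based on the imports shown in the System overview; the package name and exact argument names are assumptions:

# Hypothetical programmatic use of the RAG pipeline -- a sketch, not the project's documented API.
# The import path and the argument names mirror the CLI flags above, but both are assumptions.
from vlm_rag_chat import rag

pipeline = rag(store_params={"location": "localhost", "index": "recipe_files"})
print(pipeline.run(query="how do you make a vegan lasagna?"))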

Serialize pipelines

To re-use pipelines with the REST API, save them as YAML files using the export command:

❯ python main.py index --store_params '{location:localhost,index:recipe_files}' export --write-path tests/data/pipelines/index_recipe_files.yaml

❯ python main.py rag --store_params '{location:localhost,index:recipe_files}' export --write-path tests/data/pipelines/rag_recipe_files.yaml

These commands create two YAML files (one per pipeline) at the specified paths. Any storage/LLM/chunking parameters can be passed on the CLI or edited in the YAML files afterwards.
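
The exported files are plain Haystack pipeline definitions, so they can also be loaded back in Python. This is a minimal sketch, assuming the Haystack 2.x serialization API:

# Load an exported pipeline definition back into a Haystack Pipeline object
# (Pipeline.loads() deserializes from a YAML string in Haystack 2.x).
from haystack import Pipeline

with open("tests/data/pipelines/rag_recipe_files.yaml") as f:
    pipeline = Pipeline.loads(f.read())
print(pipeline)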

REST API

The REST API is implemented using hayhooks, which is tightly integrated with haystack.

NOTE: use my fork of hayhooks until this PR is merged.

Run a FastAPI server and load the exported pipelines:

❯ hayhooks run --pipelines-dir tests/data/pipelines

Outputs:
INFO:     Pipelines dir set to: tests/data/pipelines
INFO:     Deployed pipeline: index_recipe_files
INFO:     Deployed pipeline: rag_recipe_files
INFO:     Started server process [44078]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://localhost:1416 (Press CTRL+C to quit)

Navigate to the Swagger UI on localhost (FastAPI serves it at /docs by default) and try out the pipeline endpoints.
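
Deployed pipelines can also be called directly over HTTP. The endpoint path and request body below are assumptions; check the generated Swagger docs for the exact schema hayhooks produces for your pipelines:

# Call the deployed RAG pipeline over HTTP -- a sketch; verify the route and body against Swagger.
import requests

resp = requests.post(
    "http://localhost:1416/rag_recipe_files",            # assumed: one route per deployed pipeline
    json={"query": "how do you make a vegan lasagna?"},  # assumed body; the real schema mirrors the pipeline inputs
    timeout=120,
)
print(resp.json())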

Usage (docker)

To run with Docker, start the services from the docker-compose file (this expects ollama to be running on localhost):

docker compose up

Navigate to the Swagger UI and try invoking the pipeline endpoints.
